FIELD REPORT · AI

Integrating Custom AI Agents with Slack, Teams, and Email

Concrete architectures for shipping AI agents into Slack, Microsoft Teams, and email. Frameworks, code, rate limiting, deduping, and where the production failures show up first.

PUBLISHED
April 27, 2026
READ TIME
10 MIN
AUTHOR
ONE FREQUENCY

Custom AI agents in 2026 are not a research project. They are a delivery problem. The model layer is solved well enough — you call Claude, GPT-5, or Gemini and you get a response. The hard parts are integration: getting the agent into Slack without violating rate limits, surfacing it in Teams with the right enterprise controls, and parsing inbound email reliably enough that users trust the response. This guide covers the architectures that work, with real TypeScript code, and where to expect the first production failure.

Architecture overview

Every agent integration follows the same shape:

  1. Inbound event from the platform (Slack message, Teams activity, inbound email)
  2. Authentication and verification of the event signature
  3. Deduplication to prevent double-processing on retries
  4. Routing to the right agent logic
  5. Model call with conversation context
  6. Outbound delivery back to the platform
  7. Observability — logs, metrics, traces

The platforms differ in event shape and delivery mechanism, but the architecture stays constant. Build a shared "agent core" library, then layer thin platform adapters on top.
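
A minimal sketch of what that core can look like, assuming nothing beyond the Anthropic SDK. The names AgentCore and InboundMessage are illustrative, not a published interface; each platform adapter maps its event into this shape:

import { Anthropic } from '@anthropic-ai/sdk'

// Illustrative normalized event; adapters fill these fields per platform
interface InboundMessage {
  conversationId: string // thread_ts, Teams conversation id, or email thread key
  eventId: string        // platform event id, used for deduplication
  text: string
}

class AgentCore {
  private anthropic = new Anthropic()

  async respond(
    inbound: InboundMessage,
    history: { role: 'user' | 'assistant'; content: string }[],
  ): Promise<string> {
    const response = await this.anthropic.messages.create({
      model: 'claude-sonnet-4-5',
      max_tokens: 1024,
      system: 'You are a helpful enterprise assistant.',
      messages: [...history, { role: 'user', content: inbound.text }],
    })
    return response.content
      .filter(b => b.type === 'text')
      .map(b => (b as { text: string }).text)
      .join('\n')
  }
}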

Slack integration with Bolt for JavaScript

Slack remains the easiest integration target. The Bolt for JavaScript framework (@slack/bolt, currently 4.x) handles event subscription, OAuth, and message delivery. The canonical setup:

import { App, ExpressReceiver } from '@slack/bolt'
import { Anthropic } from '@anthropic-ai/sdk'

const receiver = new ExpressReceiver({
  signingSecret: process.env.SLACK_SIGNING_SECRET!,
  // Default (false): Bolt acks the HTTP request immediately and runs
  // listeners async, which is what satisfies Slack's 3-second rule on a
  // long-lived server. Set true only on FaaS, and enqueue heavy work there.
  processBeforeResponse: false,
})

const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  receiver,
})

const anthropic = new Anthropic()

app.event('app_mention', async ({ event, body, client, logger }) => {
  // Bolt has already acked the HTTP request (Slack expects a 200 within
  // 3 seconds), so the heavy work is safe to do here
  const threadTs = event.thread_ts ?? event.ts
  const channel = event.channel

  // Bot loop prevention: never answer another bot's mention
  if ((event as any).bot_id) return

  // Deduplicate via the envelope's event_id; Slack retries aggressively
  // on timeout, and every retry carries the same event_id
  const eventId = body.event_id ?? event.ts
  if (await alreadyProcessed(eventId)) return
  await markProcessed(eventId)

  // Pull thread history for conversation context
  const history = await client.conversations.replies({
    channel,
    ts: threadTs,
    limit: 20,
  })

  const messages = (history.messages ?? [])
    .filter(m => (m.text ?? '').trim().length > 0) // the API rejects empty content
    .map(m => ({
      role: m.bot_id ? 'assistant' as const : 'user' as const,
      content: m.text ?? '',
    }))

  try {
    const response = await anthropic.messages.create({
      model: 'claude-sonnet-4-5',
      max_tokens: 1024,
      system: 'You are a helpful enterprise assistant. Be direct and concise.',
      messages,
    })

    const text = response.content
      .filter(b => b.type === 'text')
      .map(b => (b as any).text)
      .join('\n')

    await client.chat.postMessage({
      channel,
      thread_ts: threadTs,
      text,
    })
  } catch (err) {
    logger.error('agent failure', { err, eventId })
    await client.chat.postMessage({
      channel,
      thread_ts: threadTs,
      text: 'I hit an error. Try again in a moment.',
    })
  }
})

await app.start(Number(process.env.PORT ?? 3000))

Things that bite you in production:

  • Slack's 3-second rule. Slack expects a 200 OK within 3 seconds of delivery. If you hold the response open while an 8-second model call runs, Slack retries up to 3 times and, without deduplication, the user gets three answers. The fix is fast acknowledgement (Bolt's default processBeforeResponse: false on a long-lived server; on serverless, ack first and enqueue the heavy work) combined with explicit deduplication, since every retry carries the same event_id.
  • Rate limits. The Slack Web API enforces per-method rate limits; chat.postMessage is limited to roughly 1 message per channel per second, with short bursts tolerated. The WebClient bundled with Bolt retries rate-limited calls automatically; for sustained bursts, queue outbound messages.
  • Threading. Reply in the originating thread, not the channel. The thread_ts of the event tells you where to post.
  • Bot loop prevention. Check event.bot_id before responding. Without this, two bots in the same channel can spiral.

Microsoft Teams integration

Teams gives you three integration paths. Pick based on the use case.

Bot Framework path

The most flexible path is a Teams bot via the Microsoft Bot Framework (botbuilder, currently 4.22.x). This is what you use for conversational agents that need rich message types, adaptive cards, proactive messages, and channel-level deployment.

import {
  CloudAdapter,
  ConfigurationServiceClientCredentialFactory,
  createBotFrameworkAuthenticationFromConfiguration,
  TeamsActivityHandler,
  TurnContext,
} from 'botbuilder'
import { Anthropic } from '@anthropic-ai/sdk'
import express from 'express'

const credentialsFactory = new ConfigurationServiceClientCredentialFactory({
  MicrosoftAppId: process.env.MS_APP_ID,
  MicrosoftAppPassword: process.env.MS_APP_PASSWORD,
  MicrosoftAppType: 'SingleTenant',
  MicrosoftAppTenantId: process.env.MS_TENANT_ID,
})

const auth = createBotFrameworkAuthenticationFromConfiguration(null, credentialsFactory)
const adapter = new CloudAdapter(auth)
const anthropic = new Anthropic()

class EnterpriseAgent extends TeamsActivityHandler {
  constructor() {
    super()
    this.onMessage(async (context: TurnContext, next) => {
      await context.sendActivities([{ type: 'typing' }])

      const text = context.activity.text ?? ''
      const conversationId = context.activity.conversation.id

      const history = await loadHistory(conversationId) // persistence helper; sketched in the conversation-memory section below
      history.push({ role: 'user', content: text })

      const response = await anthropic.messages.create({
        model: 'claude-sonnet-4-5',
        max_tokens: 1024,
        system: 'You are an enterprise assistant.',
        messages: history,
      })

      const reply = response.content
        .filter(b => b.type === 'text')
        .map(b => (b as any).text)
        .join('\n')

      history.push({ role: 'assistant', content: reply })
      await saveHistory(conversationId, history)

      await context.sendActivity({ type: 'message', text: reply })
      await next()
    })
  }
}

const bot = new EnterpriseAgent()

// Express wiring; adapter.process needs the parsed JSON body
const server = express()
server.use(express.json())
server.post('/api/messages', (req, res) => {
  adapter.process(req, res, async context => bot.run(context))
})
server.listen(Number(process.env.PORT ?? 3978))

Adaptive Cards are how you ship rich content in Teams. For an agent that returns structured data — a table, a status, a confirmation prompt — render an Adaptive Card payload rather than markdown. Teams renders markdown poorly compared to Slack.
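
A minimal example of the pattern, dropped into the onMessage handler above. The card schema is standard Adaptive Cards; the service and status values are illustrative, not from a real integration:

import { CardFactory } from 'botbuilder'

// Inside the onMessage handler: render structured output as an Adaptive Card
const card = CardFactory.adaptiveCard({
  type: 'AdaptiveCard',
  version: '1.5',
  body: [
    { type: 'TextBlock', text: 'Deployment status', weight: 'Bolder', size: 'Medium' },
    {
      type: 'FactSet',
      facts: [
        { title: 'Service', value: 'billing-api' }, // illustrative values
        { title: 'Status', value: 'Healthy' },
      ],
    },
  ],
})
await context.sendActivity({ attachments: [card] })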

Message extension path

For "search and insert" scenarios — find a customer record, paste a meeting summary — a Teams message extension is the right surface. Less code than a full bot, deeper integration with the compose box.

Copilot Studio path

For low-code or low-engineering-effort agent deployments inside Teams, Copilot Studio is increasingly the right call. Build the agent topics in Copilot Studio, deploy to Teams in two clicks, and inherit Microsoft's identity, governance, and audit logging. The trade-off is less flexibility — you are bound to the Copilot Studio runtime.

Things that bite you in Teams:

  • Authentication. Teams bots use Entra ID app registrations, not personal access tokens. Single-tenant vs multi-tenant matters and changes the deployment shape.
  • Channel vs personal chat scope. An agent must explicitly declare which scopes it supports in the app manifest. Personal chat is one-to-one; channel scope requires @ mention to trigger.
  • Proactive messaging. Sending an unsolicited message to a user requires a stored conversation reference. Capture it on first contact and persist it.
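
A sketch of that capture-and-replay flow, reusing the adapter from the bot above; saveReference and loadReference are hypothetical persistence helpers:

import { TurnContext } from 'botbuilder'

// Phase 1, inside any handler: capture and persist the reference
const reference = TurnContext.getConversationReference(context.activity)
await saveReference(reference.conversation!.id, reference)

// Phase 2, later and from any process that can construct the adapter:
const stored = await loadReference(conversationId)
await adapter.continueConversationAsync(
  process.env.MS_APP_ID!,
  stored,
  async proactiveContext => {
    await proactiveContext.sendActivity('The report you asked for is ready.')
  },
)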

Email integration

Email is the hardest of the three. The platforms are heterogeneous, the rate limits are unpredictable, and the parsing is messy. Three patterns that work:

Inbound parsing via SES or SendGrid

For organizations that already run AWS SES or SendGrid for outbound mail, inbound parsing is a small extension. SES receipt rules deliver inbound mail to an S3 bucket and trigger a Lambda. SendGrid's Inbound Parse Webhook POSTs to your endpoint.

import { simpleParser } from 'mailparser'

export const handler = async (event: any) => {
  const raw = await fetchRawEmailFromS3(event) // helper: load the raw MIME the receipt rule wrote to S3
  const parsed = await simpleParser(raw)

  const from = parsed.from?.value[0]?.address ?? ''
  const subject = parsed.subject ?? ''
  const body = parsed.text ?? ''

  // Dedupe by Message-Id
  if (await alreadyProcessed(parsed.messageId)) return
  await markProcessed(parsed.messageId!)

  // Route based on the To address or subject
  const agent = routeAgent(parsed.to, subject)
  const reply = await agent.respond({ from, subject, body })

  await sendEmail({
    to: from,
    subject: `Re: ${subject}`,
    body: reply,
    // Set both threading headers; see the list at the end of this section
    inReplyTo: parsed.messageId,
    references: parsed.messageId,
  })
}

IMAP polling

For mailboxes you cannot easily route to a webhook (shared inbox at a customer, a legacy domain), IMAP polling with imapflow works. Poll every 30 to 60 seconds, mark messages read after processing, and persist last-processed UID per folder to recover from restarts.
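
A minimal sketch of that loop with imapflow; loadLastUid, persistLastUid, and handleInboundEmail are hypothetical helpers (the last one feeds raw MIME into the parser shown above):

import { ImapFlow } from 'imapflow'

async function pollOnce() {
  const client = new ImapFlow({
    host: 'imap.example.com',
    port: 993,
    secure: true,
    auth: { user: process.env.IMAP_USER!, pass: process.env.IMAP_PASS! },
  })
  await client.connect()
  const lock = await client.getMailboxLock('INBOX')
  try {
    const lastUid = await loadLastUid('INBOX')
    // Fetch everything newer than the last processed UID. Note: servers may
    // return the newest message even when nothing is new, so keep deduping.
    for await (const msg of client.fetch(
      { uid: `${lastUid + 1}:*` },
      { uid: true, source: true },
    )) {
      await handleInboundEmail(msg.source!)
      await persistLastUid('INBOX', msg.uid)
    }
  } finally {
    lock.release()
    await client.logout()
  }
}

// Naive scheduler; a real deployment would guard against overlapping polls
setInterval(() => pollOnce().catch(console.error), 45_000)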

Microsoft Graph webhooks

For M365 mailboxes, the cleanest pattern is a Graph change notification subscription on the target mailbox. Graph posts to your webhook when new mail arrives. You retrieve the message via the Graph API, process, and reply via the Graph API.

import { Client } from '@microsoft/microsoft-graph-client'

// Token acquisition elided; in production, use Client.initWithMiddleware
// with an Entra ID credential rather than a raw access token
const graph = Client.init({ authProvider: done => done(null, accessToken) })

app.post('/api/graph-notifications', async (req, res) => {
  if (req.query.validationToken) {
    res.type('text/plain').send(req.query.validationToken)
    return
  }
  res.sendStatus(202)

  for (const notification of req.body.value) {
    // Drop notifications that lack the clientState set at subscription time
    if (notification.clientState !== process.env.GRAPH_CLIENT_STATE) continue
    const messageId = notification.resourceData.id
    const message = await graph.api(`/users/${userId}/messages/${messageId}`).get()

    if (await alreadyProcessed(messageId)) continue
    await markProcessed(messageId)

    // bodyPreview is truncated (~255 chars); fetch message.body.content when you need full text
    const reply = await runAgent(message.bodyPreview, message.from.emailAddress.address)
    await graph.api(`/users/${userId}/messages/${messageId}/reply`).post({
      message: { body: { contentType: 'Text', content: reply } },
    })
  }
})

Things that bite you with email:

  • Reply threading. Set the In-Reply-To and References headers correctly or recipients see unrelated threads.
  • HTML vs plain text. Always parse text first; fall back to HTML stripped of tags.
  • Auto-reply loops. Detect X-Autoreply, Auto-Submitted, and "out of office" signals before responding; see the header check sketched after this list. Without this, you will eventually loop with another bot.
  • Sender verification. Trust SPF, DKIM, and DMARC results from your inbound platform before acting on instructions in the email body.
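
A heuristic check over the parsed headers from mailparser; the header names are common conventions, not an exhaustive list:

import type { ParsedMail } from 'mailparser'

function isAutoReply(parsed: ParsedMail): boolean {
  const headers = parsed.headers
  // Auto-Submitted: any value other than "no" means machine-generated
  const autoSubmitted = String(headers.get('auto-submitted') ?? '')
  if (autoSubmitted && autoSubmitted.toLowerCase() !== 'no') return true
  if (headers.has('x-autoreply') || headers.has('x-autorespond')) return true
  // Precedence: bulk or auto_reply is a strong machine-generated signal
  const precedence = String(headers.get('precedence') ?? '').toLowerCase()
  if (precedence === 'bulk' || precedence === 'auto_reply') return true
  const subject = (parsed.subject ?? '').toLowerCase()
  return subject.includes('out of office') || subject.includes('automatic reply')
}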

Frameworks to lean on

A few frameworks have proven themselves for the agent core:

  • Vercel AI SDK (ai, 4.x). Strong streaming primitives, model-agnostic, light. Excellent for Slack and Teams where you want token-by-token delivery.
  • LangChain.js (currently 0.3.x). Heavier, more opinionated. Useful for complex agent graphs and tool use. Be selective — the abstractions can hurt as much as they help.
  • Cloudflare Workers AI. If you want low-latency edge inference for simple tasks, Workers AI gives you Llama and Mistral models at the edge with no cold-start.
  • Anthropic SDK and OpenAI SDK for direct API calls. Both are clean, well-typed, and worth using over LangChain for simple flows.

Rate limiting, deduping, conversation memory, observability

Four operational concerns that determine whether your agent survives the second week.

Rate limiting. Wrap your model client with a per-tenant token bucket. The Anthropic API rate-limits per-organization at 4,000 requests per minute and 400,000 tokens per minute on the standard tier (varies by tier). Slack and Teams enforce per-app rate limits separately. Plan for both.
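
A minimal in-memory sketch of the per-tenant bucket; the capacity and refill numbers are illustrative, and a multi-instance deployment would back this with Redis so every node shares the budget:

class TokenBucket {
  private tokens: number
  private lastRefill = Date.now()

  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity
  }

  tryTake(n = 1): boolean {
    // Refill proportionally to elapsed time, capped at capacity
    const now = Date.now()
    this.tokens = Math.min(
      this.capacity,
      this.tokens + ((now - this.lastRefill) / 1000) * this.refillPerSec,
    )
    this.lastRefill = now
    if (this.tokens < n) return false
    this.tokens -= n
    return true
  }
}

const buckets = new Map<string, TokenBucket>()

function allow(tenantId: string): boolean {
  let bucket = buckets.get(tenantId)
  if (!bucket) {
    bucket = new TokenBucket(60, 1) // 60-request burst, 1/sec refill: illustrative numbers
    buckets.set(tenantId, bucket)
  }
  return bucket.tryTake()
}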

Deduping. Every platform retries. Slack retries on 3-second timeout. Teams retries on 5xx. Graph subscriptions occasionally double-deliver. Use a Redis or DynamoDB-backed idempotency store keyed on the platform's event ID with a 24-hour TTL.
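
One way to build that store with Redis. A single SET NX EX is atomic, which also closes the race in the two-call alreadyProcessed / markProcessed pattern used in the snippets above:

import Redis from 'ioredis'

const redis = new Redis(process.env.REDIS_URL!)

// Exactly one caller wins the key for the TTL window; null means another
// worker already claimed this event
async function claimEvent(eventId: string): Promise<boolean> {
  const result = await redis.set(`evt:${eventId}`, '1', 'EX', 60 * 60 * 24, 'NX')
  return result === 'OK'
}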

Conversation memory. Persist conversation state in DynamoDB, Redis, or a Postgres table keyed by conversation ID. For Slack, the thread_ts is your key. For Teams, the conversation reference. For email, the thread headers. Bound the context window — load the last 10 to 20 messages, not the entire history.
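
One way to implement the loadHistory and saveHistory helpers the Teams snippet calls, reusing the Redis client from the dedupe sketch and bounding history to the last 20 turns:

type Turn = { role: 'user' | 'assistant'; content: string }

async function loadHistory(conversationId: string): Promise<Turn[]> {
  const raw = await redis.get(`conv:${conversationId}`)
  return raw ? (JSON.parse(raw) as Turn[]) : []
}

async function saveHistory(conversationId: string, history: Turn[]): Promise<void> {
  const bounded = history.slice(-20) // bound the context window
  await redis.set(`conv:${conversationId}`, JSON.stringify(bounded), 'EX', 60 * 60 * 24 * 7)
}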

Observability. Most production failures show up first in logs, not in metrics. Log every inbound event, every model call (with token counts), every outbound delivery, and every error. Pipe them to a SIEM; the agent-observability-metrics piece we published covers the tooling in depth. The first thing you will find: model calls that occasionally take 30+ seconds and silently exceed your platform timeout.
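
A sketch of a logged model call wrapper, reusing the anthropic client from the earlier snippets; the usage token counts are part of the Anthropic Messages API response:

async function loggedModelCall(
  params: Anthropic.Messages.MessageCreateParamsNonStreaming,
  meta: { eventId: string },
) {
  const start = Date.now()
  try {
    const response = await anthropic.messages.create(params)
    // Structured log line: latency plus token counts per call
    console.log(JSON.stringify({
      at: 'model_call',
      ...meta,
      ms: Date.now() - start,
      inputTokens: response.usage.input_tokens,
      outputTokens: response.usage.output_tokens,
    }))
    return response
  } catch (err) {
    console.error(JSON.stringify({
      at: 'model_call_error',
      ...meta,
      ms: Date.now() - start,
      error: String(err),
    }))
    throw err
  }
}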

Where the first production failure shows up

In order of likelihood:

  1. Duplicate responses. Deduplication is missing or broken. Users see two answers. The fix is the idempotency store described above.
  2. Silent timeouts. A long model call exceeds the platform's webhook timeout. The platform retries. You see no error, just confused users. Always ack the platform fast and process async.
  3. Rate-limit cascades. A burst of inbound events triggers a burst of outbound messages, hitting the platform's rate limit. Use queues with backoff.
  4. Permission drift. Slack scopes expire, Teams app permissions get revoked, Graph subscriptions expire after 4230 minutes. Re-acquire and refresh on a schedule.
  5. Prompt injection in inbound content. A user pastes a malicious instruction; your agent follows it. Sanitize, scope tool access tightly, and audit.

The copilot-governance-checklist on our site covers the broader governance frame for AI in the enterprise. Read it alongside this guide.

Next steps

Build the agent core once and deploy it to all three surfaces. The marginal cost of adding the second and third platform is small if the core is well-factored. Start with Slack if your culture lives there, with Teams if M365 is the standard, and with email if you have a clear single-use-case workflow that demands it.

NEXT STEP

Ready to ship the next outcome?

One Frequency Consulting brings 25+ years of technology leadership and military discipline to every engagement. First call is operator-grade scoping — sixty minutes, no charge.