How multi-channel AI agents actually work under the hood
A deep dive into the openclawOS kernel: routing, sessions, identity, memory and the tricky parts of making one agent feel coherent across WhatsApp, Telegram, Discord and the rest.
A multi-channel AI agent looks simple from the outside: you message it in Telegram, it replies with the same personality and memory it had when you talked to it in WhatsApp last night. Under the hood, that simplicity is the result of half a dozen design decisions that each could have been done differently.
This is a tour of openclawOS’s architecture — specifically the parts that make “the agent feels coherent across messengers” actually true.
The kernel metaphor isn’t decoration
The Gateway is a long-running daemon. It owns:
- A SQLite database (sessions, memory, bindings, credentials).
- A WebSocket fan-out (Control UI, mobile nodes, debug clients).
- A supervisor for channel apps (separate Node processes).
- A supervisor for agents (in-process by default, sub-process when isolation matters).
This shape comes from a microkernel OS: small core, isolated workers, well-defined IPC. A crash in the Discord adapter does not take down WhatsApp. An infinite loop in a custom agent running as a sub-process does not freeze the Gateway. You get to ship unstable code in a stable container.
Channels: the messy edge
Every messenger has its own personality. WhatsApp goes through Baileys, a careful reverse-engineering of the WhatsApp Web protocol. Telegram is a clean HTTP Bot API, with updates delivered by long polling or webhooks. Discord is half REST, half WebSocket gateway. Signal is signal-cli wrapping Java internals. iMessage is AppleScript and SQLite.
The Gateway hides all of this behind a tiny internal protocol:
inbound: Channel → Gateway: { senderId, body, attachments?, replyContext? }
outbound: Gateway → Channel: { recipientId, body, attachments?, reactions? }
A channel app is responsible for translating between its native protocol and this normalised form. The Gateway never sees a Discord button or a WhatsApp QR code — it sees senderId and body.
The wins:
- Adding a channel is a self-contained project. The Mattermost adapter doesn’t have to understand sessions or routing — that’s the Gateway’s job.
- The Gateway is a single mental model regardless of how many channels are paired.
- You can ship a custom binary that includes only the channels you use, for minimal attack surface.
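To make the normalised protocol concrete, here is a minimal sketch of what a channel adapter's translation step might look like. The field types, the `Attachment` shape, and the `NativeChatEvent` payload are assumptions for illustration; the source names the protocol fields but does not specify their types, and no real messenger SDK is used here.

```typescript
// Normalised inbound shape, following the protocol lines above.
// Field types are assumptions: the source lists field names only.
interface Attachment {
  url: string;
  mimeType: string;
}

interface InboundMessage {
  senderId: string;
  body: string;
  attachments?: Attachment[];
  replyContext?: { messageId: string };
}

// Hypothetical native payload, loosely shaped like a chat platform event.
interface NativeChatEvent {
  author: { id: string };
  content: string;
  files: { href: string; contentType: string }[];
  referencedMessageId?: string;
}

// The adapter's whole job: native shape in, normalised shape out.
function toInbound(event: NativeChatEvent): InboundMessage {
  const msg: InboundMessage = {
    senderId: event.author.id,
    body: event.content,
  };
  if (event.files.length > 0) {
    msg.attachments = event.files.map((f) => ({
      url: f.href,
      mimeType: f.contentType,
    }));
  }
  if (event.referencedMessageId !== undefined) {
    msg.replyContext = { messageId: event.referencedMessageId };
  }
  return msg;
}
```

Everything platform-specific stays on the left-hand side of `toInbound`; the Gateway only ever consumes the return value.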
Bindings: the routing layer
A binding is the answer to “when a message arrives at this place, what should happen?”. Bindings are a small declarative language:
trigger:
  channel: telegram
  scope: { groupId: "-1001234..." }
  match:
    mentions: ["@pi_bot"]
agent: pi
session:
  scope: user
  isolation: strict
hooks:
  beforeSend:
    - call: censor_pii
  afterReceive:
    - call: log_to_jsonl
A binding can match on channel, sender, group, message body, or any combination. The matched message is routed to an agent running with a specified session strategy. Hooks let you intercept before/after the agent runs.
This is where the platform’s flexibility lives. You can have a coding Pi listening in your Discord while a research Pi listens in your iMessage — same Gateway, different bindings.
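The matching step can be sketched in a few lines. This is an illustration under stated assumptions, not the Gateway's actual router: the `Binding` and `RoutedMessage` types are hypothetical, `match` is assumed to nest under `trigger`, a mention is treated as a plain substring of the body, and the first matching binding is assumed to win.

```typescript
// Hypothetical message shape as the router might see it.
interface RoutedMessage {
  channel: string;
  senderId: string;
  groupId?: string;
  body: string;
}

// Hypothetical in-memory form of a binding; every trigger field is optional,
// and an omitted field matches anything.
interface Binding {
  trigger: {
    channel?: string;
    scope?: { groupId?: string; senderId?: string };
    match?: { mentions?: string[] };
  };
  agent: string;
}

function matchesBinding(binding: Binding, msg: RoutedMessage): boolean {
  const { channel, scope, match } = binding.trigger;
  if (channel !== undefined && channel !== msg.channel) return false;
  if (scope?.groupId !== undefined && scope.groupId !== msg.groupId) return false;
  if (scope?.senderId !== undefined && scope.senderId !== msg.senderId) return false;
  if (match?.mentions !== undefined) {
    // Assumption: a mention counts if its text appears anywhere in the body.
    const mentioned = match.mentions.some((m) => msg.body.includes(m));
    if (!mentioned) return false;
  }
  return true;
}

// Assumption: bindings are checked in order and the first match wins.
function routeMessage(bindings: Binding[], msg: RoutedMessage): string | undefined {
  return bindings.find((b) => matchesBinding(b, msg))?.agent;
}
```

A message that matches no binding returns `undefined`; what happens then (drop, default agent, error) is a policy decision the sketch leaves open.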
Sessions: identity, memory, branching
A session is one continuous conversation between a sender and an agent. Sessions belong to sender identity, not channel. This is the magic that makes multi-channel feel like one agent.
How identity gets resolved
Every message arrives with a channel-native sender ID — a phone number, a Discord user ID, an email. The Gateway maps these into a single openclawOS identity via a small table you can edit:
identity "dipankar":
  whatsapp: +49 1577 1234567
  imessage: me@example.com
  telegram: @dipankarsarkar
  discord: 1234567890
When a message from any of those arrives, the session attaches to “dipankar”. One Pi, one memory, four messengers.
For unbound senders (people not in your identity table), the Gateway creates a per-channel anonymous identity. This is what gives a Discord stranger their own session without polluting yours.
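The lookup described above is small enough to sketch directly. The `IdentityTable` type, the `resolveIdentity` function, and the `anon:<channel>:<senderId>` naming scheme are all assumptions for illustration; the source only says that unbound senders get a per-channel anonymous identity.

```typescript
// identity name -> (channel name -> channel-native sender ID)
type IdentityTable = Record<string, Record<string, string>>;

// The example table from the text, expressed as data.
const identities: IdentityTable = {
  dipankar: {
    whatsapp: "+49 1577 1234567",
    imessage: "me@example.com",
    telegram: "@dipankarsarkar",
    discord: "1234567890",
  },
};

function resolveIdentity(table: IdentityTable, channel: string, senderId: string): string {
  for (const [identity, handles] of Object.entries(table)) {
    // Match on the (channel, senderId) pair, never on senderId alone:
    // the same string could belong to different people on different channels.
    if (handles[channel] === senderId) return identity;
  }
  // Unbound sender: scope the anonymous identity to the channel.
  // The naming scheme here is hypothetical.
  return `anon:${channel}:${senderId}`;
}
```

Keying the anonymous identity on the channel is what keeps a stranger on Discord from ever sharing a session with a same-numbered ID elsewhere.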
Compaction
When a session’s context approaches the model’s window, the Gateway calls the LLM with a summarisation prompt and replaces the oldest turns with a tight summary block. The replaced turns stay on disk for audit; only the summary appears in subsequent prompts.
Compaction is opportunistic — when the next message arrives, the Gateway checks whether the prompt will fit and runs compaction inline if it won’t. The user sees a slight latency on that one message. The agent keeps the gist.
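The fit check and the replacement step could look roughly like this. It is a sketch under assumptions: the `Turn` shape, the four-characters-per-token estimate, and the `keepRecent` parameter are all hypothetical, and the LLM summarisation call itself is left out (its output arrives as `summaryText`).

```typescript
interface Turn {
  role: "user" | "assistant" | "summary";
  text: string;
}

// Crude estimate (about 4 characters per token); a real tokenizer would replace this.
function estimateTokens(turns: Turn[]): number {
  return turns.reduce((sum, t) => sum + Math.ceil(t.text.length / 4), 0);
}

interface CompactionPlan {
  toSummarise: Turn[]; // oldest turns, handed to the summarisation prompt
  toKeep: Turn[]; // most recent turns, kept verbatim
}

// Returns null when the prompt already fits, so the common path does no extra work.
function planCompaction(
  turns: Turn[],
  budgetTokens: number,
  keepRecent = 4,
): CompactionPlan | null {
  if (estimateTokens(turns) <= budgetTokens || turns.length <= keepRecent) {
    return null;
  }
  const cut = turns.length - keepRecent;
  return { toSummarise: turns.slice(0, cut), toKeep: turns.slice(cut) };
}

// Builds the prompt-visible history. The input turns are never mutated,
// so the originals can stay on disk for audit.
function applySummary(plan: CompactionPlan, summaryText: string): Turn[] {
  return [{ role: "summary", text: summaryText }, ...plan.toKeep];
}
```

Splitting the decision (`planCompaction`) from the rewrite (`applySummary`) keeps the slow LLM call between the two, which is where the one-message latency would come from.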
Branching
Sometimes you want to explore an idea without polluting the main thread. openclaw fork (or /fork in any channel that supports commands) duplicates the current session and lets you wander. Branches can be merged back, abandoned, or kept as their own threads forever.
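As a rough illustration of what duplicating a session involves, the sketch below copies the turn history and records where the branch came from. The `ChatSession` shape and its `parentId` and `forkedAtTurn` fields are assumptions; the source does not describe how branches are stored or how merging works, so merging is not shown.

```typescript
interface ChatSession {
  id: string;
  parentId?: string; // set on branches, absent on the main thread
  forkedAtTurn?: number; // how many turns existed when the branch was created
  turns: string[];
}

function forkSession(source: ChatSession, branchId: string): ChatSession {
  return {
    id: branchId,
    parentId: source.id,
    forkedAtTurn: source.turns.length,
    // Copy the array so the branch and the main thread can diverge freely.
    turns: [...source.turns],
  };
}
```

Recording the fork point is one way to make a later merge or diff possible: everything in the branch past `forkedAtTurn` is new.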
Memory: vector + structured
Long-term memory is two layers:
- Vector store backed by sqlite-vec. Pi can write semantically searchable notes (“user prefers Postgres over MySQL”) and recall them in future sessions.
- Structured facts in normal SQLite tables. Pi can record arbitrary key-value facts about an identity (“dipankar’s wife: Sara”, “dipankar’s location: Berlin”).
Memory writes happen via explicit tool calls — Pi has to decide to remember. This is intentional: it avoids the failure mode where every conversation pollutes the long-term store.
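The two layers can be illustrated with an in-memory stand-in. This sketch does not use sqlite-vec or SQLite at all: the `AgentMemory` class, its method names, and the brute-force cosine ranking are hypothetical, and embeddings are supplied by the caller rather than computed. It only shows the shape of the operations an explicit memory tool call might map onto.

```typescript
interface MemoryNote {
  text: string;
  embedding: number[];
}

// Cosine similarity; assumes both vectors have the same dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

class AgentMemory {
  private notes: MemoryNote[] = [];
  private facts = new Map<string, Map<string, string>>();

  // Layer 1: semantically searchable notes.
  rememberNote(text: string, embedding: number[]): void {
    this.notes.push({ text, embedding });
  }

  recallNotes(queryEmbedding: number[], limit = 3): string[] {
    return this.notes
      .map((n) => ({ text: n.text, score: cosineSimilarity(queryEmbedding, n.embedding) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, limit)
      .map((n) => n.text);
  }

  // Layer 2: structured key-value facts, scoped to an identity.
  setFact(identity: string, key: string, value: string): void {
    const forIdentity = this.facts.get(identity) ?? new Map<string, string>();
    forIdentity.set(key, value);
    this.facts.set(identity, forIdentity);
  }

  getFact(identity: string, key: string): string | undefined {
    return this.facts.get(identity)?.get(key);
  }
}
```

Nothing is written unless `rememberNote` or `setFact` is called, which mirrors the write-on-demand behaviour described above.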
Agents: in-process and bring-your-own
Pi is the default agent. It speaks to the configured LLM provider and runs its tool loop in-process. For most users, this is what they want.
For custom agents, openclawOS exposes a tiny interface:
export interface Agent {
  name: string;
  respond(input: AgentInput): AsyncIterable<AgentEvent>;
}
Implement it, drop the file under ~/.openclaw/agents/, and hot-reload picks it up. The Gateway treats your agent the same as Pi: it gets a session, can call tools, and can read memory.
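A minimal implementation of that interface might look like the following. The `Agent` interface is taken from the text; the `AgentInput` and `AgentEvent` shapes are assumptions, since the source does not define them, and the `collectEvents` helper is a hypothetical stand-in for whatever the Gateway does with the event stream.

```typescript
// Assumed shapes: the source names these types but does not define them.
interface AgentInput {
  senderId: string;
  body: string;
}

type AgentEvent = { type: "text"; text: string } | { type: "done" };

// Interface as given in the text.
interface Agent {
  name: string;
  respond(input: AgentInput): AsyncIterable<AgentEvent>;
}

// An async generator is the most direct way to produce an AsyncIterable:
// each `yield` becomes one event the consumer can act on immediately.
const echoAgent: Agent = {
  name: "echo",
  async *respond(input: AgentInput): AsyncIterable<AgentEvent> {
    yield { type: "text", text: `You said: ${input.body}` };
    yield { type: "done" };
  },
};

// Hypothetical consumer: drains the stream into an array.
async function collectEvents(agent: Agent, input: AgentInput): Promise<AgentEvent[]> {
  const events: AgentEvent[] = [];
  for await (const event of agent.respond(input)) {
    events.push(event);
  }
  return events;
}
```

Returning an `AsyncIterable` rather than a single promise lets an agent stream partial output, which matters when the reply comes from an LLM token by token.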
For agents that need isolation (e.g. running untrusted user-submitted code), the Gateway supports running an agent in a worker subprocess with restricted file system and network access.
What this all gives you
The end result, for the user, is a chatbot that feels like a single coherent assistant across every messenger they use. The architecture exists to make that possible without inheriting half a dozen messenger-specific complications.
Most of the magic is, as usual, in the boring parts: a clean protocol, isolated processes, explicit identity, write-on-demand memory. None of these are novel ideas — they’re just the right ideas, applied carefully, in one place.
Frequently asked
The Gateway is called a kernel because the daemon plays the same structural role as one: it owns the data plane, isolates processes (channel adapters and agents), schedules work, and brokers shared resources (sessions, memory, credentials). The OS metaphor is structural, not marketing.