Why call it a "kernel"?

Because the gateway daemon plays the same structural role: it owns the data plane, isolates processes (channel adapters and agents), schedules work, and brokers shared resources (sessions, memory, credentials). The OS metaphor is structural, not marketing.

What's the hardest part of multi-channel?

Identity. Two messages from the same human in two different messengers need to feel like the same conversation, but a user ID in one messenger says nothing about who that person is in another. openclawOS solves this with explicit identity linking.

How do sessions handle long conversations?

With compaction. When a session's context approaches the model's window, the gateway calls the LLM with a summarisation prompt and replaces the oldest turns with a tight summary. The user notices at most a slight delay on that one message; the agent keeps the gist; token usage stays bounded.

How multi-channel AI agents actually work under the hood

A multi-channel AI agent looks simple from the outside: you message it in Telegram, it replies with the same personality and memory it had when you talked to it in WhatsApp last night. Under the hood, that simplicity is the result of half a dozen design decisions, each of which could have gone differently.

This is a tour of openclawOS’s architecture — specifically the parts that make “the agent feels coherent across messengers” actually true.

The kernel metaphor isn’t decoration

The Gateway is a long-running daemon. It owns:

A SQLite database (sessions, memory, bindings, credentials).
A WebSocket fan-out (Control UI, mobile nodes, debug clients).
A supervisor for channel apps (separate Node processes).
A supervisor for agents (in-process by default, sub-process when isolation matters).

This shape comes from microkernel operating systems: small core, isolated workers, well-defined IPC. A crash in the Discord adapter does not take down WhatsApp. An infinite loop in a custom agent running in its own sub-process does not freeze the Gateway. You get to ship unstable code in a stable container.
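The isolation property can be sketched as a supervisor that keeps separate bookkeeping for each worker. This is a minimal illustration, not openclawOS's actual supervisor: the `Supervisor` class, the worker names, and the backoff values are all assumptions, and real process spawning is left out so the logic stays visible.

```typescript
// Hypothetical sketch: per-worker crash accounting with exponential backoff.
type WorkerState = "running" | "restarting";

interface WorkerRecord {
  state: WorkerState;
  crashes: number;
}

class Supervisor {
  private workers = new Map<string, WorkerRecord>();

  register(name: string): void {
    this.workers.set(name, { state: "running", crashes: 0 });
  }

  // Called when one worker exits unexpectedly. Only that worker's record
  // changes; every other worker stays in the "running" state.
  // Returns the delay in milliseconds before the restart attempt.
  onCrash(name: string): number {
    const rec = this.workers.get(name);
    if (!rec) throw new Error(`unknown worker: ${name}`);
    rec.crashes += 1;
    rec.state = "restarting";
    // 1s, 2s, 4s, ... capped at 30s (assumed values).
    return Math.min(1000 * 2 ** (rec.crashes - 1), 30000);
  }

  onRestarted(name: string): void {
    const rec = this.workers.get(name);
    if (rec) rec.state = "running";
  }

  stateOf(name: string): WorkerState | undefined {
    return this.workers.get(name)?.state;
  }
}

const supervisor = new Supervisor();
supervisor.register("discord");
supervisor.register("whatsapp");
const delayMs = supervisor.onCrash("discord"); // whatsapp is unaffected
```

In a real daemon, `onCrash` would be wired to the child process's exit event and the returned delay would schedule the respawn.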

Channels: the messy edge

Every messenger has its own personality. WhatsApp goes through Baileys, a careful reverse-engineering of the WhatsApp Web protocol. Telegram is a clean HTTP Bot API with long polling or webhooks. Discord is half REST API, half WebSocket gateway. Signal is signal-cli wrapping Java internals. iMessage is AppleScript and SQLite.

The Gateway hides all of this behind a tiny internal protocol:

inbound:  Channel → Gateway: { senderId, body, attachments?, replyContext? }
outbound: Gateway → Channel: { recipientId, body, attachments?, reactions? }

A channel app is responsible for translating between its native protocol and this normalised form. The Gateway never sees a Discord button or a WhatsApp QR code — it sees senderId and body.
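To make the translation step concrete, here is a small sketch of what a channel app's normalisation function might look like. The field names `senderId`, `body`, `attachments`, and `replyContext` come from the protocol above; the concrete types, the `Attachment` shape, and the `NativeChatEvent` input are assumptions invented for illustration.

```typescript
// Field names follow the protocol above; the concrete types are assumed.
interface Attachment {
  mimeType: string;
  url: string;
}

interface InboundMessage {
  senderId: string;
  body: string;
  attachments?: Attachment[];
  replyContext?: { messageId: string };
}

// A made-up native event, loosely shaped like what a chat platform might emit.
interface NativeChatEvent {
  author: { id: string };
  content: string;
  files: { contentType: string; url: string }[];
  repliedToId?: string;
}

// The adapter's whole job: native shape in, normalised shape out.
function normalise(event: NativeChatEvent): InboundMessage {
  const msg: InboundMessage = {
    senderId: event.author.id,
    body: event.content,
  };
  // Optional fields are only set when the native event carries them.
  if (event.files.length > 0) {
    msg.attachments = event.files.map((f) => ({
      mimeType: f.contentType,
      url: f.url,
    }));
  }
  if (event.repliedToId !== undefined) {
    msg.replyContext = { messageId: event.repliedToId };
  }
  return msg;
}

const inbound = normalise({
  author: { id: "1234567890" },
  content: "hello",
  files: [],
});
```

Everything platform-specific stays on the adapter's side of this function, which is why the Gateway only ever deals with the normalised shape.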

The wins:

Adding a channel is a self-contained project. The Mattermost adapter doesn’t have to understand sessions or routing — that’s the Gateway’s job.
The Gateway is a single mental model regardless of how many channels are paired.
You can ship a custom binary that includes only the channels you use, for minimal attack surface.

Bindings: the routing layer

A binding is the answer to “when a message arrives at this place, what should happen?” Bindings are written in a small declarative language:

trigger:
  channel: telegram
  scope: { groupId: "-1001234..." }
  match:
    mentions: ["@pi_bot"]
agent: pi
session:
  scope: user
  isolation: strict
hooks:
  beforeSend:
    - call: censor_pii
  afterReceive:
    - call: log_to_jsonl

A binding can match on channel, sender, group, message body, or any combination. The matched message is routed to an agent running with a specified session strategy. Hooks let you intercept before/after the agent runs.
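The matching step can be sketched as a predicate over a binding's trigger. This is an illustration under stated assumptions: it covers only the `channel`, `scope.groupId`, and `match.mentions` fields from the example above, the "first match wins" precedence is a guess, and the group ID `"group-a"` and the second binding are hypothetical.

```typescript
interface Binding {
  trigger: {
    channel: string;
    scope?: { groupId?: string };
    match?: { mentions?: string[] };
  };
  agent: string;
}

interface RoutedMessage {
  channel: string;
  groupId?: string;
  body: string;
}

// Every condition present on the trigger must hold; absent ones are ignored.
function matches(b: Binding, m: RoutedMessage): boolean {
  if (b.trigger.channel !== m.channel) return false;
  const groupId = b.trigger.scope?.groupId;
  if (groupId !== undefined && groupId !== m.groupId) return false;
  const mentions = b.trigger.match?.mentions;
  if (mentions !== undefined && !mentions.some((x) => m.body.includes(x))) {
    return false;
  }
  return true;
}

// Assumed precedence rule: the first matching binding wins.
function route(bindings: Binding[], m: RoutedMessage): string | undefined {
  return bindings.find((b) => matches(b, m))?.agent;
}

const bindings: Binding[] = [
  {
    trigger: {
      channel: "telegram",
      scope: { groupId: "group-a" },
      match: { mentions: ["@pi_bot"] },
    },
    agent: "pi",
  },
  { trigger: { channel: "discord" }, agent: "coding-pi" },
];

const routed = route(bindings, {
  channel: "telegram",
  groupId: "group-a",
  body: "hey @pi_bot, what's up?",
});
```

Session strategy and hooks would hang off the matched binding in the same way as `agent` does here.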

This is where the platform’s flexibility lives. You can have a coding Pi listening in your Discord while a research Pi listens in your iMessage — same Gateway, different bindings.

Sessions: identity, memory, branching

A session is one continuous conversation between a sender and an agent. Sessions belong to a sender identity, not to a channel. This is the magic that makes multi-channel feel like one agent.

How identity gets resolved

Every message arrives with a channel-native sender ID — a phone number, a Discord user ID, an email. The Gateway maps these into a single openclawOS identity via a small table you can edit:

identity "dipankar":
  whatsapp:   +49 1577 1234567
  imessage:   me@example.com
  telegram:   @dipankarsarkar
  discord:    1234567890

When a message from any of those arrives, the session attaches to “dipankar”. One Pi, one memory, four messengers.

For unbound senders (people not in your identity table), the Gateway creates a per-channel anonymous identity. This is what gives a Discord stranger their own session without polluting yours.
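The lookup and its anonymous fallback can be sketched in a few lines. The table contents mirror the example above; the `resolveIdentity` function and the `anon:<channel>:<id>` naming scheme are assumptions made for illustration, not the Gateway's actual format.

```typescript
type Channel = "whatsapp" | "imessage" | "telegram" | "discord";

// identity name -> channel -> native sender ID, mirroring the table above.
const identityTable: Record<string, Partial<Record<Channel, string>>> = {
  dipankar: {
    whatsapp: "+49 1577 1234567",
    imessage: "me@example.com",
    telegram: "@dipankarsarkar",
    discord: "1234567890",
  },
};

function resolveIdentity(channel: Channel, senderId: string): string {
  for (const name of Object.keys(identityTable)) {
    const ids = identityTable[name];
    if (ids !== undefined && ids[channel] === senderId) return name;
  }
  // Unbound sender: scope the anonymous identity to the channel so the same
  // raw ID on two different messengers never collides.
  return `anon:${channel}:${senderId}`;
}

const known = resolveIdentity("telegram", "@dipankarsarkar");
const stranger = resolveIdentity("discord", "999");
```

Note that the match is on the (channel, ID) pair, never on the raw ID alone, so a Discord ID that happens to equal someone's ID elsewhere still resolves to a separate anonymous identity.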

Compaction

When a session’s context approaches the model’s window, the Gateway calls the LLM with a summarisation prompt and replaces the oldest turns with a tight summary block. The replaced turns stay on disk for audit; only the summary appears in subsequent prompts.

Compaction is opportunistic: when the next message arrives, the Gateway checks whether the prompt will fit and, if it won’t, runs compaction inline. The user sees slightly higher latency on that one message. The agent keeps the gist.
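The check-then-compact step might look roughly like the sketch below. It is an illustration under assumptions: the 4-characters-per-token estimate is a crude stand-in for a real tokenizer, the `keepRecent` parameter is invented, and `summarise` is injected so that no particular LLM API is implied.

```typescript
interface Turn {
  role: "user" | "assistant" | "summary";
  text: string;
}

// Crude estimate (about 4 characters per token); a real system would use
// the model's own tokenizer.
function estimateTokens(turns: Turn[]): number {
  return turns.reduce((n, t) => n + Math.ceil(t.text.length / 4), 0);
}

// Stand-in for the LLM call with a summarisation prompt.
type Summarise = (turns: Turn[]) => Promise<string>;

async function compactIfNeeded(
  turns: Turn[],
  budget: number,
  keepRecent: number,
  summarise: Summarise,
): Promise<Turn[]> {
  // Fits already, or nothing old enough to fold away: leave the prompt alone.
  if (estimateTokens(turns) <= budget || turns.length <= keepRecent) {
    return turns;
  }
  const cut = turns.length - keepRecent;
  const oldTurns = turns.slice(0, cut);
  const recentTurns = turns.slice(cut);
  // The input array is not mutated, so the originals can be kept for audit.
  const summary = await summarise(oldTurns);
  return [{ role: "summary", text: summary }, ...recentTurns];
}
```

A single pass does not guarantee the result fits the budget; a fuller version would re-check after summarising and fold further if needed.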

Branching

Sometimes you want to explore an idea without polluting the main thread. openclaw fork (or /fork in any channel that supports commands) duplicates the current session and lets you wander. Branches can be merged back, abandoned, or kept as their own threads forever.
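At its simplest, a fork is a copy of the turn history with a pointer back to the parent. The sketch below assumes a hypothetical `Session` shape and `fork` function; how openclawOS actually stores branches, and how merging works, is not covered here.

```typescript
interface Session {
  id: string;
  parentId?: string;
  turns: string[];
}

// Duplicate the history so the branch can diverge without touching the parent.
function fork(session: Session, newId: string): Session {
  return {
    id: newId,
    parentId: session.id,
    turns: session.turns.slice(), // copy, not a shared reference
  };
}

const mainSession: Session = { id: "main", turns: ["hi", "hello"] };
const branch = fork(mainSession, "main-experiment");
branch.turns.push("what if we rewrote it in Rust?");
```

Keeping `parentId` around is what would make a later merge or abandon operation possible.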

Memory: vector + structured

Long-term memory is two layers:

Vector store backed by sqlite-vec. Pi can write semantically searchable notes (“user prefers Postgres over MySQL”) and recall them in future sessions.
Structured facts in normal SQLite tables. Pi can record arbitrary key-value facts about an identity (“dipankar’s wife: Sara”, “dipankar’s location: Berlin”).

Memory writes happen via explicit tool calls — Pi has to decide to remember. This is intentional: it avoids the failure mode where every conversation pollutes the long-term store.
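The write-on-demand rule can be illustrated with the structured-facts layer. In this sketch an in-memory `Map` stands in for the SQLite tables, and the tool names `remember_fact` and `recall_fact` are hypothetical; the point is only that nothing is stored unless the agent issues an explicit tool call.

```typescript
type ToolCall =
  | { tool: "remember_fact"; identity: string; key: string; value: string }
  | { tool: "recall_fact"; identity: string; key: string };

class FactStore {
  // identity -> key -> value; a stand-in for a normal SQLite table.
  private facts = new Map<string, Map<string, string>>();

  // The only write path: ordinary chat turns never reach this method.
  handle(call: ToolCall): string | undefined {
    if (call.tool === "remember_fact") {
      let forIdentity = this.facts.get(call.identity);
      if (!forIdentity) {
        forIdentity = new Map<string, string>();
        this.facts.set(call.identity, forIdentity);
      }
      forIdentity.set(call.key, call.value);
      return "ok";
    }
    return this.facts.get(call.identity)?.get(call.key);
  }
}

const store = new FactStore();
const before = store.handle({ tool: "recall_fact", identity: "dipankar", key: "location" });
store.handle({ tool: "remember_fact", identity: "dipankar", key: "location", value: "Berlin" });
const after = store.handle({ tool: "recall_fact", identity: "dipankar", key: "location" });
```

Because facts are keyed by identity rather than by channel, a fact recorded during a WhatsApp conversation is available when the same person writes from Telegram.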

Agents: in-process and bring-your-own

Pi is the default agent. It speaks to the configured LLM provider and runs its tool loop in-process. For most users, this is what they want.

For custom agents, openclawOS exposes a tiny interface:

export interface Agent {
  name: string;
  respond(input: AgentInput): AsyncIterable<AgentEvent>;
}

Implement it, drop the file under ~/.openclaw/agents/, and hot-reload picks it up. The Gateway treats your agent the same as Pi: it gets a session, can call tools, and can read memory.
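As a sketch of what an implementation might look like, here is a trivial echo agent. The `Agent` interface is the one shown above; `AgentInput` and `AgentEvent` are not spelled out in this article, so the minimal shapes below are assumptions made only so the example type-checks on its own.

```typescript
// Assumed shapes: the real AgentInput and AgentEvent may carry more fields.
interface AgentInput {
  text: string;
}

type AgentEvent =
  | { type: "text"; delta: string }
  | { type: "done" };

interface Agent {
  name: string;
  respond(input: AgentInput): AsyncIterable<AgentEvent>;
}

const echoAgent: Agent = {
  name: "echo",
  // An async generator satisfies AsyncIterable, so events stream out one
  // at a time instead of arriving as a single finished reply.
  async *respond(input: AgentInput): AsyncIterable<AgentEvent> {
    for (const word of input.text.split(" ")) {
      yield { type: "text", delta: word };
    }
    yield { type: "done" };
  },
};
```

A caller consumes the stream with `for await (const event of echoAgent.respond({ text: "hello there" }))`, which is presumably how a gateway would forward partial output to a channel as it arrives.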

For agents that need isolation (e.g. running untrusted user-submitted code), the Gateway supports running an agent in a worker subprocess with restricted file system and network access.

What this all gives you

The end result, for the user, is a chatbot that feels like a single coherent assistant across every messenger they use. The architecture exists to make that possible without inheriting half a dozen messenger-specific complications.

Most of the magic is, as usual, in the boring parts: a clean protocol, isolated processes, explicit identity, write-on-demand memory. None of these are novel ideas — they’re just the right ideas, applied carefully, in one place.
