Description
Introduce an UntrustedContext layer to openab so that external, user-editable metadata — starting with Discord channel topics — can be safely injected into downstream ACP agent prompts.
Today, openab's SenderContext carries only trusted, structured identity fields (sender_id, sender_name, channel_id, etc.). There is no mechanism to pass channel-level metadata like the channel topic or description. This means agents running on openab have no awareness of the channel's stated purpose — information that other platforms already provide.
Use Cases
- Channel-scoped behavior — A channel topic like "Production alerts only — respond in English" tells the agent the expected tone, language, and scope without per-channel config.
- Project context — Teams using Discord forum channels set topics like "Planning and coordination for Project X". The agent can use this to stay on-topic.
- Self-documenting channels — Operators can steer agent behavior by editing the channel topic in Discord UI, no config file changes needed.
- Security — Channel topics are user-editable by anyone with MANAGE_CHANNELS permission. They must not be treated as trusted system instructions, or they become a prompt injection vector.
Prior Art Investigation
We investigated how OpenClaw and Hermes Agent handle channel topics:
OpenClaw (openclaw/openclaw, commit 7e13f3f)
OpenClaw has a mature UntrustedContext system with sanitization + isolation:
- Read topic from Discord channel object — channel-access.ts L49-51: resolveDiscordChannelTopicSafe() safely reads channel.topic.
- Pass through message handler — message-handler.process.ts L310: channelTopic: channelInfo?.topic.
- Wrap as untrusted content — inbound-context.ts L59-66: Uses buildUntrustedChannelMetadata() to wrap the topic with security boundaries.
- Sanitization pipeline — channel-metadata.ts and external-content.ts:
  - Wraps content in <<<EXTERNAL_UNTRUSTED_CONTENT id="random">>> markers with random IDs to prevent spoofing
  - Strips LLM special tokens (<|im_start|>, [INST], etc. — 20+ patterns)
  - Neutralizes Unicode homoglyph attacks (fullwidth chars, angle bracket variants)
  - Detects prompt injection patterns (ignore previous instructions, you are now a..., etc.)
  - Truncates to 400 chars/entry, 800 chars total
- Isolation — The untrusted context is placed in a separate UntrustedContext[] array, never mixed into GroupSystemPrompt. The tests in inbound-context.test.ts and message-handler.inbound-context.test.ts explicitly verify that a topic like "Ignore system instructions" stays in the untrusted layer.
- Default on, no opt-out — Channel topic injection is always active for guild channels; DMs skip it automatically (isGuild: false).
Hermes Agent (NousResearch/hermes-agent, commit 192e7eb)
Hermes Agent also supports channel topics, but with a simpler (less secure) approach:
- SessionSource.chat_topic — session.py L86: chat_topic: Optional[str] = None.
- Discord reads topic — discord.py L3031-3038: _get_effective_topic() reads channel.topic, falls back to parent forum topic for threads.
- Injected as plain text in system prompt — session.py L282-283: **Channel Topic:** {topic} — directly in the system prompt with no untrusted wrapping or sanitization.
Key Differences
| Aspect | OpenClaw | Hermes Agent | openab (current) |
| --- | --- | --- | --- |
| Reads channel topic | ✅ | ✅ | ❌ |
| Untrusted context layer | ✅ (sanitize + isolate) | ❌ (plain text in system prompt) | ❌ (no layer exists) |
| Prompt injection protection | ✅ (boundary markers, token stripping, homoglyph defense) | ❌ | N/A |
| Forum thread fallback | ✅ (via resolveDiscordChannelInfoSafe) | ✅ (via _get_effective_topic) | N/A |
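Should openab adopt the forum thread fallback that both projects implement, the lookup could be sketched roughly as below. This is only an illustration: ChannelMeta is a hypothetical, simplified stand-in for whatever channel type openab ends up using (it is not the serenity GuildChannel), and effective_topic is an assumed name.

```rust
/// Hypothetical, simplified channel view used only for this sketch.
/// It is NOT the serenity `GuildChannel` type.
pub struct ChannelMeta<'a> {
    /// The channel's own topic, if any.
    pub topic: Option<&'a str>,
    /// The parent channel (for example the forum a thread belongs to).
    pub parent: Option<&'a ChannelMeta<'a>>,
}

/// Treat a missing or whitespace-only topic as absent.
fn non_empty(topic: Option<&str>) -> Option<&str> {
    topic.map(str::trim).filter(|t| !t.is_empty())
}

/// Return the channel's own topic, or fall back to the parent's topic
/// when the channel (typically a thread) has none.
pub fn effective_topic<'a>(ch: &ChannelMeta<'a>) -> Option<&'a str> {
    non_empty(ch.topic).or_else(|| ch.parent.and_then(|p| non_empty(p.topic)))
}
```

With this shape, a thread without a topic inherits its forum's topic, while a channel with neither yields None and no untrusted block would be emitted.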
Proposed Design for openab
Architecture
openab — Proposed UntrustedContext Layer
═══════════════════════════════════════════════════════════════════
Discord Message
│
▼
┌───────────────────────────────────────────────────────────────┐
│ discord.rs │
│ │
│ msg.channel_id.to_channel(&ctx.http).await │
│ │ │
│ ▼ │
│ GuildChannel { │
│ topic: Some("Production alerts — respond in English"), │
│ name: "ops-alerts", │
│ thread_metadata, parent_id, owner_id, ... │
│ } │
│ │ │
│ │ gc.topic already available at line ~387 │
│ │ (currently only thread_metadata/parent_id used) │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ UntrustedContext::wrap(source, label, content) │ │
│ │ │ │
│ │ 1. Truncate (400 char/entry, 800 total) │ │
│ │ 2. Strip LLM special tokens │ │
│ │ 3. Neutralize boundary marker spoofing │ │
│ │ 4. Wrap in random-ID boundary markers │ │
│ └──────────────────────┬──────────────────────────────┘ │
│ │ │
└─────────────────────────┼──────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ ACP Prompt Assembly (adapter.rs) │
│ │
│ content_blocks: [ │
│ Text { │
│ "<sender_context>{...}</sender_context>" ← trusted │
│ │
│ "<<<EXTERNAL_UNTRUSTED_CONTENT id=\"a1b2\">>>" │
│ "Source: Channel metadata" │
│ "---" │
│ "UNTRUSTED channel metadata (discord)" │
│ "Discord channel topic:" │
│ "Production alerts — respond in English" │
│ "<<<END_EXTERNAL_UNTRUSTED_CONTENT id=\"a1b2\">>>" │
│ ← isolated │
│ "user's actual message" │
│ }, │
│ Image { ... }, ← if image attached │
│ ] │
└──────────────────────┬────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Downstream ACP Agent (Kiro CLI / Claude Code / etc) │
│ │
│ Agent sees channel topic as reference context, │
│ but knows it's untrusted external content. │
└───────────────────────────────────────────────────────────────┘
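The wrap step in the diagram could look roughly like the following. This is a minimal sketch under stated assumptions, not the proposed implementation: the function names, the 16-hex-digit ID, and the "[[[" replacement are illustrative choices, the output layout is simplified from the diagram, and token stripping is left out for brevity. The ID is derived from the standard library's randomly seeded hasher only to keep the example dependency-free; a real implementation would more likely use a proper CSPRNG.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Per-entry limit taken from the proposal (400 chars/entry).
const MAX_ENTRY_CHARS: usize = 400;

/// Assumption: derive a hard-to-predict hex ID from std's randomly seeded
/// hasher. This avoids external crates but is not a vetted CSPRNG.
fn random_id() -> String {
    let hasher = RandomState::new().build_hasher();
    format!("{:016x}", hasher.finish())
}

/// Keep at most `max` characters (not bytes), so UTF-8 is never split.
fn truncate_chars(s: &str, max: usize) -> String {
    s.chars().take(max).collect()
}

/// Wrap one piece of untrusted metadata using a fresh random ID.
/// `source` and `label` are trusted strings supplied by openab itself.
pub fn wrap(source: &str, label: &str, content: &str) -> String {
    wrap_with_id(source, label, content, &random_id())
}

/// Same as `wrap`, with the ID passed in so the output is testable.
fn wrap_with_id(source: &str, label: &str, content: &str, id: &str) -> String {
    let body = truncate_chars(content, MAX_ENTRY_CHARS)
        // Neutralize forged markers: the content can no longer contain
        // the "<<<" / ">>>" sequences that delimit a boundary line.
        .replace("<<<", "[[[")
        .replace(">>>", "]]]");
    format!(
        "<<<EXTERNAL_UNTRUSTED_CONTENT id=\"{id}\">>>\n\
         Source: {source}\n\
         ---\n\
         {label}:\n\
         {body}\n\
         <<<END_EXTERNAL_UNTRUSTED_CONTENT id=\"{id}\">>>"
    )
}
```

Truncating before any other processing bounds the work done on attacker-controlled input, and because the end marker carries the same random ID as the start marker, a topic that tries to close the block early cannot guess the right line to forge.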
Protection Layers
The UntrustedContext wrapper provides defense in depth by combining sanitization with isolation:
| Layer | What it does | Why |
| --- | --- | --- |
| Truncation | 400 chars/entry, 800 chars total | Limit attack surface |
| LLM token stripping | Remove <\|im_start\|>, [INST], <s>, etc. | Prevent chat-template escape |
| Homoglyph folding | Normalize fullwidth chars, Unicode angle brackets | Prevent visual spoofing of markers |
| Marker sanitization | Replace spoofed <<<EXTERNAL_UNTRUSTED_CONTENT>>> | Prevent boundary escape |
| Random boundary IDs | <<<EXTERNAL_UNTRUSTED_CONTENT id="random_hex">>> | Attacker can't predict end marker |
| Isolation | Separate from system prompt / sender_context | Semantic separation for LLM |
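The homoglyph folding and token stripping layers might be combined along these lines. This is a sketch only: the token list is a small illustrative subset (OpenClaw reportedly covers 20+ patterns), the set of folded angle-bracket code points is an assumption, matching is case-sensitive, and the function names are hypothetical.

```rust
/// Illustrative subset of chat-template tokens; a real list would be longer.
const SPECIAL_TOKENS: &[&str] = &[
    "<|im_start|>", "<|im_end|>", "[INST]", "[/INST]", "<s>", "</s>",
];

/// Fold fullwidth ASCII variants (U+FF01..=U+FF5E) and a few Unicode angle
/// brackets to plain ASCII so later string matching cannot be bypassed.
fn fold_homoglyphs(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            // Fullwidth forms sit exactly 0xFEE0 above their ASCII twins.
            '\u{FF01}'..='\u{FF5E}' => char::from_u32(c as u32 - 0xFEE0).unwrap_or(c),
            '\u{3008}' | '\u{2329}' | '\u{27E8}' | '\u{2039}' => '<',
            '\u{3009}' | '\u{232A}' | '\u{27E9}' | '\u{203A}' => '>',
            _ => c,
        })
        .collect()
}

/// Remove known special tokens, repeating until nothing changes so that
/// deleting one token cannot splice a new one together from its neighbors
/// (for example "<|im_<s>start|>").
fn strip_special_tokens(input: &str) -> String {
    let mut out = input.to_string();
    loop {
        let before = out.len();
        for token in SPECIAL_TOKENS {
            out = out.replace(token, "");
        }
        if out.len() == before {
            return out;
        }
    }
}

/// Fold first, then strip, so disguised tokens are caught as well.
pub fn sanitize(input: &str) -> String {
    strip_special_tokens(&fold_homoglyphs(input))
}
```

The order matters: folding must run before stripping and before marker sanitization, otherwise a fullwidth "＜" would pass the string checks and still read as "<" to the model.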
Files Changed
| File | Change | Description |
| --- | --- | --- |
| src/untrusted.rs | NEW | UntrustedContext wrapper — sanitize, truncate, wrap with boundary markers (~100 lines) |
| src/adapter.rs | MOD | Add untrusted_context: Option<Vec<String>> to prompt assembly |
| src/discord.rs | MOD | Read gc.topic and gc.name from the existing to_channel() call at L387, wrap as untrusted |
| src/gateway.rs | MOD | Add topic: Option<String> and name: Option<String> to ChannelInfo for Telegram/LINE |
| gateway/src/main.rs | MOD | Pass topic/name in gateway ChannelInfo |
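To make the adapter.rs change more concrete, prompt assembly with the new optional field might be sketched as follows. The PromptInput struct and assemble_text function are hypothetical names for illustration; only the untrusted_context: Option<Vec<String>> field and the ordering (trusted sender context, then untrusted blocks, then the user's message) come from the proposal, and the blocks are assumed to be already wrapped.

```rust
/// Hypothetical input to prompt assembly; the field names other than
/// `untrusted_context` are assumptions made for this sketch.
pub struct PromptInput {
    /// Serialized trusted SenderContext.
    pub sender_context_json: String,
    /// Already-wrapped untrusted blocks; None when there is no metadata
    /// (for example in DMs).
    pub untrusted_context: Option<Vec<String>>,
    /// The user's actual message.
    pub user_message: String,
}

/// Build the text block: trusted context first, untrusted blocks next,
/// and the user's message last. The two layers are never merged.
pub fn assemble_text(input: &PromptInput) -> String {
    let mut parts = vec![format!(
        "<sender_context>{}</sender_context>",
        input.sender_context_json
    )];
    if let Some(blocks) = &input.untrusted_context {
        parts.extend(blocks.iter().cloned());
    }
    parts.push(input.user_message.clone());
    parts.join("\n")
}
```

Keeping the field optional means existing call sites that pass None produce the same prompt as today, which should make the change easy to roll out per platform.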
Future Extensions
Once the UntrustedContext layer exists, it can carry more than just channel topics:
- Channel name — useful for agents to know where they are
- Slack channel purpose/description
- Telegram group description
- Webhook payloads (email bodies, CI notifications)
- Forum thread tags
References
Discord: https://discord.com/channels/1491295327620169908/1497578757333057577/1497930272635748473
OpenClaw: channel-access.ts, inbound-context.ts
OpenClaw: channel-metadata.ts, external-content.ts
Hermes Agent: session.py L86, discord.py L3031-3038, session.py L282-283
to_channel() call: discord.rs L387
SenderContext struct: adapter.rs L36-52