feat: introduce UntrustedContext layer with channel topic support

## Description

Introduce an **UntrustedContext layer** to openab so that external, user-editable metadata — starting with **Discord channel topics** — can be safely injected into downstream ACP agent prompts.

Today, openab's `SenderContext` carries only trusted, structured identity fields (`sender_id`, `sender_name`, `channel_id`, etc.). There is no mechanism to pass channel-level metadata like the channel topic or description. This means agents running on openab have no awareness of the channel's stated purpose, even though comparable projects such as OpenClaw and Hermes Agent already pass it to their agents.

## Use Cases

1. **Channel-scoped behavior** — A channel topic like `"Production alerts only — respond in English"` tells the agent the expected tone, language, and scope without per-channel config.
2. **Project context** — Teams using Discord forum channels set topics like `"Planning and coordination for Project X"`. The agent can use this to stay on-topic.
3. **Self-documenting channels** — Operators can steer agent behavior by editing the channel topic in the Discord UI; no config file changes are needed.
4. **Security** — Channel topics are user-editable by anyone with `MANAGE_CHANNELS` permission. They **must not** be treated as trusted system instructions, or they become a prompt injection vector.

## Prior Art Investigation

We investigated how **OpenClaw** and **Hermes Agent** handle channel topics:

### OpenClaw (`openclaw/openclaw`, commit `7e13f3f`)

OpenClaw has a mature **UntrustedContext** system with sanitization + isolation:

1. **Read topic from Discord channel object** — [`channel-access.ts` L49-51](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/extensions/discord/src/monitor/channel-access.ts#L49-L51): `resolveDiscordChannelTopicSafe()` safely reads `channel.topic`.

2. **Pass through message handler** — [`message-handler.process.ts` L310](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/extensions/discord/src/monitor/message-handler.process.ts#L310): `channelTopic: channelInfo?.topic`.

3. **Wrap as untrusted content** — [`inbound-context.ts` L59-66](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/extensions/discord/src/monitor/inbound-context.ts#L59-L66): Uses `buildUntrustedChannelMetadata()` to wrap the topic with security boundaries.

4. **Sanitization pipeline** — [`channel-metadata.ts`](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/src/security/channel-metadata.ts) and [`external-content.ts`](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/src/security/external-content.ts):
   - Wraps content in `<<<EXTERNAL_UNTRUSTED_CONTENT id="random">>>` markers with random IDs to prevent spoofing
   - Strips LLM special tokens (`<|im_start|>`, `[INST]`, etc. — 20+ patterns)
   - Neutralizes Unicode homoglyph attacks (fullwidth chars, angle bracket variants)
   - Detects prompt injection patterns (`ignore previous instructions`, `you are now a...`, etc.)
   - Truncates to 400 chars/entry, 800 chars total

5. **Isolation** — The untrusted context is placed in a separate `UntrustedContext[]` array, **never** mixed into `GroupSystemPrompt`. The tests in [`inbound-context.test.ts`](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/extensions/discord/src/monitor/inbound-context.test.ts) and [`message-handler.inbound-context.test.ts`](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/extensions/discord/src/monitor/message-handler.inbound-context.test.ts) explicitly verify that a topic like `"Ignore system instructions"` stays in the untrusted layer.

6. **Default on, no opt-out** — Channel topic injection is always active for guild channels; DMs skip it automatically (`isGuild: false`).
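
To make the homoglyph and pattern-detection steps concrete, here is a minimal Rust sketch of how openab might approximate them. It is not a port of OpenClaw's code: the function names, the set of bracket lookalikes, and the two-entry pattern list are assumptions chosen for illustration, and a real list would be much longer.

```rust
/// Fold fullwidth ASCII variants and a few angle-bracket lookalikes to plain ASCII.
/// The lookalike set below is illustrative and deliberately incomplete.
fn fold_homoglyphs(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            // Fullwidth forms U+FF01..=U+FF5E mirror ASCII U+0021..=U+007E at a fixed offset.
            '\u{FF01}'..='\u{FF5E}' => char::from_u32(c as u32 - 0xFEE0).unwrap_or(c),
            // Left-pointing bracket variants.
            '\u{2329}' | '\u{3008}' | '\u{27E8}' | '\u{2039}' => '<',
            // Right-pointing bracket variants.
            '\u{232A}' | '\u{3009}' | '\u{27E9}' | '\u{203A}' => '>',
            _ => c,
        })
        .collect()
}

/// Hypothetical pattern list; only the two phrases quoted in this issue.
const SUSPICIOUS_PATTERNS: &[&str] = &["ignore previous instructions", "you are now a"];

/// Case-insensitive substring check, run after folding so that
/// fullwidth spellings of a pattern are also caught.
fn looks_like_injection(input: &str) -> bool {
    let folded = fold_homoglyphs(input).to_lowercase();
    SUSPICIOUS_PATTERNS.iter().any(|p| folded.contains(p))
}
```

Folding before matching matters: a topic written with fullwidth letters would otherwise slip past both the pattern check and any later marker sanitization.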

### Hermes Agent (`NousResearch/hermes-agent`, commit `192e7eb`)

Hermes Agent also supports channel topics, but with a simpler (less secure) approach:

1. **`SessionSource.chat_topic`** — [`session.py` L86](https://github.com/NousResearch/hermes-agent/blob/192e7eb21f5e2c4b8ef7b332e4423ea69a979754/gateway/session.py#L86): `chat_topic: Optional[str] = None`.

2. **Discord reads topic** — [`discord.py` L3031-3038](https://github.com/NousResearch/hermes-agent/blob/192e7eb21f5e2c4b8ef7b332e4423ea69a979754/gateway/platforms/discord.py#L3031-L3038): `_get_effective_topic()` reads `channel.topic`, falls back to parent forum topic for threads.

3. **Injected as plain text in system prompt** — [`session.py` L282-283](https://github.com/NousResearch/hermes-agent/blob/192e7eb21f5e2c4b8ef7b332e4423ea69a979754/gateway/session.py#L282-L283): `**Channel Topic:** {topic}` — directly in the system prompt with **no untrusted wrapping or sanitization**.
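
The thread fallback described in step 2 is simple enough to sketch in Rust. `ChannelView` is a hypothetical stand-in for whatever channel type the Discord client exposes, not a real library type, and trimming empty topics is an assumption rather than something taken from the Hermes source.

```rust
/// Hypothetical, minimal view of a channel: its own topic and an optional parent.
struct ChannelView<'a> {
    topic: Option<&'a str>,
    parent: Option<&'a ChannelView<'a>>,
}

/// Treat missing, empty, and whitespace-only topics the same way.
fn non_empty(topic: Option<&str>) -> Option<&str> {
    topic.map(str::trim).filter(|t| !t.is_empty())
}

/// Prefer the channel's own topic; otherwise fall back one level to the parent
/// (for example, the forum channel that owns a thread).
fn effective_topic<'a>(channel: &ChannelView<'a>) -> Option<&'a str> {
    non_empty(channel.topic).or_else(|| non_empty(channel.parent.and_then(|p| p.topic)))
}
```

Whatever this returns is still user-editable text, so in the openab design it would be passed to the untrusted wrapper rather than formatted into the system prompt.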

### Key Differences

| Aspect | OpenClaw | Hermes Agent | openab (current) |
|---|---|---|---|
| Reads channel topic | ✅ | ✅ | ❌ |
| Untrusted context layer | ✅ (sanitize + isolate) | ❌ (plain text in system prompt) | ❌ (no layer exists) |
| Prompt injection protection | ✅ (boundary markers, token stripping, homoglyph defense) | ❌ | N/A |
| Forum thread fallback | ✅ (via `resolveDiscordChannelInfoSafe`) | ✅ (via `_get_effective_topic`) | N/A |

## Proposed Design for openab

### Architecture

```
                    openab — Proposed UntrustedContext Layer
 ═══════════════════════════════════════════════════════════════════

  Discord Message
       │
       ▼
 ┌───────────────────────────────────────────────────────────────┐
 │  discord.rs                                                    │
 │                                                                │
 │  msg.channel_id.to_channel(&ctx.http).await                    │
 │       │                                                        │
 │       ▼                                                        │
 │  GuildChannel {                                                │
 │    topic: Some("Production alerts — respond in English"),      │
 │    name: "ops-alerts",                                         │
 │    thread_metadata, parent_id, owner_id, ...                   │
 │  }                                                             │
 │       │                                                        │
 │       │  gc.topic already available at line ~387               │
 │       │  (currently only thread_metadata/parent_id used)       │
 │       ▼                                                        │
 │  ┌─────────────────────────────────────────────────────┐      │
 │  │  UntrustedContext::wrap(source, label, content)     │      │
 │  │                                                     │      │
 │  │  1. Truncate (400 char/entry, 800 total)           │      │
 │  │  2. Strip LLM special tokens                       │      │
 │  │  3. Neutralize boundary marker spoofing            │      │
 │  │  4. Wrap in random-ID boundary markers             │      │
 │  └──────────────────────┬──────────────────────────────┘      │
 │                         │                                      │
 └─────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
 ┌───────────────────────────────────────────────────────────────┐
 │  ACP Prompt Assembly (adapter.rs)                              │
 │                                                                │
 │  content_blocks: [                                             │
 │    Text {                                                      │
 │      "<sender_context>{...}</sender_context>"     ← trusted   │
 │                                                                │
 │      "<<<EXTERNAL_UNTRUSTED_CONTENT id=\"a1b2\">>>"           │
 │      "Source: Channel metadata"                                │
 │      "---"                                                     │
 │      "UNTRUSTED channel metadata (discord)"                   │
 │      "Discord channel topic:"                                  │
 │      "Production alerts — respond in English"                  │
 │      "<<<END_EXTERNAL_UNTRUSTED_CONTENT id=\"a1b2\">>>"       │
 │                                                    ← isolated │
 │      "user's actual message"                                   │
 │    },                                                          │
 │    Image { ... },   ← if image attached                       │
 │  ]                                                             │
 └──────────────────────┬────────────────────────────────────────┘
                        │
                        ▼
 ┌───────────────────────────────────────────────────────────────┐
 │  Downstream ACP Agent (Kiro CLI / Claude Code / etc)          │
 │                                                                │
 │  Agent sees channel topic as reference context,                │
 │  but knows it's untrusted external content.                    │
 └───────────────────────────────────────────────────────────────┘
```
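
The four steps inside the `UntrustedContext::wrap` box could look roughly like the following sketch. It is written as free functions for brevity, and several things in it are assumptions: the token list is a small sample, marker neutralization is reduced to rewriting `<<<` and `>>>`, and the ID comes from the standard library's randomly seeded `RandomState` only to keep the example dependency-free. A real implementation would likely use a proper CSPRNG.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

const MAX_ENTRY_CHARS: usize = 400;
// Illustrative subset of chat-template tokens; not an exhaustive list.
const SPECIAL_TOKENS: &[&str] = &["<|im_start|>", "<|im_end|>", "[INST]", "[/INST]", "<s>", "</s>"];

/// Unpredictable 16-digit hex ID. RandomState is seeded randomly per instance;
/// this is a std-only stand-in, not a cryptographic generator.
fn random_hex_id() -> String {
    format!("{:016x}", RandomState::new().build_hasher().finish())
}

/// Steps 1-3: truncate, strip special tokens, neutralize marker spoofing.
fn sanitize(content: &str) -> String {
    // 1. Truncate by Unicode scalar values so we never split a character.
    let mut out: String = content.chars().take(MAX_ENTRY_CHARS).collect();
    // 2. Strip tokens until a fixpoint, so removing one token cannot
    //    splice the surrounding text into another.
    loop {
        let before = out.len();
        for token in SPECIAL_TOKENS {
            out = out.replace(token, "");
        }
        if out.len() == before {
            break;
        }
    }
    // 3. Rewrite anything that could be read as a boundary marker.
    out.replace("<<<", "[[[").replace(">>>", "]]]")
}

/// Step 4: wrap the sanitized content in boundary markers sharing one random ID.
/// `source` and `label` are caller-supplied constants and are not sanitized here.
fn wrap(source: &str, label: &str, content: &str) -> String {
    let id = random_hex_id();
    let body = sanitize(content);
    format!(
        "<<<EXTERNAL_UNTRUSTED_CONTENT id=\"{id}\">>>\nSource: {source}\n---\n{label}\n{body}\n<<<END_EXTERNAL_UNTRUSTED_CONTENT id=\"{id}\">>>",
        id = id,
        source = source,
        label = label,
        body = body
    )
}
```

For example, `wrap("Channel metadata", "Discord channel topic:", topic)` would produce a block shaped like the one in the diagram, with a different ID on every call.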

### Protection Layers

The UntrustedContext wrapper provides **sanitize + isolate** defense-in-depth:

| Layer | What it does | Why |
|---|---|---|
| **Truncation** | 400 chars/entry, 800 chars total | Bound how much attacker-controlled text reaches the prompt |
| **LLM token stripping** | Remove `<\|im_start\|>`, `[INST]`, `<s>`, etc. | Prevent chat-template escape |
| **Homoglyph folding** | Normalize fullwidth chars, Unicode angle brackets | Prevent visual spoofing of markers |
| **Marker sanitization** | Replace spoofed `<<<EXTERNAL_UNTRUSTED_CONTENT>>>` | Prevent boundary escape |
| **Random boundary IDs** | `<<<EXTERNAL_UNTRUSTED_CONTENT id="random_hex">>>` | Attacker can't predict end marker |
| **Isolation** | Separate from system prompt / sender_context | Semantic separation for LLM |
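
The truncation row combines a per-entry cap with a shared total budget, which is easy to get subtly wrong. A minimal sketch of one way to apply both limits follows. It assumes "chars" means Unicode scalar values and that entries are served in order until the budget runs out; neither detail is specified above, and `apply_budget` is a hypothetical name.

```rust
const MAX_ENTRY_CHARS: usize = 400;
const MAX_TOTAL_CHARS: usize = 800;

/// Truncate each entry to the per-entry cap, stopping once the total budget is spent.
/// Counts Unicode scalar values, so multi-byte text is never cut mid-character.
fn apply_budget(entries: &[&str]) -> Vec<String> {
    let mut remaining = MAX_TOTAL_CHARS;
    let mut kept = Vec::new();
    for entry in entries {
        if remaining == 0 {
            break; // budget exhausted: drop all later entries
        }
        let limit = MAX_ENTRY_CHARS.min(remaining);
        let truncated: String = entry.chars().take(limit).collect();
        remaining -= truncated.chars().count();
        if !truncated.is_empty() {
            kept.push(truncated);
        }
    }
    kept
}
```

With these limits, two full-length entries consume the whole budget, so a third entry (say, a channel name added later) would be dropped unless the earlier ones are shorter.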

### Files Changed

| File | Change | Description |
|---|---|---|
| `src/untrusted.rs` | **NEW** | UntrustedContext wrapper — sanitize, truncate, wrap with boundary markers (~100 lines) |
| `src/adapter.rs` | **MOD** | Add `untrusted_context: Option<Vec<String>>` to prompt assembly |
| `src/discord.rs` | **MOD** | Read `gc.topic` and `gc.name` from the existing `to_channel()` call at L387, wrap as untrusted |
| `src/gateway.rs` | **MOD** | Add `topic: Option<String>` and `name: Option<String>` to `ChannelInfo` for Telegram/LINE |
| `gateway/src/main.rs` | **MOD** | Pass topic/name in gateway `ChannelInfo` |
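
For the `src/adapter.rs` change, the ordering shown in the architecture diagram (trusted `sender_context`, then untrusted blocks, then the user's message) could be assembled as in the sketch below. The function name, its parameters, and the blank-line separator are assumptions for illustration; the existing prompt assembly code in openab may be structured differently.

```rust
/// Build the text block sent to the downstream agent.
/// `untrusted_context` holds blocks that are already sanitized and wrapped.
fn assemble_prompt_text(
    sender_context_json: &str,
    untrusted_context: Option<&[String]>,
    user_message: &str,
) -> String {
    // Trusted, structured identity fields come first.
    let mut parts = vec![format!(
        "<sender_context>{}</sender_context>",
        sender_context_json
    )];
    // Untrusted blocks sit between the trusted context and the user's message,
    // never inside <sender_context>.
    if let Some(blocks) = untrusted_context {
        parts.extend(blocks.iter().filter(|b| !b.trim().is_empty()).cloned());
    }
    parts.push(user_message.to_string());
    parts.join("\n\n")
}
```

A caller holding an `Option<Vec<String>>` can pass it with `.as_deref()`. When the option is `None`, as it would be for DMs, the output is the same as a prompt with no untrusted layer at all.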

### Future Extensions

Once the UntrustedContext layer exists, it can carry more than just channel topics:

- **Channel name** — useful for agents to know where they are
- **Slack channel purpose/description**
- **Telegram group description**
- **Webhook payloads** (email bodies, CI notifications)
- **Forum thread tags**

## References

- OpenClaw channel topic: [`channel-access.ts`](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/extensions/discord/src/monitor/channel-access.ts), [`inbound-context.ts`](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/extensions/discord/src/monitor/inbound-context.ts)
- OpenClaw security runtime: [`channel-metadata.ts`](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/src/security/channel-metadata.ts), [`external-content.ts`](https://github.com/openclaw/openclaw/blob/7e13f3f5140168e31027b39a9855ade3fadf24d5/src/security/external-content.ts)
- Hermes Agent channel topic: [`session.py` L86](https://github.com/NousResearch/hermes-agent/blob/192e7eb21f5e2c4b8ef7b332e4423ea69a979754/gateway/session.py#L86), [`discord.py` L3031-3038](https://github.com/NousResearch/hermes-agent/blob/192e7eb21f5e2c4b8ef7b332e4423ea69a979754/gateway/platforms/discord.py#L3031-L3038), [`session.py` L282-283](https://github.com/NousResearch/hermes-agent/blob/192e7eb21f5e2c4b8ef7b332e4423ea69a979754/gateway/session.py#L282-L283)
- openab existing `to_channel()` call: [`discord.rs` L387](https://github.com/openabdev/openab/blob/37ec36337db8a9c8a9adb401c35d3300e36d37e1/src/discord.rs#L387)
- openab `SenderContext` struct: [`adapter.rs` L36-52](https://github.com/openabdev/openab/blob/37ec36337db8a9c8a9adb401c35d3300e36d37e1/src/adapter.rs#L36-L52)
- Related: #224 (STT feature — same investigation pattern)


Discord: https://discord.com/channels/1491295327620169908/1497578757333057577/1497930272635748473
