Ingest agent runtime sessions into WordPress (read-side of the skills pattern)

## Context

wp-coding-agents already handles the **write** side of the runtime-agnostic adapter pattern: skills defined in WordPress are synced out to runtime-native locations (`.opencode/skills/<slug>/SKILL.md`, `.claude/skills/<slug>/SKILL.md`, etc.). Each runtime has an adapter that knows where its files belong and how they're formatted.

This issue proposes the **read** side of that same pattern: ingesting agent session transcripts from runtime-native storage into WordPress, so plugins can build on session data without each one reinventing runtime-specific parsers.

## The gap

Every major agent runtime stores session transcripts locally in a runtime-specific format:

| Runtime | Typical location | Format |
|---|---|---|
| Claude Code | `~/.config/claude/projects/<hash>/*.jsonl` | JSONL (one message per line) |
| OpenCode | `~/.local/share/opencode/project/<hash>/session/*` | Session directory per conversation |
| Cursor | Local app storage (SQLite / JSON) | Proprietary |
| Cline / Continue / Aider | Various | Various |

A WP-CLI command, a pipeline that processes session history, an analytics plugin, a memory system — none of these can consume session data today without re-implementing per-runtime file discovery and parsing. That's a job wp-coding-agents is uniquely positioned to own, because it already owns the inverse (runtime-adapter → disk) for skills and agent configs.

## Storage model: index in WordPress, content on disk

**This is not the skills model.** Skills are small (~1-5 KB), low-volume (10-50 total), and benefit from being first-class editable WordPress content. Duplicating them to disk is cheap and the use case justifies it.

Sessions are the opposite:

| Property | Skills | Sessions |
|---|---|---|
| Size per unit | 1-5 KB | 10 KB – 10+ MB |
| Volume | 10-50 total | hundreds/month, grows forever |
| Churn | Low | High (new daily, existing append mid-run) |
| Write owner | WordPress | Runtime |
| Value of full copy in WP | High (editable content) | Low (mostly tool-call noise) |

Copying raw sessions into `wp_posts` or a custom table would bloat the database by GBs within months, duplicate data the runtime already stores canonically, and store content that's mostly not directly useful (intermediate reasoning, tool call IO, file reads).

**The correct pattern — borrowed from [markdown-database-integration](https://github.com/Automattic/markdown-database-integration):**
> SQLite is the projection. Files are the source of truth. The database is an indexing engine that maps to content on disk.

For sessions:

1. **Runtime session files remain the source of truth.** wp-coding-agents does not copy, move, or rewrite them.
2. **WordPress stores a lightweight index** — one row per session with metadata only: `session_id`, `runtime`, `project_path`, `started_at`, `ended_at`, `file_path`, `message_count`, `model`, `token_total`, `status`. ~200 bytes per session. Millions of sessions would still be a small table.
3. **On-demand parse.** When a consumer needs the actual messages, the adapter reads the file from disk and returns the normalized shape. No bulk content lives in the DB.
4. **Consumers persist what they derive, not everything they read.** Summaries, extracted decisions, linked PRs, salient events → those go into WP (via posts, custom tables, wikis, whatever the consumer wants). Raw transcripts stay on disk.

This keeps the WordPress DB small, keeps the runtime's session files authoritative, and gives consumers the full messages when they need them via a single API.

## Proposal

Add a runtime-session-ingest subsystem to wp-coding-agents that mirrors the existing skills-sync architecture, adapted for the index-vs-content distinction above.

### 1. Runtime adapters
Each supported runtime gets an adapter implementing:
- **Discovery** — enumerate session files/directories on the host for a given project path
- **Parse** — convert runtime-native format to a normalized message shape (on demand)
- **Metadata** — extract session ID, start/end timestamps, project path, model, token counts, runtime-specific context (without reading full content)

Adapters live alongside the existing skills-write adapters so new runtimes add both read and write support in one place.

### 2. Normalized session schema (on-demand, not stored in full)

```php
[
    'session_id'   => string,  // runtime-native, stable across reads
    'runtime'      => string,  // 'claude-code' | 'opencode' | ...
    'project_path' => string,
    'started_at'   => int,
    'ended_at'     => int|null,
    'file_path'    => string,  // absolute path to the source file on disk
    'messages'     => [
        [
            'id'        => string,
            'role'      => 'user' | 'assistant' | 'tool',
            'content'   => string|array,
            'tool_calls'=> array|null,
            'timestamp' => int,
        ],
        // ...
    ],
    'meta' => [
        'model'   => string|null,
        'tokens'  => array|null,
        'cost'    => float|null,
        // runtime-specific extras preserved here
    ],
]
```

### 3. Index schema (what does live in WP)

A single custom table, `wp_coding_agents_session_index` (or similar), holding only metadata — never message content:

```
session_id      VARCHAR  PRIMARY KEY
runtime         VARCHAR  INDEX
project_path    VARCHAR  INDEX
started_at      BIGINT   INDEX
ended_at        BIGINT
file_path       TEXT
file_mtime      BIGINT      -- for cheap change detection
message_count   INT
model           VARCHAR
token_total     INT
status          VARCHAR     -- active, completed, truncated, etc.
```

Change detection via `file_mtime` means re-indexing is cheap and idempotent — only sessions that actually changed get re-parsed.

### 4. Ingestion API

**WP-CLI:**
- `wp coding-agents sessions list [--runtime=<name>] [--project=<path>] [--since=<timestamp>]`
- `wp coding-agents sessions read <session-id>` — parses and returns full normalized session on demand
- `wp coding-agents sessions reindex [--runtime=<name>]` — rescans filesystem, updates index only

**Action hooks:**
- `do_action( 'wp_coding_agents_session_indexed', $index_row )` — fires when a new session is indexed or an existing index row is updated. Lightweight — just metadata.
- `do_action( 'wp_coding_agents_session_parsed', $session )` — fires when a consumer explicitly requests a full parse. Carries the complete normalized session.

Consumers that just want to react to "a new session happened" subscribe to `session_indexed`. Consumers that want to *process content* (summarizers, analyzers) subscribe to `session_parsed` or trigger a parse themselves via the API.

**Optional file-watch service** (later): a long-running WP-CLI command or background worker that tails session files and fires `session_indexed` on changes.

### 5. What this does NOT do

- Does not copy session content into WordPress.
- Does not summarize, score, or process message content — that's application logic on top.
- Does not write sessions back to runtimes — runtimes own their session storage.
- Does not opine on what consumers persist. Consumers are free to store derived knowledge wherever they want; raw sessions just stay on disk where the runtime already put them.

## Why wp-coding-agents

This project already:
- Maintains the list of supported runtimes
- Owns per-runtime path conventions (via the skills-write adapters)
- Knows how to detect which runtimes are installed on a given host
- Is the WordPress plugin any site running a local agent is already likely to install

Pushing session ingestion into a separate project would duplicate the adapter registry. Putting it in a specific consumer (memory system, analytics plugin) would force every consumer to re-implement it.

## Downstream consumers this unlocks

With the index + parse-on-demand API and the two action hooks as the seam, independent plugins can:
- Build persistent memory systems that summarize actual agent activity
- Power agent analytics (tokens, cost, tool usage trends)
- Enable search across historical sessions
- Feed session transcripts into knowledge bases or wikis
- Generate daily/weekly digests of work done via AI agents
- Detect patterns (frequently-failing tool calls, repeated questions, etc.)

None of these need to exist in this plugin. They just need the data to be accessible.

## Suggested first pass

1. Define the normalized session schema and the index table schema.
2. Implement one adapter end-to-end (Claude Code is well-documented and JSONL is trivially parseable).
3. Ship the index table, WP-CLI commands, and both action hooks.
4. Add OpenCode adapter as the second proof of the abstraction.
5. Document the adapter contract so community contributors can add Cursor / Cline / etc.

## Out of scope for this issue

- File-watch / real-time ingestion (follow-up)
- Any opinion on what consumers do with ingested sessions
- Any runtime-specific features beyond adapter read support

## Related

- #25 (closed) was about agent *sync* (write side for Claude Code); this is about session *read*.
- The existing skills-sync pipeline is the architectural template for the adapter layer.
- The MDI projection pattern (SQLite indexes → markdown files on disk) is the template for the storage model.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ingest agent runtime sessions into WordPress (read-side of the skills pattern) #41

Context

The gap

Storage model: index in WordPress, content on disk

Proposal

1. Runtime adapters

2. Normalized session schema (on-demand, not stored in full)

3. Index schema (what does live in WP)

4. Ingestion API

5. What this does NOT do

Why wp-coding-agents

Downstream consumers this unlocks

Suggested first pass

Out of scope for this issue

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Runtime	Typical location	Format
Claude Code	`~/.config/claude/projects/<hash>/*.jsonl`	JSONL (one message per line)
OpenCode	`~/.local/share/opencode/project/<hash>/session/*`	Session directory per conversation
Cursor	Local app storage (SQLite / JSON)	Proprietary
Cline / Continue / Aider	Various	Various

Property	Skills	Sessions
Size per unit	1-5 KB	10 KB – 10+ MB
Volume	10-50 total	hundreds/month, grows forever
Churn	Low	High (new daily, existing append mid-run)
Write owner	WordPress	Runtime
Value of full copy in WP	High (editable content)	Low (mostly tool-call noise)

Ingest agent runtime sessions into WordPress (read-side of the skills pattern) #41

Description

Context

The gap

Storage model: index in WordPress, content on disk

Proposal

1. Runtime adapters

2. Normalized session schema (on-demand, not stored in full)

3. Index schema (what does live in WP)

4. Ingestion API

5. What this does NOT do

Why wp-coding-agents

Downstream consumers this unlocks

Suggested first pass

Out of scope for this issue

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions