v0.8.65: Normalize provider usage telemetry for tokens, cache, reasoning, and quota

## Goal

Normalize provider usage telemetry across supported routes so CodeWhale records one structured usage schema regardless of provider, endpoint protocol, OAuth/subscription route, local runtime, or hosted aggregator.

This is usage observability, not pricing. Pricing and user-facing cost/usage display are handled in #3085, which should consume this telemetry where available.

## Current Evidence

- In some paths, the Harbor stream and benchmark harness have historically reported only aggregate input/output token counts.
- Codex/ChatGPT OAuth routes may expose subscription/quota-style usage rather than per-token pricing.
- DeepSeek-style APIs can report prompt cache hit/miss tokens.
- Hosted aggregators can expose provider-specific cached/reasoning fields.
- Local runtimes may report limited or no usage fields.
- Silently reporting zero is misleading when a provider does not report a metric.

## Architecture Contract

Usage telemetry belongs to the selected route's response/stream decoding and normalized run record.

Normalize into explicit optional fields such as:

- input tokens;
- cached input/read tokens;
- cache write/miss tokens;
- output tokens;
- reasoning/thinking tokens;
- total tokens;
- provider raw usage payload;
- account/subscription quota or percent when applicable;
- telemetry provenance and freshness.

Use `null`/unknown for metrics a provider does not report. Do not treat unknown as zero.
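A minimal sketch of what such a struct could look like, written in Python for illustration only. Every field and class name here is an assumption, not the CodeWhale schema; the point is that each metric is optional and that `None` (unknown) stays distinct from `0` (reported as zero).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class UsageTelemetry:
    """Hypothetical normalized usage record for one turn.

    None means "the provider did not report this metric", never zero.
    """
    input_tokens: Optional[int] = None
    cached_input_tokens: Optional[int] = None   # cached input/read tokens
    cache_write_tokens: Optional[int] = None    # cache write/miss tokens
    output_tokens: Optional[int] = None
    reasoning_tokens: Optional[int] = None      # reasoning/thinking tokens
    total_tokens: Optional[int] = None
    quota_percent_used: Optional[float] = None  # account/subscription quota
    raw: Optional[Dict[str, Any]] = None        # provider raw usage payload
    provenance: str = "unknown"                 # where the numbers came from
    observed_at: Optional[str] = None           # freshness timestamp, if any

    def reported_fields(self) -> List[str]:
        """Names of the numeric metrics the provider actually reported."""
        names = [
            "input_tokens", "cached_input_tokens", "cache_write_tokens",
            "output_tokens", "reasoning_tokens", "total_tokens",
            "quota_percent_used",
        ]
        return [n for n in names if getattr(self, n) is not None]


# A local runtime that reports nothing vs. a provider that reports zero.
silent = UsageTelemetry(provenance="local-runtime")
zero_cached = UsageTelemetry(input_tokens=12, cached_input_tokens=0)
```

Here `silent.reported_fields()` is empty, while `zero_cached.reported_fields()` includes `cached_input_tokens` because a reported zero is still a reported value.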

Keep these concepts separate:

- `UsageTelemetry`: what happened on a turn.
- `PricingSku`: provider/offering price metadata.
- `UsageMeter`: quota, subscription, credits, or local/not-applicable display state.
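One way to keep the three concepts apart is to give each its own type and let a turn record hold them as independent optional parts. The sketch below is illustrative Python; `TurnRecord`, `MeterKind`, and all field names are assumptions introduced for this example, and price fields are deliberately left out because pricing belongs to #3085.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MeterKind(Enum):
    """Hypothetical display states for a usage meter."""
    QUOTA = "quota"
    SUBSCRIPTION = "subscription"
    CREDITS = "credits"
    NOT_APPLICABLE = "not_applicable"  # e.g. a local runtime


@dataclass
class UsageTelemetry:
    """What happened on a turn (trimmed to two fields for brevity)."""
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None


@dataclass
class PricingSku:
    """Provider/offering price metadata; price fields omitted on purpose."""
    provider: str
    offering: str


@dataclass
class UsageMeter:
    """Quota, subscription, credits, or not-applicable display state."""
    kind: MeterKind
    percent_used: Optional[float] = None


@dataclass
class TurnRecord:
    """Hypothetical container: the three parts vary independently."""
    usage: UsageTelemetry
    meter: Optional[UsageMeter] = None
    sku: Optional[PricingSku] = None


# A subscription-style route: quota metadata, no token counts, no price.
oauth_turn = TurnRecord(
    usage=UsageTelemetry(),
    meter=UsageMeter(MeterKind.SUBSCRIPTION, percent_used=42.0),
)
```

Because `sku` is optional and separate, a quota-style route never needs a placeholder price in order to record its usage.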

## Scope

1. Define one normalized usage telemetry struct for turn/run records.
2. Map each supported provider/route response payload onto it.
3. Preserve provider-raw usage payloads for debugging and future mappings.
4. Emit it uniformly in Harbor stream, Fleet/subagent ledgers, benchmark harnesses, and persisted run records where applicable.
5. Document per-provider field mappings and unavailable fields.
6. Add recorded usage payload fixtures per provider route family.
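Items 2 and 3 could be sketched as small per-route mapping functions that read a provider's usage payload, emit the normalized fields, and keep the raw payload alongside. The Python below is illustrative only: the payload keys follow the OpenAI-style and DeepSeek-style usage objects as commonly documented, but treat them as assumptions to verify against recorded fixtures, and the function and output field names are hypothetical.

```python
from typing import Any, Dict, Optional


def _opt_int(payload: Any, *path: str) -> Optional[int]:
    """Walk nested dict keys; return the int found, else None (never 0)."""
    node = payload
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    if isinstance(node, int) and not isinstance(node, bool):
        return node
    return None


def normalize_openai_style(usage: Any) -> Dict[str, Any]:
    """Map an OpenAI-style usage object (assumed key names)."""
    return {
        "input_tokens": _opt_int(usage, "prompt_tokens"),
        "cached_input_tokens": _opt_int(
            usage, "prompt_tokens_details", "cached_tokens"),
        "cache_write_tokens": None,  # not reported in this payload shape
        "output_tokens": _opt_int(usage, "completion_tokens"),
        "reasoning_tokens": _opt_int(
            usage, "completion_tokens_details", "reasoning_tokens"),
        "total_tokens": _opt_int(usage, "total_tokens"),
        "raw": dict(usage) if isinstance(usage, dict) else None,
    }


def normalize_deepseek_style(usage: Any) -> Dict[str, Any]:
    """Map a DeepSeek-style usage object with cache hit/miss counts."""
    return {
        "input_tokens": _opt_int(usage, "prompt_tokens"),
        "cached_input_tokens": _opt_int(usage, "prompt_cache_hit_tokens"),
        "cache_write_tokens": _opt_int(usage, "prompt_cache_miss_tokens"),
        "output_tokens": _opt_int(usage, "completion_tokens"),
        "reasoning_tokens": _opt_int(
            usage, "completion_tokens_details", "reasoning_tokens"),
        "total_tokens": _opt_int(usage, "total_tokens"),
        "raw": dict(usage) if isinstance(usage, dict) else None,
    }


deepseek_fixture = {
    "prompt_tokens": 100,
    "prompt_cache_hit_tokens": 64,
    "prompt_cache_miss_tokens": 36,
    "completion_tokens": 20,
    "total_tokens": 120,
}
normalized = normalize_deepseek_style(deepseek_fixture)
```

A missing key maps to `None` rather than `0`, so a route that omits reasoning tokens stays distinguishable from one that reports zero.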

## Non-Goals

- Changing the TUI's full cost display; that is #3085.
- Improving cache hit rate itself.
- Adding new providers.

## Acceptance Criteria

- Supported providers map response usage payloads to a normalized schema with explicit unknown/null fields.
- Codex/ChatGPT OAuth and other quota-style routes can report usage/quota metadata without fake token pricing.
- The benchmark harness and Harbor stream can consume cached/reasoning usage without provider-specific parsing.
- Docs list each provider's usage-field mapping and gaps.
- Tests cover token, cached token, reasoning token, quota-style, local/no-telemetry, and unknown/stale cases.
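A consumer such as a benchmark harness also has to aggregate across turns without collapsing unknown into zero. One possible policy, sketched here in Python as an assumption rather than the chosen design, is to sum only the reported values and carry a count of turns where the metric was unknown; `sum_known` is a hypothetical helper name.

```python
from typing import Iterable, Optional, Tuple


def sum_known(values: Iterable[Optional[int]]) -> Tuple[Optional[int], int]:
    """Sum the reported values and count the unknown ones.

    Returns (total, unknown_count). total is None when no turn reported
    the metric, so "nothing reported" is never shown as 0.
    """
    total: Optional[int] = None
    unknown = 0
    for value in values:
        if value is None:
            unknown += 1
        elif total is None:
            total = value
        else:
            total += value
    return total, unknown


# Cached-token counts from three turns; the second route reported nothing.
cached_total, cached_unknown = sum_known([10, None, 5])
```

With this policy a report can say "15 cached tokens, 1 turn unreported" instead of presenting a partial sum as complete.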

## Related

- #3085 provider/offering usage and pricing display.
- #3019 Codex/Responses route reliability and usage metadata.
- #2984 Codex/ChatGPT OAuth verification.
- #2963 DeepSeek Anthropic-compatible endpoint spike.
- #1177 low input cache hit diagnostics.
