feat(tui): #3083 /provider readiness dashboard — capability/metadata badges#3555

Merged
Hmbown merged 4 commits into main from codex/issue-3083-provider-dashboard on Jun 24, 2026
Conversation

@Hmbown Hmbown commented Jun 24, 2026

Owner

Closes #3083

Supersedes the draft #3504 (its reasoning-readiness commit is cherry-picked here with Hunter's authorship preserved) and completes the remaining /provider dashboard acceptance criteria.

What's included

  1. Reasoning readiness (from [codex] feat(tui): show provider reasoning readiness #3504, rebased onto main): ProviderReasoningSummary (support / controls / stream visibility / selected control) on each row, with the 4-region provider_picker.rs conflict resolved (struct now carries both reasoning and maturity; compact_hint carries both).
  2. GLM seed → bundled catalog (refactor): the reasoning projection no longer hand-seeds a single GLM-5.2 row. It sources from codewhale_config::catalog::bundled_catalog_offerings() (the single bundled Models.dev snapshot, same source v0.8.65: Provider-owned live catalogs and secret-free model cache #3385 uses), so every bundled provider with reasoning facts is covered and the row can't drift from the catalog.
  3. Capability + metadata badges (feat): wires the existing resolved_capability_profile() projection (no new capability logic, no wire-id substring inference) into each row:
    • metadata: context window + max output, humanized (ctx:1M out:128K), ? when unknown;
    • capability: tools / json / stream / cache — tri-state (y / n / ?) so unknown stays distinct from unsupported and is never silently omitted;
    • model-origin: origin:default | saved | custom;
    • self-hosted hint next to the base URL for local runtimes (Ollama/vLLM/SGLang).

Badges are computed in the row projection (testable without rendering — acceptance criterion) and ordered after route/base so provider identity survives narrow-width truncation.
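As a rough illustration of the tri-state capability badges described above, the sketch below renders a badge cluster from a simplified profile. `SupportState` is named in the review; `CapabilitySketch`, `capability_badges`, and the exact `name:value` layout are assumptions made for this example, not the actual row projection.

```rust
/// Tri-state support flag: unknown stays distinct from unsupported.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SupportState {
    Supported,
    Unsupported,
    Unknown,
}

impl SupportState {
    /// Single-character badge value: y / n / ?.
    pub fn badge(self) -> char {
        match self {
            SupportState::Supported => 'y',
            SupportState::Unsupported => 'n',
            SupportState::Unknown => '?',
        }
    }
}

/// Hypothetical, simplified stand-in for a resolved capability profile.
#[derive(Clone, Copy, Debug)]
pub struct CapabilitySketch {
    pub tools: SupportState,
    pub json: SupportState,
    pub stream: SupportState,
    pub cache: SupportState,
}

/// Render every capability, including unknown ones, so none is silently omitted.
/// The "name:value" layout is an assumption for illustration.
pub fn capability_badges(caps: CapabilitySketch) -> String {
    format!(
        "tools:{} json:{} stream:{} cache:{}",
        caps.tools.badge(),
        caps.json.badge(),
        caps.stream.badge(),
        caps.cache.badge()
    )
}
```

Because the cluster is a plain function of the profile, it can be asserted on directly in a unit test without rendering the TUI.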

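Deferred (flagged, non-blocking)

  • Live "test connection" probe: deferred to a follow-up so that opening /provider never blocks on live I/O.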

Verification

  • cargo fmt; cargo clippy --workspace --all-targets --locked -- -D warnings — clean.
  • cargo test -p codewhale-tui -- provider_dashboard provider_picker route_resolver pricing usage_meter — green.
  • Full codewhale-tui bin suite: 5278 passed; only the known environmental/flaky papercuts failed (config_command_allow_shell_*, run_verifiers_background_*, run_tests_*), reproduced on pristine main.

New tests: provider_dashboard_row_surfaces_capability_and_metadata_badges, provider_dashboard_row_classifies_model_origin, model_origin_classifier_covers_default_saved_custom, self_hosted_provider_row_marks_self_hosted_in_hint, humanize_token_count_is_compact_and_marks_unknown (plus #3504's reasoning tests, rebased).

🤖 Generated with Claude Code

Hmbown and others added 3 commits June 24, 2026 11:55
Add a ProviderReasoningSummary to /provider dashboard rows so the row projection can display reasoning support, accepted controls, stream visibility, and the selected configured control.

Seed GLM reasoning controls in catalog-shaped provider+wire data and reuse the Codex effort labels so Codex displays its xhigh scale without model substring inference.

Add focused coverage for the GLM and Codex provider dashboard projections.

Refs #3083
…3083)

The reasoning readiness projection hand-seeded a single GLM-5.2 row in a
OnceLock. main's bundled Models.dev snapshot already carries that row
(reasoning + reasoning_options) plus every other provider's reasoning
facts, so the seed was redundant and prone to drift.

Source bundled_reasoning_catalog() from
codewhale_config::catalog::bundled_catalog_offerings() instead. GLM-5.2
still resolves to high/max controls; all bundled providers with reasoning
descriptors are now covered rather than just GLM.
#3083)

Wire the existing resolved_capability_profile() projection into the
/provider dashboard rows so each row surfaces, with no wire-id substring
inference:
- metadata badges: context window + max output (humanized; `?` when
  unknown);
- capability badges: tools, structured/JSON, streaming, cache — tri-state
  so unknown stays distinct from unsupported and is never silently
  dropped;
- model-origin: default / saved override / custom pass-through;
- a self-hosted hint next to the base URL for local runtimes.

Badges are computed in the row projection (testable without rendering)
and ordered after route/base so provider identity survives narrow-width
truncation. The live "test connection" probe is deferred to a follow-up
to honor the no-I/O-on-open constraint.

claude Bot commented Jun 24, 2026


Claude finished @Hmbown's task in 2m 39s


PR Review — feat(tui): #3083 provider dashboard capability/metadata badges

Tasks

  • Gather context (PR metadata, changed files)
  • Read the full diff of provider_picker.rs
  • Review for correctness bugs, provider/model safety, reuse, tests, security
  • Post final review

Summary

Solid, well-structured work. Capability/metadata badges are correctly wired through resolved_capability_profile() without duplicating inference logic; the catalog-backed reasoning projection fixes the GLM drift issue cleanly; tests cover the new row properties end-to-end. Two correctness issues below; the rest is minor or advisory.


1. Self-hosted hint misses SGLang and vLLM (bug)

File: provider_picker.rs, ~line 415

let self_hosted = if self.auth_status == ProviderAuthStatus::Local {
    " (self-hosted)"
} else {
    ""
};

ProviderAuthStatus::Local is returned only for Ollama. SGLang and vLLM without a credential return ProviderAuthStatus::Optional, so they silently omit the (self-hosted) hint — contradicting the PR's stated goal of surfacing self-hosted posture for local runtimes. The test self_hosted_provider_row_marks_self_hosted_in_hint only exercises Ollama so it doesn't catch this.

Fix: widen the predicate:

let self_hosted = if matches!(
    self.auth_status,
    ProviderAuthStatus::Local | ProviderAuthStatus::Optional
) {
    " (self-hosted)"
} else {
    ""
};

Add a parallel test assertion for ApiProvider::Sglang (no-key path) to lock this in.
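A minimal, self-contained sketch of the widened predicate and the kind of assertion suggested above. The `ProviderAuthStatus` variants are the ones named in this review; the free functions `self_hosted_hint` and `base_with_hint` and the example URL are hypothetical stand-ins for the real `compact_hint()` logic.

```rust
/// Hypothetical mirror of the auth states named in this review.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProviderAuthStatus {
    Local,
    Optional,
    Configured,
    Missing,
}

/// Suffix shown next to the base URL for local runtimes.
/// Both Local and Optional count as self-hosted in this sketch.
pub fn self_hosted_hint(status: ProviderAuthStatus) -> &'static str {
    if matches!(
        status,
        ProviderAuthStatus::Local | ProviderAuthStatus::Optional
    ) {
        " (self-hosted)"
    } else {
        ""
    }
}

/// Hypothetical helper: append the hint to a base URL.
pub fn base_with_hint(base_url: &str, status: ProviderAuthStatus) -> String {
    format!("{}{}", base_url, self_hosted_hint(status))
}
```

A regression test would then assert the hint for the no-key SGLang status as well as for Ollama, and assert its absence for `Configured` and `Missing`.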



2. default_reasoning_stream_visibility inconsistency: SGLang in, vLLM out

File: provider_picker.rs, ~line 672

| ApiProvider::Sglang
// ...but ApiProvider::Vllm is absent → falls through to Unknown

Both SGLang and vLLM are OpenAI-compatible self-hosted runtimes. If the rationale for Sglang → StructuredThinking is "commonly serves DeepSeek-R1-class reasoning models," the same logic applies to vLLM. Either add ApiProvider::Vllm to the StructuredThinking arm or document why they diverge. This only fires when the catalog says reasoning: Some(true), so blast radius is limited to display, but the inconsistency will confuse a future maintainer adding providers.
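One way the consistent classification could look, as a sketch only. The function name and the `StructuredThinking` / `Unknown` variants come from this review; the reduced `ApiProvider` enum, the `Other` variant, the signature, and the `visibility_for` wrapper are assumptions for illustration.

```rust
/// Reduced, hypothetical provider enum; `Other` stands in for every other provider.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ApiProvider {
    Sglang,
    Vllm,
    Other,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ReasoningStreamVisibility {
    StructuredThinking,
    Unknown,
}

/// Both OpenAI-compatible self-hosted runtimes share one arm,
/// so adding a provider later means touching a single place.
pub fn default_reasoning_stream_visibility(provider: ApiProvider) -> ReasoningStreamVisibility {
    match provider {
        ApiProvider::Sglang | ApiProvider::Vllm => ReasoningStreamVisibility::StructuredThinking,
        ApiProvider::Other => ReasoningStreamVisibility::Unknown,
    }
}

/// Hypothetical wrapper: the default only applies when the catalog
/// reports `reasoning: Some(true)`; otherwise nothing is projected.
pub fn visibility_for(
    provider: ApiProvider,
    reasoning: Option<bool>,
) -> Option<ReasoningStreamVisibility> {
    match reasoning {
        Some(true) => Some(default_reasoning_stream_visibility(provider)),
        _ => None,
    }
}
```

Keeping the two runtimes in a single or-pattern makes any future divergence an explicit, reviewable change rather than an accidental omission.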


3. humanize_token_count K-band truncates silently

File: provider_picker.rs, ~line 232

Some(v) if v >= 1_000 => format!("{}K", v / 1_000),

1_999 → "1K", 999_999 → "999K". For context windows in the 1K–999K range the loss is small (windows are almost always power-of-two multiples of 1024 or round numbers), but the test only exercises 131_072 → "131K", where truncation and rounding happen to agree. Worth noting because a 196_608-token window (192 × 1024) would display as "196K" (196_608 / 1_000 truncates), not the rounded "197K". Consider v / 1_024 with a Ki suffix (which would give "192Ki") if accuracy matters more than compactness, or leave as-is with a comment acknowledging truncation.
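The truncation behaviour can be checked with a small sketch. Only the K-band arm is quoted in this review; the M band, the sub-1000 arm, the `u64` input type, and the binary `humanize_binary` variant are assumptions added so the example compiles on its own.

```rust
/// Decimal, truncating formatter in the spirit of the reviewed code.
/// Only the K-band arm is taken from the review; the other arms are assumed.
pub fn humanize_truncating(tokens: Option<u64>) -> String {
    match tokens {
        Some(v) if v >= 1_000_000 => format!("{}M", v / 1_000_000),
        Some(v) if v >= 1_000 => format!("{}K", v / 1_000), // integer division truncates
        Some(v) => v.to_string(),
        None => "?".to_string(), // unknown is marked, never omitted
    }
}

/// Hypothetical binary alternative using Ki/Mi suffixes.
pub fn humanize_binary(tokens: Option<u64>) -> String {
    match tokens {
        Some(v) if v >= 1_024 * 1_024 => format!("{}Mi", v / (1_024 * 1_024)),
        Some(v) if v >= 1_024 => format!("{}Ki", v / 1_024),
        Some(v) => v.to_string(),
        None => "?".to_string(),
    }
}
```

With these definitions, 196_608 formats as "196K" in the decimal version and "192Ki" in the binary one, which is the trade-off between compactness and exactness raised in this item.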


4. Good choices worth preserving

  • Provider/model safety (v0.8.65 EPIC: Separate provider facts, model facts, offerings, and route resolution #2608): No code infers a provider or model change from a wire-id prefix or freeform text. resolved_capability_profile(provider, wire_model) takes an explicit ApiProvider, fully respecting the invariant.
  • OnceLock<CatalogSnapshot> for bundled_reasoning_catalog(): Lazy-init, contention-free, correct for a read-only bundled snapshot.
  • eq_ignore_ascii_case in reasoning_catalog_offering: Correctly handles mixed-case model IDs (e.g. GLM-5.2 vs glm-5.2 in user config).
  • Tri-state SupportState propagated through the badge cluster: ? staying distinct from n was the right call; the comment in the code explaining why is appropriate.
  • Test model_origin_classifier_covers_default_saved_custom covers all four combinations including the Custom + has_saved_model override case cleanly.
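For readers unfamiliar with the `OnceLock` and `eq_ignore_ascii_case` pattern praised above, here is a minimal sketch under stated assumptions. `OfferingSketch`, `bundled_catalog_sketch`, `reasoning_offering`, and the `example-provider` id are hypothetical; the real code reads the snapshot from `bundled_catalog_offerings()` rather than an inline vector.

```rust
use std::sync::OnceLock;

/// Hypothetical, reduced catalog row; the real offering type carries more fields.
#[derive(Debug)]
pub struct OfferingSketch {
    pub provider: &'static str,
    pub model: &'static str,
    pub reasoning: Option<bool>,
}

/// Lazily build the read-only snapshot once; later calls reuse the same data.
fn bundled_catalog_sketch() -> &'static [OfferingSketch] {
    static CATALOG: OnceLock<Vec<OfferingSketch>> = OnceLock::new();
    CATALOG
        .get_or_init(|| {
            // Stand-in data; "example-provider" is an invented id for this sketch.
            vec![OfferingSketch {
                provider: "example-provider",
                model: "GLM-5.2",
                reasoning: Some(true),
            }]
        })
        .as_slice()
}

/// Case-insensitive lookup so `glm-5.2` in user config still matches `GLM-5.2`.
pub fn reasoning_offering(provider: &str, model: &str) -> Option<&'static OfferingSketch> {
    bundled_catalog_sketch().iter().find(|offering| {
        offering.provider.eq_ignore_ascii_case(provider)
            && offering.model.eq_ignore_ascii_case(model)
    })
}
```

The lookup takes the provider explicitly, so nothing is inferred from a model-id prefix.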


Hmbown commented Jun 24, 2026

Owner Author

Status: CI green. Completes the remaining #3083 dashboard acceptance criteria.

Maintainer calls: (a) confirm it belongs in the v0.8.65 cleanup; (b) the live "test connection" probe is intentionally deferred to honor "opening /provider must not block on live I/O" — accept as a follow-up, or treat as a close-blocker for #3083?

Address provider dashboard review feedback.

- Treat Optional auth status as self-hosted in compact provider hints, covering SGLang and vLLM local runtimes.
- Classify vLLM reasoning stream visibility consistently with SGLang.
- Add regression coverage for both cases.

Verification:
- cargo fmt --all
- cargo test -p codewhale-tui --bin codewhale-tui --locked provider_dashboard
- cargo test -p codewhale-tui --bin codewhale-tui --locked self_hosted

Hmbown commented Jun 24, 2026

Owner Author

Addressed the provider-dashboard review nits in dad6695:

  • SGLang/vLLM no-key rows now show the (self-hosted) hint via ProviderAuthStatus::Optional.
  • vLLM now uses the same default structured-thinking stream visibility as SGLang.
  • Added regression coverage for both cases.

Verification:

  • cargo fmt --all
  • cargo test -p codewhale-tui --bin codewhale-tui --locked provider_dashboard — 10 passed
  • cargo test -p codewhale-tui --bin codewhale-tui --locked self_hosted — 5 passed


claude Bot commented Jun 24, 2026


Claude finished @Hmbown's task in 1m 47s


Follow-up Review — feat(tui): #3083 provider dashboard

Tasks

  • Gather context (trigger: PR synchronize after Hmbown addressed review nits in dad6695)
  • Read the updated diff to verify fixes for the two correctness issues
  • Check for any new issues introduced by the fix
  • Post findings

Verification of dad6695

Both issues from the first review are correctly resolved.

Issue 1 — Self-hosted hint for SGLang/vLLM: FIXED

The predicate widening in compact_hint() (line 415–418) is exactly right:

let self_hosted = if matches!(
    self.auth_status,
    ProviderAuthStatus::Local | ProviderAuthStatus::Optional
) {

Importantly, ProviderAuthStatus::Optional is only returned by auth_status_for() for ApiProvider::Sglang | ApiProvider::Vllm on the no-key path (line 706–711). No cloud provider returns Optional — they fall through to Configured or Missing. So the widened predicate is correctly scoped and cannot misclassify a cloud provider as self-hosted.

This is also internally consistent: the existing readiness check at line 773 already uses Local | Optional => LocalReady, so the hint now aligns with the readiness posture rather than diverging from it.
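The scoping argument can be sketched as follows. The `auth_status_for` name, the four status variants, and the `Local | Optional => LocalReady` mapping come from this review; the reduced `ApiProvider` enum with a catch-all `Cloud` variant, the `has_key` parameter, the keyed-SGLang/vLLM result, and the other `Readiness` variant names are assumptions for illustration.

```rust
/// Reduced, hypothetical provider enum; `Cloud` stands in for any hosted provider.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ApiProvider {
    Ollama,
    Sglang,
    Vllm,
    Cloud,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProviderAuthStatus {
    Local,
    Optional,
    Configured,
    Missing,
}

/// `LocalReady` is named in the review; the other variants are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Readiness {
    LocalReady,
    Ready,
    NeedsCredential,
}

/// Assumed shape: only SGLang/vLLM without a key yield Optional.
pub fn auth_status_for(provider: ApiProvider, has_key: bool) -> ProviderAuthStatus {
    match (provider, has_key) {
        (ApiProvider::Ollama, _) => ProviderAuthStatus::Local,
        (_, true) => ProviderAuthStatus::Configured,
        (ApiProvider::Sglang | ApiProvider::Vllm, false) => ProviderAuthStatus::Optional,
        (ApiProvider::Cloud, false) => ProviderAuthStatus::Missing,
    }
}

/// Readiness mapping described in the review: Local | Optional => LocalReady.
pub fn readiness_for(status: ProviderAuthStatus) -> Readiness {
    match status {
        ProviderAuthStatus::Local | ProviderAuthStatus::Optional => Readiness::LocalReady,
        ProviderAuthStatus::Configured => Readiness::Ready,
        ProviderAuthStatus::Missing => Readiness::NeedsCredential,
    }
}

/// Same predicate the hint uses, so hint and readiness cannot diverge.
pub fn is_self_hosted(status: ProviderAuthStatus) -> bool {
    matches!(
        status,
        ProviderAuthStatus::Local | ProviderAuthStatus::Optional
    )
}
```

Under these assumptions the hint appears exactly when readiness is `LocalReady`, and a cloud provider can never reach the `Optional` arm.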

The test correctly exercises both the auth_status value and the compact_hint() string for SGLang. vLLM's hint is implicitly covered by the same auth_status_for() branch, so a redundant vLLM-specific hint assertion isn't needed.

Issue 2 — vLLM reasoning stream visibility: FIXED

ApiProvider::Vllm is added to the StructuredThinking arm (line 691), and the new test self_hosted_reasoning_visibility_covers_vllm locks in both SGLang and vLLM together.

No new issues introduced. The fix commit is minimal (26 lines, single file, no behavioral changes beyond the two targeted paths).


The earlier advisory note on humanize_token_count K-band truncation stands as a low-priority cosmetic point, not a blocker.

PR looks clean. Both correctness issues from the first pass are resolved with appropriate tests. Ready to land when Hunter confirms the deferred "test connection" probe is acceptable as a follow-up.

@Hmbown Hmbown merged commit ac9fa7c into main Jun 24, 2026
15 checks passed
Development

Successfully merging this pull request may close these issues.

v0.8.65: /provider readiness dashboard from route/catalog projections