From 6074ac8aeb3f80a9f568c1889aa325b57a722c02 Mon Sep 17 00:00:00 2001 From: Jose Alekhinne Date: Tue, 24 Mar 2026 18:48:57 -0700 Subject: [PATCH] /ctx-architecture principal Signed-off-by: Jose Alekhinne --- .../claude/skills/ctx-architecture/SKILL.md | 687 +++++++++++++++++- 1 file changed, 666 insertions(+), 21 deletions(-) diff --git a/internal/assets/claude/skills/ctx-architecture/SKILL.md b/internal/assets/claude/skills/ctx-architecture/SKILL.md index 0a6fdcd5..a5c29289 100644 --- a/internal/assets/claude/skills/ctx-architecture/SKILL.md +++ b/internal/assets/claude/skills/ctx-architecture/SKILL.md @@ -1,6 +1,6 @@ --- name: ctx-architecture -description: "Build and maintain architecture maps. Use to create or refresh ARCHITECTURE.md and DETAILED_DESIGN.md." +description: "Build and maintain architecture maps. Use to create or refresh ARCHITECTURE.md and DETAILED_DESIGN.md. Supports principal mode for deeper analysis: vision, future direction, bottlenecks, implementation alternatives, gaps, upstream proposals, and intervention points." allowed-tools: Bash(ctx:*), Bash(git:*), Bash(go:*), Read, Write, Edit, Glob, Grep --- @@ -10,6 +10,38 @@ and **DETAILED_DESIGN.md** (deep per-module reference, consulted on-demand). Coverage is tracked in `map-tracking.json` so each run extends the map rather than re-analyzing everything. +## Execution Priority + +When time or context budget runs short, execute in this order. +Never skip a tier to do a lower one: + +1. **Authoritative truth first** — ARCHITECTURE.md + DETAILED_DESIGN.md + must be accurate and honest. Incomplete is fine; wrong is not. +2. **Surface uncertainty honestly** — partial coverage with correct + confidence scores beats inflated scores. Mark what you don't know. +3. **Offer judgment only where grounded** — danger zones, extension + points, improvement ideas only for modules you actually analyzed. +4. 
**Prefer fewer sharp insights over many shallow sections** — a + CHEAT-SHEETS.md with one excellent cheat sheet beats five thin ones. + An ARCHITECTURE-PRINCIPAL.md with three concrete risks beats ten + vague ones. + +## Mode Detection + +Read the invocation for a mode keyword: + +- **No keyword** (or `default`) → run **Default mode** (Phases 0–5 below) +- `principal` → run **Principal mode** (Phases 0–5 + Principal phases P1–P3) + +Examples: +```text +/ctx-architecture +/ctx-architecture principal +/ctx-architecture (principal) +``` + +--- + ## When to Use - First time setting up architecture documentation for a project @@ -19,6 +51,7 @@ extends the map rather than re-analyzing everything. - When the agent nudges that the map is stale (>30 days, commits detected) - When you need deep understanding of a module before working on it +- When you want strategic analysis of the architecture (principal mode) ## When NOT to Use @@ -30,7 +63,9 @@ extends the map rather than re-analyzing everything. - When the user has opted out (`opted_out: true` in map-tracking.json) -## Execution +--- + +## Default Mode (Phases 0–5) ### Phase 0: Check Opt-Out @@ -42,6 +77,57 @@ Read `.context/map-tracking.json`. If it exists and Then stop. +### Phase 0.5: Quick Structure Scan + Focus Areas + +Before any deep analysis, do a lightweight structural survey to +discover what the project actually contains. This takes seconds +and makes the focus-area question concrete instead of open-ended. + +**Scan steps** (no file reads — structure only): + +```bash +# Detect ecosystem +ls go.mod package.json Cargo.toml pyproject.toml 2>/dev/null + +# List top-level source directories / packages +# Go: +go list ./... 2>/dev/null | sed 's|.*/||' | sort -u | head -40 +# or: ls internal/ cmd/ pkg/ 2>/dev/null + +# Node/other: ls src/ lib/ packages/ 2>/dev/null + +# Large monorepo guard: if >100 packages, limit to top 2 levels only +find . -mindepth 1 -maxdepth 2 -type d \ + ! -path './.git/*' ! 
-path './vendor/*' ! -path './node_modules/*' \
+  | sort | head -60
+```
+
+**Then ask** (present the discovered package/module names):
+
+```
+I found these top-level packages/modules:
+  [list from scan]
+
+Any specific areas you'd like me to go deep on? You can name
+packages from the list above, describe subsystems (e.g. "the
+reconciler loop", "auth handling"), or say "all" for a uniform
+pass.
+
+Skip or press enter to do a standard uniform pass.
+```
+
+**If focus areas are given**, carry them forward:
+- Phase 2 goes deep on focus packages (target confidence ≥ 0.8)
+- Direct dependencies of focus packages get a solid pass (≥ 0.7)
+- All other packages are stubbed (0.2) unless they appear as
+  transitive dependencies
+- DETAILED_DESIGN.md sections for focus packages are written first
+  and in full detail
+- Principal mode Phase P2 strategic questions reference the focus
+  areas explicitly
+
+**If "all" or no answer**, proceed with standard uniform analysis.
+
 ### Phase 1: Assess Current State
 
 Determine if this is a **first run** or **subsequent run**:
@@ -65,6 +151,30 @@ git log --oneline --since="<last_run>" \
 
 ### Phase 2: Survey (First Run) or Analyze Frontier (Subsequent Run)
 
+**Optional: GitNexus MCP enrichment**
+
+If GitNexus resources are available in the environment (the
+`list_repos` MCP tool responds), use them first for clustering
+and symbol context before the manual survey. Otherwise continue
+with the manual survey below — no need to ask.
GitNexus provides pre-built knowledge graph data
+that significantly speeds up and deepens the survey phase:
+
+| GitNexus tool/resource             | Use for |
+|------------------------------------|---------|
+| `gitnexus://repo/{name}/clusters`  | Domain groupings for DETAILED_DESIGN splitting — use these as the starting domain boundaries instead of inferring manually |
+| `gitnexus://repo/{name}/processes` | Seed CHEAT-SHEETS.md lifecycle flows — these are execution flows extracted from the graph |
+| `context({symbol})`                | Per-symbol 360° view — enriches Exported API and Dependencies sections; catches cross-package references manual reading misses |
+| `impact({symbol})`                 | Blast radius per symbol — use as raw input for Danger zones (high blast radius = high modification risk); still add the *reasoning* |
+| `generate_map` prompt              | Seed ARCHITECTURE.md mermaid diagrams; verify and annotate before using |
+
+When GitNexus is available: run the graph queries first, then use
+manual file reading to add judgment, rationale, and "why" context
+that the graph alone can't provide. Do not copy graph output
+verbatim — synthesize it.
+
+When GitNexus is not available: proceed with full manual survey
+below. The skill works without it.
+
 **First run: full survey:**
 
+0.
Run `ctx deps` to bootstrap the dependency graph:
@@ -125,17 +235,139 @@ format:
 
 **Data flow**: Entry → Processing → Output
 
+Include an ASCII sequence diagram when there are 3+ actors or
+non-obvious ordering:
+
+```
+Caller        Scheduler        Worker
+  |--schedule()-->|               |
+  |               |--dispatch()-->|
+  |               |<--result------|
+  |<--done--------|               |
+```
+
+Include an ASCII state diagram when the module manages lifecycle
+or status transitions:
+
+```
+[Init] --configure()--> [Ready] --start()--> [Running]
+                           ^                     |
+                           |                     |--stop()---> [Stopped]
+                           |                     |--error()--> [Failed]
+                           +<------ reset() ------ [Stopped]
+```
+
+Use plain ASCII (not mermaid) for DETAILED_DESIGN.md — it renders
+in any terminal, editor, or raw file view without a renderer.
+Reserve mermaid for ARCHITECTURE.md only.
+
 **Edge cases**:
 - Condition → behavior
 
+**Performance considerations**:
+- Known or likely bottlenecks (hot paths, allocation pressure,
+  lock contention, I/O bound operations)
+- Scale assumptions baked into the design (e.g. "assumes <1000
+  items", "single-threaded reconcile loop")
+- What breaks first under load
+
+**Danger zones** (top 3 riskiest modification points):
+1. `<file or function>` — why it's dangerous (hidden coupling,
+   ordering assumption, shared mutable state, etc.)
+2. ...
+3. ...
+
+**Control loop & ownership** (if the module participates in
+reconciliation or state management):
+- What owns the reconciliation for this module's resources?
+- What is source of truth vs. derived/cached state?
+- What triggers re-reconciliation?
+
+**Extension points** (where features would naturally attach):
+- `<interface or hook>` — what kind of extension fits here
+
+**Improvement ideas** (1–3 concrete suggestions, not generic):
+- `<suggestion>` — what it fixes and why it's feasible
+
 **Dependencies**: list of internal packages used
 ```
 
+**Splitting DETAILED_DESIGN.md when it grows large:**
+
+When DETAILED_DESIGN.md exceeds ~600 lines or covers 3+ natural
+domains, split into domain files and keep a shallow index:
+
+- `DETAILED_DESIGN.md` — index only (domain name, file pointer,
+  module list, one-line domain purpose)
+- `DETAILED_DESIGN-<domain>.md` — full module sections for that
+  domain
+
+Domains are natural groupings, not arbitrary splits. Examples:
+- storage, auth, api, reconciler, cli, observability
+- If no natural grouping exists, split by: core vs. peripheral
+
+Index format:
+```markdown
+# Detailed Design Index
+
+| Domain  | File                       | Modules              | Summary           |
+|---------|----------------------------|----------------------|-------------------|
+| storage | DETAILED_DESIGN-storage.md | pkg/store, pkg/cache | Persistence layer |
+| auth    | DETAILED_DESIGN-auth.md    | pkg/authn, pkg/authz | Identity + policy |
+
+> See individual files for module-level detail.
+```
+
+Update `map-tracking.json` to record which domain file each module
+lives in:
+```json
+"pkg/store": {
+  "domain_file": "DETAILED_DESIGN-storage.md",
+  ...
+}
+```
+
 Each section is self-contained. The agent reads specific sections
 when working on a module, not the entire file.
 
+**CHEAT-SHEETS.md**: write (or update) short mental models for
+key lifecycle flows. One cheat sheet per major lifecycle or flow
+identified in the codebase. Format:
+
+```markdown
+## <Flow name>
+
+Steps:
+1. <step>
+2. <step>
+3. ...
+
+Key invariants:
+- <invariant>
+
+Common failure modes:
+- <failure mode>
+
+Flow (ASCII — include when sequence or state is non-obvious):
+
+  [Trigger] --> [Step A] --> [Step B] --> [Done]
+                                |
+                             [Error] --> [Retry] --> [Dead Letter]
+```
+
+Aim for cheat sheets that fit on one screen.
If a flow needs more +than ~15 steps, split it. Write cheat sheets for at minimum: +- The main entry-point lifecycle (e.g. controller reconcile loop, + request handler, CLI command dispatch) +- Any policy or rule evaluation flow +- Any significant async or background job lifecycle + +Skip if the project has no meaningful lifecycles (e.g. a pure +library with no runtime behavior). + ### Phase 4: Update Tracking + Write `.context/map-tracking.json` with: ```json @@ -155,28 +387,382 @@ Write `.context/map-tracking.json` with: } ``` -### Phase 5: Report +### Phase 5: Convergence Report + Search Prompts + +Print a structured convergence report. This is the primary output +the user reads — make it clear and actionable. + +**Format:** + +``` +## Convergence Report + +### By Module + +| Module | Confidence | Status | Blocker | +|--------|------------|--------|---------| +| pkg/foo | 0.9 | ✅ Converged | — | +| pkg/bar | 0.6 | 🔶 Shallow | Internal flow unclear | +| pkg/baz | 0.2 | 🔴 Stubbed | Not analyzed | + +### By Domain (if natural groupings exist) -Summarize what was done: +Group related modules and show aggregate coverage: + e.g. "Auth layer: 2/3 modules converged (avg 0.72)" -1. **Modules analyzed**: list with old → new confidence -2. **Documents updated**: which sections changed in each doc -3. **Overall coverage**: fraction of modules at confidence ≥ 0.7 -4. 
**Remaining frontier**: modules still below 0.7 or unanalyzed
+
+### Overall
+
+- Total modules: N
+- Converged (≥ 0.9): N ✅
+- Solid (0.7–0.89): N 🟡
+- Shallow (0.4–0.69): N 🔶
+- Stubbed (< 0.4): N 🔴
+
+### What Would Help Next
+
+For each non-converged module, print a specific suggestion:
+
+🔶 pkg/bar (0.6) — Shallow
+   → Read the test files to understand expected behavior under
+     edge cases: `pkg/bar/*_test.go`
+   → Trace the internal flow through `<entry point>`
+   → Ask: "walk me through what happens when X"
+
+🔴 pkg/baz (0.2) — Not analyzed
+   → Run /ctx-architecture with focus area: pkg/baz
+   → Or: open pkg/baz/README.md if present
+
+### Convergence Verdict
+
+One of:
+- ✅ CONVERGED — all modules ≥ 0.9, frontier empty. Further runs
+  without code changes won't improve coverage.
+- 🟡 MOSTLY CONVERGED — core modules ≥ 0.9, peripheral modules
+  shallow. Diminishing returns on full re-run; use focus areas.
+- 🔶 PARTIAL — significant modules below 0.7. Re-run with focus
+  areas or read tests.
+- 🔴 INCOMPLETE — substantial portions unanalyzed. Run again.
+```
+
+**Convergence thresholds:**
+- Module is **converged** at confidence ≥ 0.9
+- Project is **converged** when all non-peripheral modules ≥ 0.9
+- Peripheral = no other modules depend on it AND it has no
+  exported API surface (pure internal helpers, generated code,
+  vendor)
+
+**Blocker vocabulary** (use these consistently in the table):
+- `Internal flow unclear` — exports known, internals not traced
+- `Not analyzed` — directory listed only
+- `Tests not read` — implementation known, behavior under edge
+  cases unknown
+- `Design rationale unknown` — code understood, "why" is unclear
+- `Converged` — nothing left to learn from static reading
+
+---
+
+After printing the convergence verdict, append a **Search Prompts**
+section. The skill has just read the codebase and knows its jargon —
+this is the most useful thing it can hand back to someone who is
+not blocked by intelligence but by not knowing the right words.
+
+**Format:**
+
+```
+## Search Prompts
+
+The right keyword changes everything. Based on what I found in
+the codebase, here are targeted searches worth running — in your
+internal docs, Confluence, Notion, Slack, or publicly:
+
+### Fill the gaps (ranked by how much they'd help)
+
+For modules/areas still below 0.9:
+
+🔶 pkg/bar — Internal flow unclear
+   Try searching:
+   - "<key type> design" or "<package> internals"
+   - "<exported function> <collaborator type>"
+   - "why does <package> use <pattern>" (ADR or design doc)
+
+🔴 pkg/baz — Not analyzed
+   Try searching:
+   - "<package purpose> explained"
+   - "<key type> behavior"
+
+### Concepts worth understanding deeply
+
+List 3–5 technical concepts the codebase clearly depends on but
+that can't be learned from the code alone. Give the exact search
+phrase, not a topic:
+
+- "<concept> explained" — e.g. "etcd watch semantics
+  explained", "CRDT merge strategies", "OIDC token refresh flow"
+- "<pattern> tradeoffs" — e.g. "saga pattern vs 2PC tradeoffs"
+
+### Architecture decision records (if relevant)
+
+If the code shows signs of a deliberate non-obvious choice
+(e.g. custom retry logic instead of a library, unusual data
+structure), suggest:
+  - "<project> <choice> ADR"
+  - "<project> <choice> RFC"
+  - "why doesn't <project> use <obvious alternative>"
+
+---
+Note: I won't run these searches for you — you may have internal
+docs where these are more useful than public results, and you know
+which sources to trust. Pick the phrases that match what's blocking
+you.
+``` + +**Rules for this section:** +- Always generate search prompts, even for converged modules — + there's always design rationale that code can't express +- Phrases must be concrete and use actual names/types from the + codebase — no generic "learn more about X" fluff +- Rank by usefulness: gaps in shallow modules first, concepts + second, ADRs third +- Maximum ~10 phrases total; fewer sharp ones beat many vague ones +- Default: do NOT run the searches yourself +- Exception: if the user requested principal-mode depth AND no + internal search tools are available, you may run public searches + for upstream ADRs, peer-project design docs, or KEPs — but only + for concepts the codebase shows clear dependency on; note what + you searched and what you found + +--- + +## Principal Mode (Phases 0–5 + P1–P3) + +Run all default mode phases first (0–5), then continue below. +Principal mode is for strategic thinking — beyond "what is" to +"what could be" and "what should concern us." + +### Phase P1: Extended Context Gathering + +In addition to the default phase sources, read: + +- `.context/TASKS.md` — outstanding work, future plans +- `CHANGELOG.md` or `docs/changelog.md` — trajectory of decisions +- `docs/` — any design rationale in user-facing docs +- Recent git log: `git log --oneline -30` + +### Phase P2: Gather Strategic Context + +Two-tier behavior — do not stall: + +**If answers are available** (user provided them in the prompt, +or they exist in `.context/TASKS.md` / `DECISIONS.md`): use them. +Do not ask for what you already have. + +**If answers are not available**: do NOT stop. Generate a +provisional principal analysis with assumptions explicitly labeled +(see Principal Mode Fallback below). Include a "Questions That +Would Sharpen This" section at the end of ARCHITECTURE-PRINCIPAL.md. 
+
+When asking the user, present all questions at once as a numbered
+list — do not ask one-at-a-time:
+
+```
+Before I write the principal analysis, a few questions — skip
+or say "unsure" on anything you don't know:
+
+0. **Focus areas** (if not already set in Phase 0.5)
+
+1. **Vision**: What is this project trying to become in 12–24 months?
+
+2. **Future direction**: Any architectural pivots being considered?
+   (plugin system, multi-tenant, cloud sync, daemon model, etc.)
+
+3. **Known bottlenecks**: Where does the current design hurt you?
+
+4. **Implementation alternatives**: Any decisions you'd do
+   differently starting fresh?
+
+5. **Gaps**: What's missing that you expect to need?
+
+6. **Areas of improvement**: Known tech debt or structural awkwardness?
+```
+
+### Phase P3: Write Principal Analysis
+
+After collecting answers, write `.context/ARCHITECTURE-PRINCIPAL.md`
+(separate from `ARCHITECTURE.md` — speculation must not pollute
+the authoritative doc).
+
+```markdown
+# Architecture — Principal Analysis
+_Generated <date>. Strategic analysis only; see ARCHITECTURE.md
+for the authoritative architecture reference._
+
+## Current State Summary
+[Condensed narrative of the current architecture — ~1 page max]
+
+## Vision Alignment
+[How does the current architecture support or constrain the stated
+vision? What structural changes would enable it?]
+
+## Future Direction
+[Architectural implications of planned pivots or new capabilities.
+What would need to change if [feature X] were added?]
+ +## Known Bottlenecks +[Analysis of performance, scalability, or dev-experience pain +points identified in the codebase or raised by the user] + +## Implementation Alternatives +[For 2–3 key design decisions: current approach, alternatives, +tradeoffs] + +## Gaps +[Missing capabilities or abstractions the architecture doesn't +handle yet but probably will need to] + +## Areas of Improvement +Ranked by impact/effort: +- **High impact, low effort** (do first) +- **High impact, high effort** (plan for) +- **Low impact** (defer or skip) + +## Risks +[Architectural risks as the system scales, team grows, or +requirements evolve] + +## Intervention Points +Top 5 highest-leverage places to implement new features or +improvements, ranked by impact/effort: +1. `` — what kind of change fits here and why +2. ... + +(These are concrete locations — package paths, interface names, +function boundaries — not vague subsystem labels.) + +## Upstream Proposals +2–3 changes worth proposing to the project upstream (KEP / RFC / +issue style thinking). For each: +- **What**: one-sentence description of the change +- **Why**: what problem it solves that the current design can't +- **Where**: which abstraction boundary it touches +- **Risk**: what it breaks or complicates + +Each proposal must cross an abstraction boundary — it must affect +how modules interact, not just refactor internals. If it doesn't +change an interface, a contract, or an ownership boundary, it's +not upstream-worthy; it's a local improvement (put it in +Improvement Ideas instead). + +## Productization Gaps +What would need to change for this to work at enterprise scale? +- Multi-cluster / multi-tenant gaps +- Observability and debuggability holes +- Operational hardening missing from current design +- What a large customer would hit first + +## Failure-First Analysis +[Hidden assumptions baked into the architecture. What breaks +silently vs. loudly? What would cause a cascade? 
What does the +system assume about its environment that may not hold?] + +## Onboarding Friction +[Practical, not theoretical — this is what a new engineer actually +hits in week one:] +- What makes this system hard to understand quickly? +- Which modules require tribal knowledge to use safely? +- Where would a new engineer get stuck first, and why? +- What isn't written down anywhere? +``` + +**Boundary hygiene** — ARCHITECTURE-PRINCIPAL.md is for synthesis, +leverage, risk, direction, and judgment. Do NOT restate module +details that already exist in DETAILED_DESIGN.md. Reference module +paths only where needed to ground an argument. If you find yourself +summarizing what a module does, stop — link to it instead. + +**Principal mode fallback** — if Phase P2 answers were not provided, +label speculative sections clearly and add at the end: + +```markdown +## Questions That Would Sharpen This Analysis + +Answering any of these would move speculative sections to grounded ones: + +1. **Vision** — What is this project trying to become in 12–24 months? +2. **Future direction** — Any architectural pivots being considered? +3. **Known bottlenecks** — Where does the current design hurt? +4. **Assumptions marked** — These sections are labeled [inferred]: + [list them] +``` + +**Autonomous inferences** — principal mode must also answer the +following from the codebase alone, without waiting for user input. +These are things the code is silently deciding. Surface them: + +- Where are abstraction boundaries likely to calcify under growth? +- Which current APIs are accidentally becoming public contracts? +- What will become expensive when team size or data volume doubles? +- Where is the architecture optimized for current workflow rather + than long-term extensibility? +- Which parts are structurally elegant but strategically wrong for + the likely future? + +These go in a dedicated "Silent Choices" section in +ARCHITECTURE-PRINCIPAL.md. The code is making bets — name them. 
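One of these silent choices, which packages are calcifying into de-facto contracts, can be seeded mechanically before judgment is applied: an internal package with high import fan-in is the likely candidate. A rough heuristic sketch only (it scans grouped Go import blocks and misses single-line imports; the `example.com/m` module path in the test below is hypothetical):

```python
import re
from collections import Counter

# Grouped Go import blocks: import ( ... )
IMPORT_BLOCK_RE = re.compile(r'import\s*\(([^)]*)\)')
# Quoted import paths inside a block.
IMPORT_PATH_RE = re.compile(r'"([^"]+)"')

def internal_imports(go_source, module_prefix):
    """Return the internal packages one Go file imports.

    Only grouped import blocks are scanned; single-line imports
    are a known blind spot of this sketch."""
    found = []
    for block in IMPORT_BLOCK_RE.findall(go_source):
        for path in IMPORT_PATH_RE.findall(block):
            if path.startswith(module_prefix):
                found.append(path)
    return found

def fanin(sources, module_prefix):
    """sources: {package_dir: [go_source, ...]}.

    Returns a Counter mapping each internal package to the number
    of distinct packages that import it; high counts flag likely
    calcification points."""
    importers = {}
    for pkg_dir, files in sources.items():
        for src in files:
            for imp in internal_imports(src, module_prefix):
                importers.setdefault(imp, set()).add(pkg_dir)
    return Counter({pkg: len(dirs) for pkg, dirs in importers.items()})
```

Feed `fanin` the repo's `.go` files grouped by directory and name the top entries in "Silent Choices"; the judgment about whether each one *should* be a contract stays manual.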
+
+**Opinion floor** — ARCHITECTURE-PRINCIPAL.md must contain at minimum:
+- 3 risks (specific, not "this could be slow")
+- 3 improvement ideas (concrete, not "add more tests")
+- 2 upstream opportunities (actionable, not "contribute more")
+
+Generate opinions, not just descriptions. If you find yourself
+writing neutral summaries, push harder.
+
+When in doubt, prefer a strong, falsifiable opinion over a safe,
+generic one. Weak opinions are noise; strong opinions can be
+corrected.
+
+**Cross-project comparison** (include when the codebase shows
+non-obvious design choices or when focus areas have well-known
+peers):
+
+For any module where a comparable exists in another project, add:
+```markdown
+### Compared to <peer project> / <peer module>
+
+- What <this module> does differently
+- What <peer> does better
+- What could be unified or learned from <peer>
+```
+
+Examples worth comparing when relevant:
+- Velero vs Stash (backup)
+- controller-runtime reconciler vs custom loops
+- Gatekeeper vs Kyverno (policy)
+- Any CNCF project vs its closest peer
+
+Skip if no meaningful peer exists. Do not force comparisons.
+
+Be direct. This document is for engineering judgment, not external
+audiences.
+
+---
+
 ## Confidence Rubric
 
-Use these levels for honest self-assessment:
+Score by **decision usefulness**, not descriptive completeness.
+Ask: "What could an engineer safely do with this understanding?"
-| Level | Meaning | -|-----------|-----------------------------------------------------------------------| -| 0.0 - 0.3 | Stubbed: directory listed but contents not examined | -| 0.4 - 0.6 | Shallow: purpose understood, key exports known, internal flow unclear | -| 0.7 - 0.8 | Solid: can explain exports, data flow, and main code paths | -| 0.9 - 1.0 | Deep: can explain edge cases, error handling, design rationale | +| Level | Decision usefulness | +|------------|------------------------------------------------------------------------------| +| 0.0 - 0.3 | Stubbed: not safe to make any decisions; directory listed only | +| 0.4 - 0.6 | Shallow: can describe purpose; not safe to modify without more reading | +| 0.7 - 0.79 | Safe to make localized changes with care; can review simple PRs | +| 0.8 - 0.89 | Can reason about design tradeoffs; safe to design changes in this module | +| 0.9 - 1.0 | Can predict likely breakage from non-trivial changes; safe to own the module | -A confidence of 0.9 means "I could explain every exported function's -purpose and the data flow through this module." Not "I read the file." +Inflate scores and you lie to the next agent that reads the tracking +file. Under-score and the convergence report will never clear. +Score the decision-usefulness honestly. ## Opt-Out Handling @@ -206,13 +792,72 @@ The nudge is a suggestion, not automatic execution. 
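The rubric's thresholds line up with the bucket counts the Phase 5 convergence report prints. A minimal sketch of that bucketing, assuming the per-module `confidence` entries under a `coverage` key in `map-tracking.json` (the self-check below expects "version, coverage entries"); the verdict mapping is simplified and ignores the peripheral-module carve-out:

```python
# Thresholds from the convergence report: converged >= 0.9,
# solid 0.7-0.89, shallow 0.4-0.69, stubbed < 0.4.
BUCKETS = [
    ("converged", 0.9),
    ("solid", 0.7),
    ("shallow", 0.4),
    ("stubbed", 0.0),
]

def bucket(confidence):
    """Map a confidence score to its convergence bucket."""
    for name, floor in BUCKETS:
        if confidence >= floor:
            return name
    return "stubbed"  # negative or garbage scores count as stubbed

def summarize(coverage):
    """Count modules per bucket from a {module: {"confidence": x}} map;
    a module with no recorded confidence counts as stubbed."""
    counts = {name: 0 for name, _ in BUCKETS}
    for entry in coverage.values():
        counts[bucket(entry.get("confidence", 0.0))] += 1
    return counts

def verdict(counts):
    """Simplified verdict line: every module must clear the bar,
    with no peripheral-module exemption."""
    total = sum(counts.values())
    if total and counts["converged"] == total:
        return "CONVERGED"
    if counts["stubbed"]:
        return "INCOMPLETE"
    if counts["shallow"]:
        return "PARTIAL"
    return "MOSTLY CONVERGED"
```

Load the tracking file with `json.load` and pass its `coverage` map to `summarize`; the honesty requirement above is what makes the output meaningful.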
After running, verify: - [ ] ARCHITECTURE.md is under 4000 tokens (~16KB) - [ ] ARCHITECTURE.md has all required sections (Overview, Dependency - Graph, Component Map, Data Flow, Key Patterns, File Layout) + Graph, Component Map, Data Flow, Key Patterns, File Layout) - [ ] DETAILED_DESIGN.md uses consistent per-module format - [ ] Each module section has Purpose, Key types, Exported API, - Data flow, Edge cases, Dependencies + Data flow, Edge cases, Performance considerations, Control + loop & ownership (if applicable), Danger zones, Extension + points, Improvement ideas, Dependencies +- [ ] ASCII sequence diagram included when 3+ actors or + non-obvious ordering +- [ ] ASCII state diagram included when module manages lifecycle + or status transitions +- [ ] No mermaid in DETAILED_DESIGN.md (ASCII only) +- [ ] If DETAILED_DESIGN.md > ~600 lines or 3+ domains: split + into domain files with shallow index +- [ ] map-tracking.json records domain_file for each module + when split - [ ] map-tracking.json is valid JSON with version, coverage entries - [ ] Confidence levels are honest (not inflated) - [ ] Stale modules were re-analyzed, not just marked current - [ ] ARCHITECTURE.md was only updated for boundary/flow/dependency - changes, not internal implementation details -- [ ] Report was provided summarizing what changed + changes, not internal implementation details +- [ ] Convergence report printed with per-module table +- [ ] Domain groupings shown if natural groupings exist +- [ ] Each non-converged module has a specific "what would help" + suggestion (not generic advice) +- [ ] Overall convergence verdict stated (CONVERGED / MOSTLY / + PARTIAL / INCOMPLETE) +- [ ] Blocker column uses consistent vocabulary +- [ ] Search Prompts section printed after convergence verdict +- [ ] Search phrases use actual type/function/pattern names from + the codebase (not generic topics) +- [ ] Phrases ranked: shallow-module gaps first, concepts second, + ADRs third +- [ ] No more 
than ~10 phrases total +- [ ] Skill did NOT run the searches itself +- [ ] Phase 0.5 structure scan was run before any deep analysis +- [ ] Focus areas question was asked with actual package names (not + open-ended) +- [ ] If focus areas given: deep analysis concentrated there; other + packages stubbed at 0.2 unless direct dependencies +- [ ] Principal mode: P2 answers used if available; if not, + provisional analysis written with [inferred] labels +- [ ] Principal mode: "Questions That Would Sharpen This" section + present if P2 answers were not provided +- [ ] Principal mode: output written to `ARCHITECTURE-PRINCIPAL.md`, + not overwriting `ARCHITECTURE.md` +- [ ] Principal mode: "Silent Choices" section present (autonomous + inferences from code — abstraction calcification, accidental + contracts, scale costs, strategic bets) +- [ ] Principal mode: ARCHITECTURE-PRINCIPAL.md does not restate + DETAILED_DESIGN.md content — links to module paths instead +- [ ] CHEAT-SHEETS.md written with at least one lifecycle flow +- [ ] Each cheat sheet fits ~one screen; long flows are split +- [ ] Danger zones section present in each DETAILED_DESIGN module + (top 3, with reasoning — not just "this is complex") +- [ ] Extension points section present in each module +- [ ] Principal mode: Failure-First Analysis section written +- [ ] Principal mode: Onboarding Friction section present (practical, + week-one concerns — not generic "hard to understand") +- [ ] Principal mode: Upstream Proposals cross abstraction boundaries + (not internal refactors) +- [ ] Principal mode: Intervention Points section present (concrete + locations, not vague labels) +- [ ] Principal mode: Upstream Proposals section present (2–3 items + with what/why/where/risk) +- [ ] Principal mode: Productization Gaps section present +- [ ] Principal mode: opinion floor met (≥3 risks, ≥3 improvements, + ≥2 upstream opportunities — specific, not generic) +- [ ] Principal mode: cross-project comparisons included 
where + meaningful peers exist (not forced)