
v0.8.65: Evaluate the Orchestration disposition in constitution.md — help, deadweight, or harm? #3494

Description

@Hmbown

We are trying out the ## Orchestration disposition added to crates/tui/src/prompts/constitution.md in #3470 (merged for the 0.8.65 rebuild). This issue tracks its empirical evaluation, to be done after the 0.8.65 global rebuild and hands-on testing.

Question

Does instilling the orchestrator disposition ("when work outgrows one context you are an orchestrator; delegate the doing, verify every returned slice, keep the loop alive") actually change agent behavior for the better? Or is it deadweight prompt tokens, or worse, does it cause over-orchestration of small tasks?

What to measure (post-rebuild)

  • Does the agent delegate appropriately on large/multi-context tasks vs. before?
  • Does it over-orchestrate small/obvious one-file tasks (the disposition explicitly warns against this — does the warning hold)?
  • Verification discipline: does "never trust a worker's done" translate to more ground-truth checks?
  • Token cost of the added constitution section vs. observed benefit.
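
To put a rough number on the token cost in the last bullet, one option is to pull the section out of constitution.md and size it. The sketch below is only illustrative: `extract_section` and `rough_token_estimate` are hypothetical helpers, the exact heading text `## Orchestration` is assumed from the wording above, and the 4-characters-per-token figure is a rule of thumb rather than the real tokenizer.

```python
def extract_section(markdown: str, heading: str) -> str:
    """Return the section starting at `heading`, up to the next heading of the
    same or a shallower level. Deeper subsections are kept.

    Caveat: this does not understand fenced code blocks, so a line such as
    "# comment" inside a fence would be mistaken for a heading.
    """
    level = len(heading) - len(heading.lstrip("#"))
    collected = []
    inside = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if not inside:
            if stripped == heading:
                inside = True
                collected.append(line)
            continue
        if stripped.startswith("#"):
            hashes = len(stripped) - len(stripped.lstrip("#"))
            # A real heading has a space after the run of '#' characters.
            if hashes <= level and stripped[hashes:hashes + 1] == " ":
                break
        collected.append(line)
    return "\n".join(collected)


def rough_token_estimate(text: str) -> int:
    """Assumption: about 4 characters per token, rounded up. Not a tokenizer."""
    return (len(text) + 3) // 4
```

In practice the input would be the contents of crates/tui/src/prompts/constitution.md read from disk, and the estimate is best treated as an order-of-magnitude figure to weigh against the observed benefit.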

How

Run an A/B feel-test during real 0.8.65 usage (disposition on vs. a build with the section removed), plus a few representative large tasks. Then decide: keep as-is, soften (e.g. the unqualified "Always have work in flight" at constitution.md:390), or remove.
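
Even a feel-test is easier to compare if each session is jotted down in the same shape. The following is a minimal sketch of such a tally, not an existing tool: the `Trial` fields, the arm labels, and `summarize` are all hypothetical names chosen to mirror the measures listed above (delegation on large tasks, over-orchestration on small ones, verification discipline).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    arm: str         # hypothetical labels: "with" or "without" the section
    task_size: str   # "small" or "large", judged by the tester
    delegated: bool  # did the agent hand work to a worker?
    verified: bool   # did it check the worker's result against ground truth?


def summarize(trials):
    """Group trials by (arm, task_size) and report simple rates per group."""
    totals = Counter()
    delegated = Counter()
    verified = Counter()
    for t in trials:
        key = (t.arm, t.task_size)
        totals[key] += 1
        delegated[key] += t.delegated  # True counts as 1
        verified[key] += t.verified
    return {
        key: {
            "n": n,
            "delegation_rate": delegated[key] / n,
            "verification_rate": verified[key] / n,
        }
        for key, n in totals.items()
    }
```

Read this way, a high delegation rate in the ("with", "small") group would point to over-orchestration, while a higher verification rate in ("with", "large") than in ("without", "large") would suggest the disposition is doing what it was added for. With only a handful of sessions, these rates are impressions rather than evidence.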

Ref: #3470 (the disposition), #3154 (Fleet EPIC).

Metadata

Assignees

No one assigned

    Labels

    question (Further information is requested)

    Projects

    Status
    Done

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests
