Skip to content

Fix /claude review/challenge: hard timeout + merge-base diff scope#1893

Open
jackspacelb wants to merge 5 commits into
garrytan:mainfrom
jackspacelb:fix-claude-reviewer-timeout
Open

Fix /claude review/challenge: hard timeout + merge-base diff scope#1893
jackspacelb wants to merge 5 commits into
garrytan:mainfrom
jackspacelb:fix-claude-reviewer-timeout

Conversation

@jackspacelb
Copy link
Copy Markdown

@jackspacelb jackspacelb commented Jun 7, 2026

Problem

/claude review/challenge/consult had two compounding problems that made the
nested-Claude "outside voice" unreliable:

  1. Unbounded run. The diff was piped into claude -p with no timeout and no
    size cap, so a large diff could hang effectively forever.
  2. Wrong diff scope. The diff was captured with git diff origin/<base>
    the working tree against the base tip. On a branch whose base has moved
    ahead, or whose base ref is stale/un-fetched, this balloons to hundreds of
    unrelated files. This is the deeper cause of "the review hangs": even with a
    timeout, the reviewer was handed a giant, mostly-irrelevant diff and produced
    nothing useful.

The sibling /codex skill already had timeout + exit-124 handling; /claude
never got it. (Both problems are still present on main, v1.56.0.0.)

Fix

Bound the run (commits 1–2):

  • Wrap all four claude -p invocations (review, challenge, consult-new,
    consult-resume) in a hard timeout: gtimeouttimeout, 600s. Inlined
    per block on purpose — each bash block in a SKILL.md runs in its own shell, so
    a helper defined in another block would be undefined at call time.
  • Prompt fed via < "$PROMPT_FILE" (not cat ... |), so timeout wraps
    claude directly. Deliberately not < /dev/null (would send an empty
    prompt).
  • Fail closed if no timeout binary exists (exit 125 + clear "install
    coreutils" message) rather than silently running unbounded.
  • On exit 124, state the run did NOT finish (do not treat as a pass) and print
    the first 120 lines of stderr before cleanup deletes it.

Scope the diff correctly (commit 3):

  • Diff from the merge base to the working tree:
    git diff "$(git merge-base origin/<base> HEAD)". Shows only this branch's
    changes and keeps uncommitted work. (A plain three-dot diff would drop
    uncommitted changes, so merge-base → working tree is used instead.)
  • Harden the base fetch: warn instead of silently diffing against a stale base.

Test

gen-skill-docs asserts the new stdin-redirect shape, timeout/124 + fail-closed
markers, and the merge-base diff scope; it rejects the old unbounded pipe and the
old base-tip diff, so neither can silently regress.

Verification

  • Forced-timeout smoke (timeout 2 sleep 10) → exit 124, handler fires.
  • Delivery smoke: claude -p ... < "$PROMPT_FILE" returns the exact prompt token
    (confirms the redirect delivers content, not an empty prompt).
  • Diff-scope, empirically, on a branch whose base moved ahead: old git diff origin/main = 10 files of noise; merge-base diff = exactly the 2 real files.
  • bun test test/gen-skill-docs.test.ts passes.

Left VERSION/CHANGELOG to maintainers. Not included (deliberately, as separate
concerns): a size-preflight/split path for legitimately huge diffs, and
fail-closed on all non-zero exits (only 124/125 are special-cased today).

…w/challenge hang)

The `/claude` review, challenge, and consult modes piped the entire branch diff
into `cat "$PROMPT_FILE" | claude -p` with no timeout and no size cap. On a large
diff this can hang effectively forever — there is no timeout, so the caller waits
indefinitely.

The sibling `/codex` skill already wraps its review/challenge/consult calls in a
`gtimeout`/`timeout` fallback with exit-124 hang handling. The `/claude` wrapper
never got the same treatment (still true on main).

This change wraps all four `claude -p` invocations:
- Self-contained timeout per block (`gtimeout` -> `timeout` -> unwrapped, 600s).
  Inlined per block on purpose: each bash block in a SKILL.md runs in its own
  shell, so a helper defined in another block would be undefined at call time.
- Prompt fed via `< "$PROMPT_FILE"` instead of `cat ... |`, so `timeout` wraps
  `claude` directly. Deliberately NOT `< /dev/null`, which would send an empty
  prompt and review nothing.
- On exit 124, prints that the run did NOT finish and must not be treated as a
  pass/clean result, and suggests splitting the diff.

Test: updates the gen-skill-docs assertion to require the new stdin-redirect
shape + timeout/124 markers and to reject the old unbounded pipe, so the hang
cannot silently regress.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@trunk-io
Copy link
Copy Markdown

trunk-io Bot commented Jun 7, 2026

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here

jackspacelb and others added 2 commits June 7, 2026 10:52
…changes)

The reviewer captured `git diff origin/<base>`, comparing the working tree to
the base TIP. On a branch whose base moved ahead, or whose base ref is stale or
un-fetched, this balloons to hundreds of unrelated files. That is the deeper
reason "/claude review hangs": the timeout caps the wait, but the reviewer was
still handed a giant, mostly-irrelevant diff and could never produce a useful
result.

- Diff from the merge base to the working tree:
  `git diff "$(git merge-base origin/<base> HEAD)"`. Shows only this branch's
  changes AND keeps uncommitted work. A plain three-dot diff would drop
  uncommitted changes, so merge-base -> working tree is used instead.
- Harden the base fetch: warn instead of silently diffing against a stale base.

Test asserts the merge-base scope and rejects the old base-tip diff.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jackspacelb jackspacelb changed the title Wrap nested claude -p reviewer in a hard timeout (fix /claude review/challenge hang) Fix /claude review/challenge: hard timeout + merge-base diff scope Jun 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant