fix(code-mode): tolerate js fences, harden prompt + empty-repair path #233

Open
nolanmak wants to merge 1 commit into main from fix/code-mode-failures

@nolanmak

Owner

Summary

Three small fixes that together address the open code-mode-failure postmortems #205-#208 (no fenced ts/typescript code block — both first attempt and repair returned empty) and add a regression test for #190 (Unexpected token ':' from a TS cast — fixed by PR #204).

  • extract_ts_block accepts js / javascript fences in addition to ts / typescript. The Deno runner transpiles either flavor via data:application/typescript module loading, so the stricter check was rescuing nothing — it was just rejecting cases where the model emitted the wrong label.
  • CODE_MODE_SYSTEM_PREFIX now leads with a non-negotiable response-format section and a minimal template (async function main(): Promise<void> + await main()). Burying the format rule among the 7 hard rules let the model drift to prose-only responses. Snapshot regenerated.
  • New code_mode_repair_tail helper de-duplicates the repair-tail rendering across code_mode_repair_user_message and Reasoner::call_code_mode_with_repair. When the prior attempt produced no fenced block at all (the #205-#208 failure mode), the empty-prior-source branch switches to a forceful template that calls out the missing-fence failure and re-shows the template; the existing <prior_program>\n\n</prior_program> was too quiet a signal for the model to recover from.
  • Regression test ts_type_assertions_in_program_do_not_crash_runner pins PR #204's fix: a program with as unknown as EmailContext runs cleanly through the real Deno sidecar.
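
For reviewers, the fence tolerance in the first bullet can be sketched roughly as follows. This is an illustration only, not the code in this PR: `extract_code_block`, `FENCE`, and `ACCEPTED_LABELS` are hypothetical names, and the real `extract_ts_block` may treat edge cases (indented fences, several blocks in one response) differently.

```rust
/// Three backticks, written with escapes so this sketch can sit inside a fence.
const FENCE: &str = "\u{60}\u{60}\u{60}";

/// Fence labels treated as runnable code (assumed list; `js` and `javascript`
/// are the newly tolerated ones).
const ACCEPTED_LABELS: [&str; 4] = ["ts", "typescript", "js", "javascript"];

/// Hypothetical stand-in for `extract_ts_block`: returns the body of the first
/// closed fenced block whose label is accepted, or `None` if there is none.
fn extract_code_block(response: &str) -> Option<String> {
    let mut lines = response.lines();
    while let Some(line) = lines.next() {
        // An opening fence is a line that starts with the fence marker.
        if let Some(label) = line.trim().strip_prefix(FENCE) {
            let accepted = ACCEPTED_LABELS.contains(&label.trim());
            // Collect the body up to the closing fence.
            let mut body: Vec<&str> = Vec::new();
            let mut closed = false;
            for inner in lines.by_ref() {
                if inner.trim() == FENCE {
                    closed = true;
                    break;
                }
                body.push(inner);
            }
            if accepted && closed {
                return Some(body.join("\n"));
            }
            // Otherwise skip this block and keep scanning the response.
        }
    }
    None
}
```

With this shape, a response fenced as `js` yields the same program text as one fenced as `ts`, while a prose-only response or a block with an unrelated label still yields `None` and falls through to the repair path.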

Closes #205, #206, #207, #208, #190.

Test plan

  • cargo test -p augmentagent-channel-core --lib: 194 pass (includes 4 new tests covering js/javascript fence tolerance + the empty-prior-source repair branch)
  • cargo test -p augmentagent-channel-core --test code_mode_prompt_snapshot: 1 pass (regenerated)
  • cargo test -p augmentagent-channel-core --test code_mode_runner: 5 pass, including the new #190 regression under the real Deno sidecar
  • Downstream channels build clean: cargo build -p augmentagent-channel-email -p augmentagent-channel-slack -p augmentagent-channel-twitter
  • Live reasoner exercise: ./target/release/augmentagent --wiki-dir ./wiki wiki ask "..." — returned a coherent response, confirming the post-refactor reasoner.rs invokes the claude subprocess cleanly. Receipt at .claude/agent-test-receipts/65e37733ba6f0f2817eb93fcfe671483a345cd39.txt with honest notes on which paths could not be triggered deterministically from CLI.

Three small fixes that together address the open code-mode-failure
postmortems #205-#208 (no fenced code block, repair also empty) and
add a regression test for #190 (TS cast crash, fixed by PR #204).

* `extract_ts_block` now accepts `js` / `javascript` fences in addition
  to `ts` / `typescript`. The Deno runner transpiles either way via
  `data:application/typescript` module loading, so the stricter check
  was rescuing nothing; it only rejected cases where the model emitted
  the wrong fence label.

* `CODE_MODE_SYSTEM_PREFIX` now leads with an explicit
  "Response format — non-negotiable" section and a minimal template
  showing the required `async function main()` + `await main()` shape.
  Burying the format rule among the hard-rules list let the model
  drift to prose-only responses. Snapshot regenerated.

* New `code_mode_repair_tail` helper de-duplicates the repair-tail
  rendering across `code_mode_repair_user_message` (prompt.rs) and
  `Reasoner::call_code_mode_with_repair` (reasoner.rs). When the prior
  attempt produced no fenced block at all (#205-#208's failure mode),
  the empty-prior-source branch switches to a forceful template that
  calls out the missing-fence failure and re-shows the minimal example
  — the existing "<prior_program>\n\n</prior_program>" was too quiet
  a signal to recover from.

* Regression test `ts_type_assertions_in_program_do_not_crash_runner`
  in `tests/code_mode_runner.rs` pins PR #204's fix: a program with
  `as unknown as EmailContext` runs cleanly. If anyone reverts the
  runner to plain-JS evaluation, this test goes red.
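
The empty-prior-source branch described above might look something like this. It is a sketch under assumptions: `repair_tail` is a hypothetical stand-in for `code_mode_repair_tail`, the `<error>` wrapper and all prompt wording are invented for illustration, and only the `<prior_program>` tags and the `main()` template shape come from this description.

```rust
/// Hypothetical stand-in for `code_mode_repair_tail`. The wording of both
/// branches is illustrative, not the prompt text shipped in this PR.
fn repair_tail(prior_source: &str, error: &str) -> String {
    // Three backticks, written with escapes so this sketch can sit inside a fence.
    let fence = "\u{60}\u{60}\u{60}";
    if prior_source.trim().is_empty() {
        // No fenced block was extracted from the prior attempt: name the
        // failure explicitly and re-show the minimal template.
        format!(
            "Your previous response contained no fenced code block.\n\
             Reply with exactly one fenced block shaped like this:\n\
             {fence}ts\nasync function main(): Promise<void> {{\n  // ...\n}}\nawait main();\n{fence}\n"
        )
    } else {
        // A program exists: show it back alongside the error it produced.
        // The <error> wrapper is an assumption made for this sketch.
        format!("<prior_program>\n{prior_source}\n</prior_program>\n<error>\n{error}\n</error>\n")
    }
}
```

Keeping both branches in one helper is what lets `prompt.rs` and `reasoner.rs` render the same tail without drifting apart.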

Closes #205, #206, #207, #208, #190.

## Test plan

- [x] `cargo test -p augmentagent-channel-core --lib` — 194 pass
- [x] `cargo test -p augmentagent-channel-core --test code_mode_prompt_snapshot` — 1 pass
- [x] `cargo test -p augmentagent-channel-core --test code_mode_runner` — 5 pass (incl. new #190 regression test running under real Deno sidecar)
- [x] Downstream channels build clean: `cargo build -p augmentagent-channel-email -p augmentagent-channel-slack -p augmentagent-channel-twitter`
@coderabbitai

coderabbitai Bot commented May 28, 2026


Warning

Review limit reached

@nolanmak, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 36 minutes and 5 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5d2dedf1-bb6f-4c82-808f-2cad5ee7029a

📥 Commits

Reviewing files that changed from the base of the PR and between 3306d64 and 65e3773.

📒 Files selected for processing (5)
  • crates/augmentagent-channel-core/src/code_mode/mod.rs
  • crates/augmentagent-channel-core/src/prompt.rs
  • crates/augmentagent-channel-core/src/reasoner.rs
  • crates/augmentagent-channel-core/tests/code_mode_runner.rs
  • crates/augmentagent-channel-core/tests/snapshots/code_mode_system_v1.txt
