fix(code-mode): tolerate js fences, harden prompt + empty-repair path#233
nolanmak wants to merge 1 commit into
Three small fixes that together address the open code-mode-failure postmortems #205-#208 (no fenced code block, repair also empty) and add a regression test for #190 (TS cast crash, fixed by PR #204).

* `extract_ts_block` now accepts `js` / `javascript` fences in addition to `ts` / `typescript`. The Deno runner transpiles either way via `data:application/typescript` module loading, so the stricter check was rescuing nothing; it only rejected cases where the model emitted the wrong fence label.
* `CODE_MODE_SYSTEM_PREFIX` now leads with an explicit "Response format — non-negotiable" section and a minimal template showing the required `async function main()` + `await main()` shape. Burying the format rule among the hard-rules list let the model drift to prose-only responses. Snapshot regenerated.
* New `code_mode_repair_tail` helper de-duplicates the repair-tail rendering across `code_mode_repair_user_message` (prompt.rs) and `Reasoner::call_code_mode_with_repair` (reasoner.rs). When the prior attempt produced no fenced block at all (the #205-#208 failure mode), the empty-prior-source branch switches to a forceful template that calls out the missing-fence failure and re-shows the minimal example. The existing `<prior_program>\n\n</prior_program>` was too quiet a signal to recover from.
* Regression test `ts_type_assertions_in_program_do_not_crash_runner` in `tests/code_mode_runner.rs` pins PR #204's fix: a program with `as unknown as EmailContext` runs cleanly. If anyone reverts the runner to plain-JS evaluation, this test goes red.

Closes #205, #206, #207, #208, #190.

## Test plan

- [x] `cargo test -p augmentagent-channel-core --lib`: 194 pass
- [x] `cargo test -p augmentagent-channel-core --test code_mode_prompt_snapshot`: 1 pass
- [x] `cargo test -p augmentagent-channel-core --test code_mode_runner`: 5 pass (incl. new #190 regression test running under real Deno sidecar)
- [x] Downstream channels build clean: `cargo build -p augmentagent-channel-email -p augmentagent-channel-slack -p augmentagent-channel-twitter`
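
For reviewers who want the fence-tolerance change in concrete terms, here is a minimal sketch of a label-tolerant extractor. It is not the actual `extract_ts_block` from this PR: the function name `extract_program_block`, the case-insensitive label match, and the "first accepted closed block wins" policy are all assumptions made for illustration.

```rust
/// Fence labels treated as code-mode programs.
/// Assumption: this list mirrors the four labels named in the PR description.
const ACCEPTED_LABELS: [&str; 4] = ["ts", "typescript", "js", "javascript"];

/// Hypothetical stand-in for `extract_ts_block`.
/// Returns the body of the first closed fenced block whose label is accepted,
/// or `None` when the response has no such block (for example, prose only).
pub fn extract_program_block(response: &str) -> Option<String> {
    // Build the three-backtick fence marker at runtime so this snippet
    // never contains a literal fence inside its own code block.
    let fence = "`".repeat(3);
    let mut lines = response.lines();

    while let Some(line) = lines.next() {
        // Only lines that open a fence are interesting.
        let Some(label) = line.trim().strip_prefix(fence.as_str()) else {
            continue;
        };
        let accepted = ACCEPTED_LABELS
            .iter()
            .any(|known| label.trim().eq_ignore_ascii_case(known));

        // Consume the block body up to the closing fence, even when the
        // label is rejected, so the body is not misread as a new opener.
        let mut body: Vec<&str> = Vec::new();
        let mut closed = false;
        for inner in lines.by_ref() {
            if inner.trim() == fence {
                closed = true;
                break;
            }
            body.push(inner);
        }

        if accepted && closed {
            return Some(body.join("\n"));
        }
    }
    None
}
```

With this shape, a response fenced as `js` yields the same program text as one fenced as `ts`, while a `python` block or a prose-only reply still yields `None` and falls through to the repair path.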
Summary
Three small fixes that together address the open `code-mode-failure` postmortems #205-#208 (`no fenced ts/typescript code block`: both the first attempt and the repair returned empty) and add a regression test for #190 (`Unexpected token ':'` from a TS cast, fixed by PR #204).

* `extract_ts_block` accepts `js`/`javascript` fences in addition to `ts`/`typescript`. The Deno runner transpiles either flavor via `data:application/typescript` module loading, so the stricter check was rescuing nothing; it was just rejecting cases where the model emitted the wrong label.
* `CODE_MODE_SYSTEM_PREFIX` now leads with a non-negotiable response-format section and a minimal template (`async function main(): Promise<void>` + `await main()`). Burying the format rule among the 7 hard rules let the model drift to prose-only responses. Snapshot regenerated.
* `code_mode_repair_tail` helper de-duplicates the repair-tail rendering across `code_mode_repair_user_message` and `Reasoner::call_code_mode_with_repair`. When the prior attempt produced no fenced block at all (the #205-#208 failure mode), the empty-prior-source branch switches to a forceful template that calls out the missing-fence failure and re-shows the template. The existing `<prior_program>\n\n</prior_program>` was too quiet a signal for the model to recover from.
* `ts_type_assertions_in_program_do_not_crash_runner` pins PR #204's fix ("fix(code-mode): run program as TS module so type casts work"): a program with `as unknown as EmailContext` runs cleanly through the real Deno sidecar.

Closes #205, #206, #207, #208, #190.
Test plan
- `cargo test -p augmentagent-channel-core --lib`: 194 pass (includes 4 new tests covering `js`/`javascript` fence tolerance + the empty-prior-source repair branch)
- `cargo test -p augmentagent-channel-core --test code_mode_prompt_snapshot`: 1 pass (regenerated)
- `cargo test -p augmentagent-channel-core --test code_mode_runner`: 5 pass, including new #190 regression under real Deno sidecar
- `cargo build -p augmentagent-channel-email -p augmentagent-channel-slack -p augmentagent-channel-twitter`
- `./target/release/augmentagent --wiki-dir ./wiki wiki ask "..."`: returned a coherent response, confirming the post-refactor `reasoner.rs` invokes the `claude` subprocess cleanly. Receipt at `.claude/agent-test-receipts/65e37733ba6f0f2817eb93fcfe671483a345cd39.txt` with honest notes on which paths could not be triggered deterministically from CLI.
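
The empty-prior-source repair branch described in the summary can be pictured roughly as follows. This is a sketch under stated assumptions, not the `code_mode_repair_tail` helper itself. The function name `repair_tail_sketch`, its signature, the wording of the messages, and the `MINIMAL_TEMPLATE` text are hypothetical. Only the `<prior_program>` wrapper and the `async function main(): Promise<void>` + `await main()` shape come from the PR text.

```rust
/// Hypothetical minimal template. The real prompt text is not shown in this PR;
/// only the `main()` + `await main()` shape is taken from the description.
const MINIMAL_TEMPLATE: &str =
    "async function main(): Promise<void> {\n  // your program here\n}\nawait main();";

/// Hypothetical stand-in for `code_mode_repair_tail`.
/// Renders the tail of a repair prompt from the prior program source and the
/// error it produced. An empty (or whitespace-only) prior source means the
/// previous reply had no fenced block, so the tail switches to a louder
/// message that re-shows the template instead of an empty wrapper.
pub fn repair_tail_sketch(prior_source: &str, error: &str) -> String {
    // Build the fence marker at runtime to keep literal fences out of this snippet.
    let fence = "`".repeat(3);

    if prior_source.trim().is_empty() {
        let mut out = String::new();
        out.push_str("Your previous reply contained no fenced code block.\n");
        out.push_str(&format!("Error: {}\n", error));
        out.push_str("Reply with exactly one fenced block shaped like this:\n");
        out.push_str(&format!("{0}ts\n{1}\n{0}\n", fence, MINIMAL_TEMPLATE));
        out
    } else {
        // Normal repair path: echo the prior program so the model can fix it.
        format!(
            "<prior_program>\n{}\n</prior_program>\nError: {}\n",
            prior_source, error
        )
    }
}
```

Keeping both branches in one helper is what lets the two call sites share the rendering; the branch on an empty prior source is the only place where the two failure modes diverge.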