Replay 2906 commits from private repo history by jiaminc-cmu · Pull Request #549 · promptdriven/pdd

jiaminc-cmu · 2026-02-21T05:25:27Z

Summary

Replayed 2,906 filtered commits from the private repo onto this branch
Only includes files matching the shared section of .sync-config.yml
Original author, date, and commit messages are preserved
Private-only files (architecture.json, _python.prompt in pdd/prompts/, etc.) were excluded

What's included

pdd/*.py, pdd/commands/*.py, pdd/core/*.py, pdd/server/**/*.py
tests/**/*.py
prompts/*_LLM.prompt (rewritten to pdd/prompts/*_LLM.prompt)
examples/, context/**/*_example.py
Root configs (README, requirements.txt, pyproject.toml, Makefile, etc.)
docs/, utils/vscode_prompt/

What's excluded

architecture.json
pdd/prompts/*_python.prompt (cap-only)
.github/workflows/ (not synced)
All non-shared internal files

Verification

Leak check passed — no private files introduced by replay
Review final file state matches expected public content
Verify commit authorship: git log --format="%an <%ae> %ad %s" | head -20

🤖 Generated with Claude Code

- Fix format string injection: Escape curly braces in LLM outputs before storing in context to prevent KeyError when subsequent prompts contain {placeholders} from code/error analysis
- Fix silent error: Print KeyError messages to console before returning
- Fix resume message: Calculate actual start step (5.5) before displaying resume message instead of showing incorrect "step 6"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Add 5 new tests covering:
- Format string injection: Verify curly braces in LLM outputs don't cause KeyError
- Restored context escaping: Verify curly braces in resumed state are escaped
- Error console output: Verify KeyError messages are printed to console
- Resume message for step 5.5: Verify correct step shown when resuming after step 5
- Resume message for step 6: Verify correct step shown when resuming after step 5.5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Cherry-picked changes from PR #267:
- Add optional interactive steering to sync command
- Fix sync animation for horizontal terminal resizes
- Add --no-steer and --steer-timeout options
- Add sync_tui tests and example

Note: Excluded pdd/prompts/* (symlink) and project_dependencies.csv

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

## Non-Python Sync Fixes
- Skip test_extend for non-Python languages since code coverage tooling is Python-specific
- Fix sync returning success without generating tests for non-Python modules
- Added check for test file existence before accepting workflow as complete
- The synthetic RunReport from crash/verify was incorrectly triggering "all_synced"
- Add safety checks in sync_orchestration.py and pin_example_hack.py

## Frontend File Detection Fixes
- Support new .pddrc `outputs.code.path` format (Issue #237)
- Previously only looked for legacy `generate_output_path`
- Now checks `outputs.code.path`, `outputs.test.path`, `outputs.example.path` first
- Add .test.ts/.spec.ts patterns for Jest/TypeScript test file detection
- Fixes detection of files like `test_prisma_schema.test.ts`
- Rebuild frontend with updated file detection logic

## Architecture Generation Fixes
- Add valid language suffixes guidance to prevent LLM using invalid suffixes like `_NextJS`
- Escape curly braces in architecture_json.prompt template to prevent .format() errors
- Add preprocessing in orchestrator before .format() calls

## Path Resolution Fixes
- Add `typescriptreact` -> `.tsx`, `javascriptreact` -> `.jsx`, `prisma` -> `.prisma` mappings
- Ensure example and test paths always have fallback defaults

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

## Workflow Changes
Reorganized the agentic architecture workflow for better modularity:
- Step 1-6: Unchanged (analyze PRD, decompose, research, design, deps, generate)
- Step 7: NEW - Generate .pddrc configuration file
- Step 8: Renamed from step 7 - Generate individual prompts
- Step 9: Renamed from step 8 - Completeness validation
- Step 10: Renamed from step 9 - Sync prompts with architecture
- Step 11: NEW - Dependency resolution
- Step 12: Renamed from step 11 - Fix validation errors

## New Files
- `prompts/agentic_arch_step7_pddrc_LLM.prompt` - .pddrc generation step
- `prompts/agentic_arch_step11_deps_LLM.prompt` - Dependency resolution step
- `prompts/agentic_arch_step12_fix_LLM.prompt` - Enhanced fix step
- `pdd/templates/architecture/example_nextjs_task_notes.prompt` - Example for Next.js projects
- `pdd/templates/architecture/pdd_path_construction_guide.prompt` - Path construction reference

## Template Fixes
- Escape curly braces in docs/prompting_guide.md to prevent .format() errors
- Change {PLACEHOLDER} to [PLACEHOLDER] in generate_prompt.prompt to avoid confusion
- Updated step count references in step 1 and 2 prompts (8 -> 11)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

## Task Queue Panel Improvements
- Make task queue panel draggable by adding drag handle
- Save/restore panel position to localStorage
- Add reset position button to return to default (top-right corner)
- Keep panel within viewport bounds on window resize
- Separate collapse toggle from drag handle for better UX

## Generate Command
- Add --skip-prompts flag to skip prompt generation in agentic architecture mode
- Prompts are generated by default; flag allows skipping when not needed

## Logging
- Change generate_output_paths logging from INFO to DEBUG level
- Reduces noise since paths may be overridden by outputs.code.path config

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- test_agentic_architecture_orchestrator: Update all tests to reflect the new 11-step workflow (steps 1-8 linear + steps 9-11 validation)
- test_sync_determine_operation: Fix test_decision_test_on_low_coverage to create actual test file (required after test file existence check)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Update README and TUTORIALS.md to reflect the new 11-step agentic architecture workflow:
- Steps 1-8: Analysis & generation (architecture.json, .pddrc, prompts)
- Steps 9-11: Validation (completeness, sync, dependencies)

Also document the new --skip-prompts option for faster architecture-only generation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- test_commands_generate.py: Add skip_prompts=False to expected call
- code_generator_main.py: Handle both {{PLACEHOLDER}} (YAML-escaped) and {PLACEHOLDER} (single brace) in post_process_args substitution

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…ck.acquire()

This commit adds comprehensive unit and E2E tests that detect the file handle resource leak in SyncLock.acquire() when non-IOError/OSError exceptions occur.

Tests added:
- Unit tests in test_sync_determine_operation.py (6 tests)
  - KeyboardInterrupt during lock acquisition
  - RuntimeError during lock acquisition
  - Exception during file operations
  - IOError/OSError regression tests
  - Normal operation regression tests
  - Context manager exception handling
- E2E tests in test_e2e_issue_403_file_handle_leak.py (4 tests)
  - Real-world KeyboardInterrupt scenario (Ctrl+C)
  - RuntimeError leak detection
  - Normal operation verification
  - Context manager interrupt handling

All tests correctly fail on current code, detecting the bug where file descriptors remain open when exceptions other than IOError/OSError occur during lock acquisition.

Related to #403

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
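
The leak described above can be illustrated with a minimal sketch. `FileLock`, `lock_fn`, and `interrupted` are hypothetical names invented for this example; this is not the project's `SyncLock` code, only one plausible way to close a handle on every failure path rather than only on IOError/OSError.

```python
import tempfile
from pathlib import Path
from typing import IO, Callable, List, Optional


class FileLock:
    """Hypothetical lock wrapper for illustration; not the project's SyncLock."""

    def __init__(self, path: Path, lock_fn: Callable[[IO[str]], None]) -> None:
        self.path = path
        self.lock_fn = lock_fn  # stand-in for the real locking call
        self.handle: Optional[IO[str]] = None

    def acquire(self) -> None:
        handle = open(self.path, "w")
        try:
            self.lock_fn(handle)
        except BaseException:
            # BaseException also covers KeyboardInterrupt and RuntimeError,
            # so the handle is closed on every failure path before re-raising.
            handle.close()
            raise
        self.handle = handle

    def release(self) -> None:
        if self.handle is not None:
            self.handle.close()
            self.handle = None


# Simulate Ctrl+C arriving while the lock is being taken.
seen: List[IO[str]] = []


def interrupted(handle: IO[str]) -> None:
    seen.append(handle)
    raise KeyboardInterrupt


with tempfile.TemporaryDirectory() as tmp:
    lock = FileLock(Path(tmp) / "demo.lock", interrupted)
    try:
        lock.acquire()
    except KeyboardInterrupt:
        pass
    handle_closed = seen[0].closed
```

Catching `BaseException` and re-raising keeps the interrupt visible to the caller while still releasing the file handle.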

- Revert unnecessary code change to handle double braces in code_generator_main.py
- Update test_code_generator_main.py to normalize {{PLACEHOLDER}} to {PLACEHOLDER} when reading the template for testing purposes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Fix mock_httpx_client fixture to properly mock async context manager using AsyncMock for __aenter__ and __aexit__
- Fix test_first_heartbeat_sent_immediately to avoid orphan coroutines by using side_effect instead of reassigning the mock
- Fix test_heartbeat_refreshes_token_on_401 to use side_effect pattern
- Fix test_heartbeat_only_refreshes_once_per_cycle to use return_value

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
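
As a rough illustration of the mocking pattern named above, the sketch below builds a mock client whose `__aenter__` and `__aexit__` are `AsyncMock` objects and whose responses are sequenced with `side_effect`. The names `make_mock_client`, `send_twice`, and the `/heartbeat` path are assumptions for this example, not the project's fixture.

```python
import asyncio
from typing import List
from unittest.mock import AsyncMock, MagicMock


def make_mock_client() -> MagicMock:
    """Build a stand-in for an async HTTP client usable with `async with`."""
    client = MagicMock()
    # Async context manager protocol: entering yields the client itself.
    client.__aenter__ = AsyncMock(return_value=client)
    client.__aexit__ = AsyncMock(return_value=False)
    # side_effect hands out one response per call, in order.
    client.post = AsyncMock(
        side_effect=[MagicMock(status_code=401), MagicMock(status_code=200)]
    )
    return client


async def send_twice(client: MagicMock) -> List[int]:
    async with client as session:
        first = await session.post("/heartbeat")
        second = await session.post("/heartbeat")
    return [first.status_code, second.status_code]


statuses = asyncio.run(send_twice(make_mock_client()))
```

Sequencing responses through `side_effect` on a single `AsyncMock` avoids creating coroutines that are never awaited, which is the failure mode the commit message attributes to reassigning the mock.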

…gration This update improves the verbose logging setup by allowing the LiteLLM library to toggle its debug output based on the verbose flag or environment variable. It ensures that logging levels are appropriately set for production and development modes, and adds error handling for potential attribute access issues in LiteLLM.

- Add batch detection using Union-Find algorithm to group modules by dependency
- Each batch is a connected component in the dependency graph
- Modules within a batch sync sequentially (by priority), different batches are independent
- Add BatchFilterDropdown component with expandable module list view
- Add batch color stripe indicator on graph nodes
- Add SyncOptionsModal for configuring sync options before execution
- Various UI improvements to PromptSelector, PromptSpace, and constants

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
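
The batch grouping described above can be sketched with a small Union-Find. This is an illustrative Python version under assumed inputs (a list of module names and a list of dependency pairs); the function name `find_batches` is hypothetical, and ordering within a batch is alphabetical here rather than by priority.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_batches(
    modules: List[str], dependencies: List[Tuple[str, str]]
) -> List[List[str]]:
    """Group modules into connected components of the dependency graph."""
    parent: Dict[str, str] = {m: m for m in modules}

    def find(x: str) -> str:
        # Walk to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # The direction of a dependency does not matter for grouping.
    for a, b in dependencies:
        union(a, b)

    groups: Dict[str, List[str]] = defaultdict(list)
    for m in modules:
        groups[find(m)].append(m)
    return sorted(sorted(group) for group in groups.values())


batches = find_batches(
    ["api", "auth", "db", "cli", "docs"],
    [("api", "auth"), ("api", "db")],
)
```

Modules with no dependency edges end up in single-element batches, so they can be synced independently of everything else.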

- Add `agentic_mode` parameter to sync_orchestration for Python agentic path
- Change cmd_test_main return type from 3-tuple to 4-tuple with agentic_success flag
- Add run_agentic_test_generate 4-tuple return with success boolean
- Add _use_agentic_path() helper to determine agentic behavior
- Add _create_synthetic_run_report_for_agentic_success() for non-Python languages
- Fix sync_determine_operation to differentiate synthetic vs real run reports using test_hash
- Use sentinel value "agentic_test_success" when agent succeeds but file is at different path
- Trust agentic_success flag for non-Python test generation instead of file existence check
- Update prompts and examples to reflect API changes

Fixes issue where sync reported failure despite successful agentic test generation for non-Python languages (CSS, TypeScript, etc.) where test files may be created at different paths or with different extensions than expected.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…upport

Add agentic_mode parameter throughout the sync workflow:
- commands/maintenance.py: Add --agentic CLI flag to sync command
- sync_main.py: Pass agentic_mode parameter to sync_orchestration
- prompts/sync_main_python.prompt: Document agentic_mode parameter

Update agentic_test_generate return signature:
- prompts/agentic_test_generate_python.prompt: Update return type from tuple[str, float, str] to tuple[str, float, str, bool] to include success boolean, matching actual implementation

Fix cloud timeout handling:
- fix_verification_errors_loop.py: Use get_cloud_timeout() function instead of hardcoded CLOUD_REQUEST_TIMEOUT constant

Improve server job failure detection:
- server/jobs.py: Check stdout for sync failure indicators even when exit code is 0, since sync may return 0 but report failure in output

Extend language extension mappings:
- server/routes/files.py: Add HTML, CSS, and Makefile extensions; include "Dockerfile" without extension prefix

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

This commit adds comprehensive test coverage for the bug where commits created by LLM agents during Step 1 of the agentic E2E fix workflow are not pushed to the remote repository when the workflow exits early at Step 2.

Test files:
- tests/test_e2e_issue_419_unpushed_commits.py: Unit tests for _commit_and_push()
- tests/test_e2e_issue_419_cli_unpushed_commits.py: E2E integration test

The tests verify the expected behavior documented in CHANGELOG v0.0.121: "pdd fix now automatically commits and pushes changes after successful completion"

These tests fail on the current buggy code and will pass once the fix is implemented in pdd/agentic_e2e_fix_orchestrator.py lines 237-238.

Related to #419

Reverts 53a9caa and 38d3ab3. The calculate_prompt_hash() fix is correct at the fingerprint-calculation layer but incomplete end-to-end: pdd sync's insert-includes step strips <include> tags from the original .prompt file, so subsequent syncs cannot detect include dependency changes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…o-deps)

- Skip auto-deps in agentic mode (prompts already have explicit dependencies)
- Add 30s client-side timeout to Firecrawl scrape_url via ThreadPoolExecutor (works around SDK bug where timeout ms is passed to requests as seconds)
- Update Firecrawl method from scrape() to scrape_url() for current SDK
- Add 30s timeout to git ls-files subprocess call
- Add label parameter to _run_with_provider for future heartbeat support

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
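
A client-side timeout of the kind mentioned above can be sketched with the standard library alone. `call_with_timeout` and `slow_scrape` are hypothetical names, and no Firecrawl API is used here; the slow function only stands in for a blocking network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable


def call_with_timeout(
    fn: Callable[..., Any], *args: Any, timeout_s: float = 30.0, **kwargs: Any
) -> Any:
    """Run fn in a worker thread and stop waiting after timeout_s seconds."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block on the worker; a timed-out call keeps running
        # in the background because threads cannot be force-stopped.
        executor.shutdown(wait=False)


def slow_scrape(url: str) -> str:
    """Stand-in for a blocking call that ignores its own timeout."""
    time.sleep(0.3)
    return f"content of {url}"


try:
    call_with_timeout(slow_scrape, "https://example.com", timeout_s=0.05)
    timed_out = False
except FutureTimeout:
    timed_out = True

fast_result = call_with_timeout(lambda: "ok", timeout_s=1.0)
```

The caller regains control after the deadline, but the underlying call is abandoned rather than cancelled, which is a limitation of this approach worth keeping in mind.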

… auth When Claude Code runs on subscription (not API key), total_cost_usd is absent from JSON output. Add _calculate_anthropic_cost() with three-tier fallback: (1) modelUsage per-model costUSD, (2) token-based estimation from usage field with model-family-aware pricing (Opus/Sonnet/Haiku), (3) zero. This matches the existing pattern used for Gemini and Codex providers which always estimate from token counts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
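
The fallback order described above might look roughly like the following. This is a sketch only: `estimate_cost` is a hypothetical name, the token field names `input_tokens` and `output_tokens` and the `model` key are assumptions, and the prices are made-up placeholders rather than real rates.

```python
from typing import Any, Dict, Tuple

# Made-up (input, output) USD prices per million tokens, for illustration only.
PLACEHOLDER_PRICES: Dict[str, Tuple[float, float]] = {
    "opus": (10.0, 50.0),
    "sonnet": (2.0, 10.0),
    "haiku": (1.0, 5.0),
}


def estimate_cost(result: Dict[str, Any]) -> float:
    """Return a cost for a parsed JSON result, falling back tier by tier."""
    if "total_cost_usd" in result:
        return float(result["total_cost_usd"])

    # Tier 1: sum per-model costUSD values when present.
    model_usage = result.get("modelUsage") or {}
    per_model = [
        entry["costUSD"]
        for entry in model_usage.values()
        if isinstance(entry, dict) and "costUSD" in entry
    ]
    if per_model:
        return float(sum(per_model))

    # Tier 2: estimate from token counts using the model family's price.
    usage = result.get("usage") or {}
    model = str(result.get("model", "")).lower()
    for family, (in_price, out_price) in PLACEHOLDER_PRICES.items():
        if usage and family in model:
            tokens_in = usage.get("input_tokens", 0)
            tokens_out = usage.get("output_tokens", 0)
            return (tokens_in * in_price + tokens_out * out_price) / 1_000_000

    # Tier 3: nothing usable was reported.
    return 0.0


tier1 = estimate_cost({"modelUsage": {"some-model": {"costUSD": 0.5}}})
tier2 = estimate_cost(
    {"model": "claude-sonnet-x", "usage": {"input_tokens": 1_000_000, "output_tokens": 0}}
)
tier3 = estimate_cost({})
```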

- Fix result[-2] indexing bug in sync_orchestration.py and pin_example_hack.py that caused $0.00 cost for agentic test generation. For 4-tuple returns (content, cost, model, success), result[-2] gave model (str) not cost (float). Changed to result[1] which is always the cost index.
- Increase MODULE_TIMEOUT from 900s (15 min) to 1800s (30 min) in agentic_sync_runner.py. Complex modules (e.g. TypeScriptReact with <web> tags) need generate+crash+verify+test which can take 20+ min total.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
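
The indexing bug above is easy to reproduce in isolation. The tuples below are invented sample values that only mirror the (content, cost, model) and (content, cost, model, success) shapes named in the commit message.

```python
# Sample return values in the two shapes described above.
three_tuple = ("generated tests", 0.12, "model-a")
four_tuple = ("generated tests", 0.12, "model-a", True)

# Negative indexing shifts when an element is appended to the tuple.
cost_from_three = three_tuple[-2]    # second-to-last is the cost
wrong_from_four = four_tuple[-2]     # second-to-last is now the model name

# A positive index stays on the cost in both shapes.
cost_by_position = (three_tuple[1], four_tuple[1])
```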

- Remove unused signal/threading imports from agentic_common.py
- Guard against UnboundLocalError in _save_state if mkstemp fails
- Clean up temp cost_file on Popen failure in _sync_one_module

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Introduces user story tests as a first-class PDD feature:
- `pdd/user_story_tests.py`: core validation logic — discover story files, run detect_change against each story, and apply fixes via change_main
- `pdd detect --stories`: new mode that validates all user_stories/story__*.md files against current prompts (pass = no changes needed)
- `pdd fix user_stories/story__<name>.md`: auto-detects affected prompts, applies changes, then re-validates
- `pdd change`: auto-validates user stories post-change before finalizing, respecting `skip_user_stories` context flag to prevent recursion
- `user_stories/story__template.md`: standard story template
- Full test coverage (89 tests across 4 test files)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Set mock_proc.pid = 99999 in _make_mock_popen so that the timeout test calls os.killpg(99999, SIGTERM) instead of resolving MagicMock.__index__() to 1, which was sending SIGTERM to process group 1 and killing the entire pytest-xdist worker mid-run, causing CI to fail at 26% every time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
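
The behavior behind this fix can be checked without sending any signal. The sketch below uses `operator.index` as a safe stand-in for the integer conversion a call like `os.killpg` performs on its argument; the variable names are illustrative only.

```python
import operator
from unittest.mock import MagicMock

proc = MagicMock()

# An unset attribute is itself a MagicMock, whose default __index__ returns 1.
implicit_pid = operator.index(proc.pid)

# Assigning a concrete integer makes the value explicit and harmless.
proc.pid = 99999
explicit_pid = operator.index(proc.pid)
```

Giving the mock an explicit, unlikely process ID means any code that treats it as an integer no longer targets process group 1.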

…hestration runs All 5 runs used vertex_ai/claude-opus-4-6 (context-1m-2025-08-07 beta working), achieving 100% test pass rate (108/108) and ref_sim=0.823 ± 0.031. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…architecture improvements (#482)

* feat: Add iterative fix-verify loop (steps 3-7) to checkup orchestrator

Steps 3-7 (build, interfaces, test, fix, verify) now run in a while loop (max 3 iterations) instead of once. If step 7 finds remaining failures, the workflow loops back to re-check/fix until "All Issues Fixed" or max iterations. Worktree is created before the loop; step 8 runs after. Prompts updated with iteration awareness, previous_fixes context, e2e test instructions, and "All Issues Fixed" exit signal. 59 tests (12 new).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Copy uncommitted/untracked files into worktree on creation

The worktree is created from HEAD, which only contains committed files. If the user has uncommitted or untracked files (e.g. new CRM modules), the worktree would be missing them, causing steps 3-7 to see different files than steps 1-2 analyzed.

Now _setup_worktree calls _copy_uncommitted_changes which:
1. Applies uncommitted tracked changes via git diff HEAD | git apply
2. Copies untracked files (excluding .pdd/) into the worktree

Both operations are best-effort — failures are logged but don't block. Added 7 tests for the new behavior. Reverted prompt workaround.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Split step 6 into sub-steps, fix resume bugs, and fix iteration display

- Split monolithic step 6 into 6.1 (fix), 6.2 (regression tests), 6.3 (e2e tests) with separate prompts and 600s timeouts each
- Bug A fix: save worktree state immediately after creation so Ctrl+C during step 3 can resume without recreating
- Bug B fix: detect between-iterations resume (start_step > 7 with fix_verify_iteration > 0) and restart at step 3 with incremented iteration
- Fix iteration number always showing "1" in GitHub comments: add iteration suffix to steps 3-5 comment headers, add explicit instruction to all loop step prompts to use exact iteration number
- Fix total step count: "X of 7" -> "X of 8" across all prompts
- Add STEP_ORDER constant and _next_step() helper for fractional step arithmetic
- Add checkup command, agentic_checkup module, and comprehensive tests (108 total)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Add frontend integration checks to checkup step 4 and step 6.1

Step 4 (Interface Check) now audits:
- Frontend navigation reachability: detect orphan pages with no nav link
- Frontend→Backend API call consistency: detect pages using different URL patterns than the rest of the codebase (e.g. relative vs base URL)

Step 6.1 (Fix) now handles:
- Adding missing nav links for orphan pages
- Updating inconsistent API call patterns to match codebase convention

Found via QA on the CRM app where the page existed but had no sidebar link and used relative `/adminCrmActions` instead of the standard `${NEXT_PUBLIC_API_BASE_URL}/adminCrmActions` pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Increase step 7 (verify) timeout from 600s to 1200s

Step 7 re-runs the full test suite to verify fixes, which can exceed 10 minutes on larger projects (e.g. 4600+ tests). The 600s timeout caused step 7 to time out after posting its GitHub comment but before returning, leaving state stuck at step 6.3 and causing infinite resume loops.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Improve architecture generation with Strategy B support, register checkup command, and add gh timeout

- Add Strategy B (template-based group contexts) support to arch steps 7, 8, 10, 12
- Add pdd_path_construction_guide Strategy B documentation
- Add example_python_backend.prompt template
- Register checkup command in CLI
- Add timeout parameter to _run_gh_command()
- Update test durations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Add PDD prompts, examples, and architecture entries for checkup modules

- Add agentic_checkup_python.prompt and agentic_checkup_orchestrator_python.prompt
- Add context/agentic_checkup_example.py and context/agentic_checkup_orchestrator_example.py
- Register checkup and orchestrator in architecture.json (priority 217, 218)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: Add README documentation for pdd checkup and pdd sync URL mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: trigger Cloud Build CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Greg Tanaka <glt@alumni.caltech.edu>
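
Fractional step numbers such as 6.1 make "current step plus one" unreliable, which is presumably why an ordered list and a lookup helper were added. The sketch below assumes a step list inferred from the description above; the real `STEP_ORDER` contents and the `_next_step()` signature are not shown in this PR, so `next_step` here is a hypothetical equivalent.

```python
from typing import List, Optional, Union

Step = Union[int, float]

# Assumed ordering: eight steps, with step 6 split into 6.1, 6.2 and 6.3.
STEP_ORDER: List[Step] = [1, 2, 3, 4, 5, 6.1, 6.2, 6.3, 7, 8]


def next_step(current: Step) -> Optional[Step]:
    """Return the step that follows current, or None after the last step."""
    position = STEP_ORDER.index(current)
    if position + 1 < len(STEP_ORDER):
        return STEP_ORDER[position + 1]
    return None


after_five = next_step(5)
after_last_substep = next_step(6.3)
after_final = next_step(8)
```

Looking up the position in a list sidesteps floating-point arithmetic such as adding 0.1 to a step number.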

Copilot

Copilot wasn't able to review any files in this pull request.

…c orchestrators (promptdriven#549) (promptdriven#565)

* fix: replace .format(**context) with safe str.replace() in all agentic orchestrators (promptdriven#549)

Root cause: orchestrators called prompt_template.format(**context) to substitute step outputs into prompt templates. When a step's LLM output contained JSON curly braces (e.g. {"error": "Insufficient role"}), the format call either raised KeyError or, when values were pre-escaped with .replace("{", "{{"), inserted doubled braces {{...}} into the LLM prompt instead of the original single-brace JSON.

Fix: replace .format(**context) with iterative str.replace() — the same safe pattern used in template_expander.py:155-156. This substitutes each context key literally without interpreting curly braces in values. The remaining {{ }} from preprocess(double_curly_brackets=True) are then un-doubled with a final .replace("{{", "{").replace("}}", "}") call. Value pre-escaping (.replace("{", "{{") in context storage) is also removed as it is no longer needed.

Files changed:
- pdd/agentic_bug_orchestrator.py
- pdd/agentic_change_orchestrator.py
- pdd/agentic_checkup_orchestrator.py
- pdd/agentic_test_orchestrator.py
- pdd/agentic_e2e_fix_orchestrator.py
- pdd/agentic_architecture_orchestrator.py

Tests: 22 passing unit tests across both test files verify:
- JSON output from step N appears as single braces in step N+1 prompt
- Nested/multiple JSON blocks are preserved
- Unknown placeholders left intact (no KeyError)
- Structural assertions confirm the buggy patterns are removed from source

Fixes promptdriven#549

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address copilot review comments on PR promptdriven#565

- Remove unused `ast` import from test_e2e_issue_549_format_double_escaping.py
- Remove unused `re` and `MagicMock` imports from test_e2e_issue_549_other_orchestrators.py
- Remove dead no-op `if` block in agentic_checkup_orchestrator.py that claimed to set a dotted alias for integer step keys but wrote to the same dict key as the line above it

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
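
The difference between the two substitution styles can be seen in a few lines. This is a simplified sketch, not the orchestrator code: `render` is a hypothetical helper, the template text is invented, and the final un-doubling step mentioned in the commit message is left out.

```python
from typing import Dict


def render(template: str, context: Dict[str, str]) -> str:
    """Replace each {key} placeholder literally, without str.format()."""
    rendered = template
    for key, value in context.items():
        rendered = rendered.replace("{" + key + "}", value)
    return rendered


template = "Step 1 reported:\n{step1_output}\nStill open: {not_in_context}"
context = {"step1_output": '{"error": "Insufficient role"}'}

# Literal replacement keeps the JSON braces and ignores unknown placeholders.
safe = render(template, context)

# str.format() fails on the placeholder that has no value in the context.
try:
    template.format(**context)
    format_failed = False
except KeyError:
    format_failed = True

# Pre-escaping the value does not help: format() inserts it verbatim,
# so the doubled braces reach the prompt unchanged.
pre_escaped = context["step1_output"].replace("{", "{{").replace("}", "}}")
doubled = "Step 1 reported:\n{step1_output}".format(step1_output=pre_escaped)
```

Because `str.replace()` never parses the value, whatever braces the earlier step produced pass through untouched.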

* test: Add failing tests for E2E timeout retry bug promptdriven#791

- 8 unit tests in test_agentic_e2e_fix_orchestrator.py (2 pass prompt checks, 6 fail detecting missing behavior)
- 3 E2E tests in test_e2e_issue_791_e2e_timeout_retry.py (all fail detecting the bug)
- Prompt fix adding E2E pre-flight check and cross-cycle memory requirements

Root causes:
1. No environment pre-flight check before Step 2 E2E tests (line 660-726)
2. step_outputs cleared between cycles destroying failure memory (line 857-859)

Fixes promptdriven#791

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: PDD bug changes for promptdriven#791

* fix: pdd fix: E2E test step times out on every cycle, wasting cost and time

Fixes promptdriven#791

* fix: PDD fix changes for promptdriven#791

* fix: persist skipped_steps to state and remove artifact files

- Save skipped_steps in state_data so it survives resume across sessions
- Load skipped_steps from state on resume (with JSON string-to-int key conversion)
- Include skipped_steps in KeyboardInterrupt and Exception state saves
- Remove artifact files: error_output_791.txt, test_errors_791.txt, and agentic_e2e_fix_orchestrator_test_agentic_e2e_fix_orchestrator_fixed.py

Fixes promptdriven#791

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add _check_e2e_environment mock to issue promptdriven#419 tests

The new E2E environment preflight check skips Step 2 when no playwright config exists. Existing tests that expect Step 2 to dispatch to the LLM agent need to mock _check_e2e_environment to return (True, "").

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add _check_e2e_environment mock to issue promptdriven#545 tests

Same fix as promptdriven#419 tests: mock the E2E environment preflight check so Step 2 dispatches to the LLM mock instead of being skipped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add _check_e2e_environment mock to e2e_fix_deps fixture in promptdriven#549 tests

Without this mock, _check_e2e_environment skips Step 2 (no playwright config in tmp_path), so the cycle1_step2 assertion fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add _check_e2e_environment mock to promptdriven#468, promptdriven#467, promptdriven#357 test fixtures

Same pattern as previous fixes — without this mock, _check_e2e_environment skips Step 2 (no playwright in tmp_path), breaking step execution assertions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — narrow skip trigger to timeouts, remove last_completed_step advance, add check=True

- Narrow Step 2 skip to timeout-specific errors only (not transient provider outages like rate limits)
- Remove contradictory last_completed_step = step_num in skip path (skipped_steps dict already handles cross-cycle memory)
- Add check=True to git subprocess calls in test_issue_791 fixture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: PDD Bot <pdd-bot@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Greg Tanaka <glt@alumni.caltech.edu>
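
The string-to-int key conversion mentioned for `skipped_steps` follows from how JSON stores object keys. A minimal sketch, assuming the dict maps step numbers to a reason string (the value type is a guess made for this example):

```python
import json
from typing import Dict

# In-memory form: integer step numbers as keys.
skipped_steps: Dict[int, str] = {2: "E2E step timed out"}

# JSON object keys are always strings, so the integer key is written as "2".
state_json = json.dumps({"skipped_steps": skipped_steps})
loaded = json.loads(state_json)["skipped_steps"]

# Convert keys back to integers on resume so lookups by step number still work.
restored = {int(step): reason for step, reason in loaded.items()}
```

Without the conversion, a check such as `2 in loaded` would be false after a resume even though the step was recorded as skipped.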

Run pdd update + pdd example on agentic_e2e_fix_orchestrator, agentic_e2e_fix, commands/fix, and agentic_common to capture accumulated bug fixes (promptdriven#338, promptdriven#468, promptdriven#549, promptdriven#791, #830) into prompts and refresh few-shot examples before re-running pdd change on #822. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gltanaka and others added 30 commits January 26, 2026 12:32

feat: add a new hello example and suppress SyntaxWarning during `ast.…

36d5fb2

…parse` in `unfinished_prompt.py`.

Revert "feat: Add steerable option to pdd sync + fix TUI resize issues"

74a9358

This reverts commit 351813a.

fix: Bug: File Handle Resource Leak in SyncLock.acquire()

5f3b7bc

Fixes #403

docs: Clarify core_dumps comment in fix prompt

a9c1975

Add explanation that core_dump files are created when PDD runs crash or hit internal errors, per Copilot review suggestion. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

bump: version 0.0.130 → 0.0.131

94dacfa

feat: Add 3Blue1Brown style PDD intro script and update changelog wit…

539966a

…h expanded agentic architecture, various bug fixes, and refactors.

chore(debug): suppress LiteLLM debug messages

fb4afe5

Revert "Add failing tests for issue #403: File Handle Resource Leak i…

ddfa115

…n SyncLock.acquire()" This reverts commit 400e8ea.

test(sync_main): add agentic_mode parameter to expected calls

f48b72c

Update test_sync_dry_run_mode to expect the new agentic_mode=False parameter in sync_orchestration calls. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

bump: version 0.0.131 → 0.0.132

964b168

feat: Introduce Qwen3-TTS 12Hz 1.7B CustomVoice model and its tokeniz…

caa5a68

…er for the 3blue1brown demo, along with related dependency and changelog updates.

chore: Update gitignore to exclude large model weights and add a log …

66c197a

…of a test fix attempt.

fix: Bug: Agentic fix doesn't push commits when exiting early at Step 2

7ff236b

Fixes #419

gltanaka and others added 23 commits February 15, 2026 18:43

fix: Circular <include> tags silently produce corrupted output (no cy…

717d2fb

…cle detection) Fixes #521

fix: Implement LLM invocation retry cost logic and add new E2E and un…

9febbf4

…it tests for issue 509.

bump: version 0.0.149 → 0.0.150

c5359df

docs: Update changelog with new features for include cycle detection …

05ba3ed

…and Vertex AI ADC, various bug fixes, build improvements, and grounding experiment documentation.

fix: Increase LLM API call timeout from 120 seconds to 600 seconds.

a70acc8

bump: version 0.0.150 → 0.0.151

9593f4a

feat: Implement atomic state updates for sync_orchestration to fix …

de79c88

…state desynchronization (Issue #159), update LLM invocation logic and prompts, and add new grounding experiment results.

bump: version 0.0.151 → 0.0.152

9f81abc

feat: implement atomic state updates, configurable LLM timeout, updat…

1d17ed9

…e Claude model, and migrate pytest configuration to pyproject.toml.

chore: update bundled llm_model.csv to claude-sonnet-4-6

1e96ad7

Replace claude-sonnet-4-5 entries (both Vertex AI and Anthropic) with claude-sonnet-4-6 in the canonical data file shipped with the package. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat: update Claude Sonnet token counts and introduce atomic state up…

92e9a5b

…dates for `sync_orchestration` to fix state desynchronization.

bump: version 0.0.152 → 0.0.153

53a2885

feat: Update Gemini 3 Pro preview to 3.1 Pro preview and add xAI Grok…

0ea741f

… models to LLM configurations.

chore: sync final state to match main

094bb15

Align the replay branch's final file state with main so the PR shows zero diff. The branch preserves the full commit history while ending at the same tree as main.

gltanaka requested a review from Copilot February 21, 2026 07:44

Copilot AI reviewed Feb 21, 2026

jamesdlevine mentioned this pull request Mar 1, 2026

pdd test crashes with KeyError when project context/test.prompt contains curly braces #622

Closed

Serhan-Asad mentioned this pull request Mar 27, 2026

CLI - architecture.json generation failing (without error) (non-github invocation) #686

Closed

jiaminc-cmu wants to merge 2907 commits into main from replay-commit-history

7 participants
