fix: Error codes needs to be documented in Swagger #2451
Draft
William-Hill wants to merge 1 commit into ConduitIO:main from
Fixes ConduitIO#576. Generated by conduit-agent-experiment implementer.
William-Hill pushed a commit to William-Hill/conduit-agent-experiment that referenced this pull request on Apr 7, 2026
Full pipeline completed: triage → archivist → planner → reviewer → implementer → draft PR on ConduitIO/conduit#2451. Total cost ~$0.06, total time ~3 minutes. Documents key learnings about Gemini Flash tool-calling reliability, JSON vs markdown for plans, and Haiku prompt engineering. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
William-Hill added a commit to William-Hill/conduit-agent-experiment that referenced this pull request on Apr 8, 2026
…riter (#16)

* feat(implementer): add coding tools for BetaToolRunner
  Implements 5 tools (read_file, write_file, list_dir, search_files, run_command) using anthropic-sdk-go toolrunner. Each tool is scoped to a repo directory with path traversal protection. Errors are returned as text results so the model can reason about them and retry. 12 tests covering happy paths, error cases, and security boundaries.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(implementer): add RunAgent with BetaToolRunner and system prompt
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(implementer): add CLI entry point with triage-to-PR orchestration
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add implementer agent feasibility test results
  Orchestration pipeline verified end-to-end: triage JSON → clone → agent setup → API call. Blocked on Anthropic credits for live run. 13/13 tool tests pass, build clean.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(archivist): add dossier types and persistence
* feat(archivist): add ADK Go archivist agent with repo exploration tools
  Implements 4 functiontools (read_file, list_dir, search_files, save_dossier) with path traversal protection, plus the archivist agent and RunArchivist entry point using Gemini Flash via the ADK runner.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(implementer): accept archivist dossier in prompt context
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(implementer): wire archivist into pipeline (triage → archivist → implementer → PR)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(archivist): create session before running agent
  The ADK Go runner requires the session to exist in the session service before calling Run(). Added session.Service.Create() call.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(implementer): add IMPL_ISSUE_NUMBER override and bump max iterations to 30
  Allows targeting a specific issue instead of always picking the top-ranked one. Useful for testing against simpler issues first.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(implementer): switch to Haiku default, add prompt caching and model override
  - Default model is now Haiku 4.5 ($1/$5 per MTok) instead of Sonnet 4.6
  - Override with IMPL_MODEL env var (e.g. IMPL_MODEL=claude-sonnet-4-6)
  - System prompt and user context marked with CacheControl so they're only billed at full price once; subsequent iterations use cache hits at 10% of input cost — major savings over 20+ iteration runs
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(archivist): tighten instruction to enforce save_dossier call, add progress logging
  - Archivist instruction now has strict 8-call budget and emphatic requirement to call save_dossier
  - Both archivist and implementer now log each tool call as it happens for progress visibility (no extra token cost)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(implementer): aggressively prioritize writing over exploring
  Haiku burns all iterations reading files even when the archivist already provided context. New prompt enforces: write by iteration 3, hard budget of 15 tool calls, minimize read_file usage.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: reduce max iterations to 15 to match prompt budget
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(planner): add Gemini Flash planner and reviewer agents
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(implementer): accept implementation plan instead of dossier
  The implementer agent now takes a *planner.ImplementationPlan rather than a *archivist.Dossier. The system prompt is simplified to just write files from the plan and verify the build. Tests updated to use planner types.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(implementer): wire planner+reviewer into CLI pipeline
  The implementer CLI now runs a 9-step pipeline: triage -> fetch issue -> clone -> archivist -> planner -> reviewer (with one retry) -> implementer -> check changes -> create PR. Planner and reviewer use Gemini Flash.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(archivist): remove list_dir tool and force save_dossier by step 5
  Gemini Flash ignores budget instructions and wanders with list_dir. Removed the tool entirely — search_files is sufficient for finding relevant code. Instruction now mandates save_dossier by the 5th call.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(archivist): replace agent loop with single Gemini call
  The ADK Go agent loop couldn't reliably force Gemini Flash to call save_dossier. Replaced with a deterministic approach:
  1. Pre-gather search results in Go (grep for issue keywords)
  2. Single Gemini Flash call to analyze results and output JSON
  3. Read file contents for identified paths
  This is faster, cheaper, and guaranteed to produce a dossier. Removed tools.go and tools_test.go (no longer needed).
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove stray implementer binary
* fix(planner): use Gemini JSON mode and add retry on parse failure
  Gemini was producing invalid JSON when embedding Go source code. Added ResponseMIMEType: "application/json" to force valid JSON output. Planner retries once with error feedback if parsing fails. Reviewer also updated with JSON mode.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(planner): switch to markdown output instead of JSON
  Gemini can't reliably produce valid JSON containing Go source code (backticks, special chars break JSON encoding even in JSON mode). Markdown is natural for LLMs, handles code blocks cleanly, and requires no parsing. The plan is now a markdown string passed directly to the reviewer and implementer prompts.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update feasibility report with successful live run
  Full pipeline completed: triage → archivist → planner → reviewer → implementer → draft PR on ConduitIO/conduit#2451. Total cost ~$0.06, total time ~3 minutes. Documents key learnings about Gemini Flash tool-calling reliability, JSON vs markdown for plans, and Haiku prompt engineering.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add full pipeline achievement report for talk
  Documents the end-to-end pipeline: triage → archivist → planner → reviewer → implementer → draft PR. Includes architecture diagram, cost analysis ($0.06/run, $2.16/month at active pace), technology stack, and key learnings from the development process.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: document CI failure on PR #2451 — hallucinated error constants
  The draft PR failed golangci-lint because the planner/implementer referenced error constants (connector.ErrNameAlreadyExists, etc.) that don't exist in the Conduit codebase. Same hallucination failure mode from experiments 02-05. Documents root cause, lessons, and links to issues #17, #18, #21 that would address this.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address PR #16 review comments — security, logic, and cleanup
  Security: harden safePath against sibling-prefix and symlink escapes, fix archivist readFileContent path traversal (use os.Open instead of shell), add -e flag to grep pattern to prevent flag injection, replace run_command sh -c with allowlisted argv executor.
  Logic: detect untracked files in change check, re-review retried plans, use strconv.Atoi for IMPL_ISSUE_NUMBER, fix hardcoded model name in PR body.
  Cleanup: extract cleanJSON to shared llmutil package, add langFromPath for code fence language detection, use strings.Contains in tests, add dossier round-trip assertions, log grepRepo errors, fix docs lint.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address second round of CodeRabbit review comments
  - Fail fast in archivist when no proposed files can be loaded
  - Resolve symlinks in readFileContent before opening (was only lexical)
  - Truncate search_files output at 64KB to prevent unbounded results
  - Reject path-qualified executables in run_command (./go, ../bin/git)
  - Replace CleanJSON string scanning with atomic regex fence stripping
  - Add CleanJSON edge case tests (single-line fence, backticks in JSON)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address third round of CodeRabbit review comments
  - Walk up to nearest existing ancestor in safePath for nested write escape prevention (symlink in intermediate directory)
  - Exclude .git/ from recursive grep searches
  - Scrub run_command environment to minimal set (PATH, HOME, GOPATH, GOROOT, TMPDIR) to prevent credential exfiltration from malicious repos
  - Handle symlink test setup errors (t.Fatal/t.Skip) and assert escape-specific error message
  - Simplify unused buf allocation in readFileContent
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add design spec for per-step cost tracking and budget controls (#21)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add README and project journey document for new collaborators
  README covers quick start, architecture overview, project structure, and configuration. JOURNEY.md traces the full project arc from hypothesis through milestones, experiments, the architecture pivot, and current state — written for a new collaborator joining the repo.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: William Hill <william@d4bl.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes #576
Agent Summary
Good, so the functions already exist. Now I need to fix my status.go implementation to only include the error constants that actually exist. Let me remove the non-existent ones:
Generated by conduit-agent-experiment (archivist: Gemini Flash, implementer: Claude Sonnet, 15 iterations).