Open
33 commits
49a967c
remove P2/P3 suppression instruction
varin-nair-factory Jan 13, 2026
23e8f1f
add import verification step
varin-nair-factory Jan 13, 2026
e422d6a
Add structured analysis phases
varin-nair-factory Jan 13, 2026
d0d4251
Add Thorough Analysis Checklist
varin-nair-factory Jan 13, 2026
26c5252
consolidate instructions into structured phases and remove redundant …
varin-nair-factory Jan 13, 2026
45cb5b5
remove contradicting instructions about resolving existing threads
varin-nair-factory Jan 13, 2026
38af5d4
reorganize Phase 3, remove duplicates, fix diff position params
varin-nair-factory Jan 13, 2026
2ff53b2
Add robustness improvements to review prompt
varin-nair-factory Jan 13, 2026
49df74b
Consolidate inline comment tool guidance with multi-line support
varin-nair-factory Jan 13, 2026
b8d9301
Refine code review prompt to improve precision
varin-nair-factory Jan 14, 2026
37d9992
Use local /review prompt
varin-nair-factory Jan 14, 2026
ae1f3e4
Revert "Use local /review prompt"
varin-nair-factory Jan 15, 2026
afc66a4
add no pager to get full diff, exclude P2/P3 issues
varin-nair-factory Jan 15, 2026
0983d10
Add common analysis patterns for reviews
varin-nair-factory Jan 15, 2026
cbbe3c1
Pre-compute PR diff and comments, enforce thorough review with xhigh …
varin-nair-factory Jan 15, 2026
e53cbd8
remove revoking of gh app token
varin-nair-factory Jan 15, 2026
9ffb2ed
switch gpt-5.2 reasoning effort back to high
varin-nair-factory Jan 15, 2026
86ec277
fix tests
varin-nair-factory Jan 21, 2026
5da3f3f
test validator run
varin-nair-factory Jan 21, 2026
a7066fb
increase candidate volume/coverage
varin-nair-factory Jan 21, 2026
03d7bc8
increase candidate volume/coverage 2
varin-nair-factory Jan 22, 2026
10804bd
Revert "increase candidate volume/coverage 2"
varin-nair-factory Jan 22, 2026
c275c9b
Revert "increase candidate volume/coverage"
varin-nair-factory Jan 22, 2026
9502f9e
add thoroughness
varin-nair-factory Jan 22, 2026
56ab298
improvements to candidate and validator prompts
varin-nair-factory Jan 23, 2026
b89f550
add common classes of bugs to candidate prompt
varin-nair-factory Jan 27, 2026
2ad2164
old validator prompt
varin-nair-factory Jan 27, 2026
00e3ea9
Parallel Review: Phase 1 - Add file-group-reviewer subagent
varin-nair-factory Jan 28, 2026
925aeb6
Parallel Review: Phase 2 - Add Triage Phase section in Candidates Prompt
varin-nair-factory Jan 28, 2026
fb7b9ae
Parallel Review: Phase 3 - Add parallel subagent calls phase
varin-nair-factory Jan 28, 2026
11816e2
Parallel Review: Phase 4 - Aggregation Phase
varin-nair-factory Jan 28, 2026
7f72756
Parallel Review: Phase 5 - Move subagent to ~/.factory/droids
varin-nair-factory Jan 28, 2026
355bbe3
enable task tool
varin-nair-factory Jan 28, 2026
74 changes: 74 additions & 0 deletions .factory/droids/file-group-reviewer.md
@@ -0,0 +1,74 @@
---
name: file-group-reviewer
description: Reviews an assigned subset of PR files for bugs, security issues, and correctness problems. Spawned in parallel by the main review agent to ensure thorough coverage.
model: inherit
tools: ["Read", "Grep", "Glob", "LS"]
---

You are a senior staff software engineer and expert code reviewer.

Your task: Review the assigned files from the PR and generate a JSON array of **high-confidence, actionable** review comments that pinpoint genuine issues.

<review_guidelines>
- The PR branch is currently checked out.
- Review ALL files assigned to you thoroughly.
- Focus on: functional correctness, syntax errors, logic bugs, broken dependencies/contracts/tests, security issues, and performance problems.
- High-signal bug patterns to actively check for (only comment when evidenced in the diff):
  - Null/undefined/Optional dereferences; missing-key errors on untrusted/external dict/JSON payloads
  - Resource leaks (unclosed files/streams/connections; missing cleanup on error paths)
  - Injection vulnerabilities (SQL injection, XSS, command/template injection) and auth/security invariant violations
  - OAuth/CSRF invariants: state must be per-flow unpredictable and validated; avoid deterministic/predictable state or missing state checks
  - Concurrency/race/atomicity hazards (TOCTOU, lost updates, unsafe shared state, process/thread lifecycle bugs)
  - Missing error handling for critical operations (network, persistence, auth, migrations, external APIs)
  - Wrong-variable/shadowing mistakes; contract mismatches (serializer/validated_data, interfaces/abstract methods)
  - Type-assumption bugs (e.g., numeric ops on datetime/strings, ordering key type mismatches)
  - Offset/cursor/pagination semantic mismatches (off-by-one, prev/next behavior, commit semantics)
- Only flag issues you are confident about—avoid speculative or stylistic nitpicks.
</review_guidelines>

<workflow>
1. Read each assigned file in full to understand the context
2. Read the relevant diff sections provided in the prompt
3. Read related files as needed to fully understand the changes:
   - Imported modules and dependencies
   - Interfaces, base classes, and type definitions
   - Related tests to understand expected behavior
   - Callers/callees of modified functions
   - Configuration files if behavior depends on them
4. Analyze the changes for issues matching the bug patterns above
5. For each issue found, verify it against the actual code and related context before including it
</workflow>

<output_format>
Return your findings as a JSON array (no wrapper object, just the array):

```json
[
  {
    "path": "src/index.ts",
    "body": "[P1] Title\n\n1 paragraph explanation.",
    "line": 42,
    "startLine": null,
    "side": "RIGHT"
  }
]
```

If no issues are found, return an empty array: `[]`

Field definitions:
- `path`: Relative file path (must match exactly as provided in your assignment)
- `body`: Comment text starting with priority tag [P0|P1|P2], then title, then 1 paragraph explanation
  - P0: Critical bugs (crashes, security vulnerabilities, data loss)
  - P1: Important bugs (incorrect behavior, logic errors)
  - P2: Minor bugs (edge cases, non-critical issues)
- `line`: Target line number (single-line) or end line number (multi-line). Must be ≥ 0.
- `startLine`: `null` for single-line comments, or start line number for multi-line comments
- `side`: "RIGHT" for new/modified code (default), "LEFT" only for commenting on removed code
</output_format>

<constraints>
- Output ONLY the JSON array—no additional commentary or markdown formatting around it.
- Do not include `commit_id` in your output—the parent agent will add this.
- Do not attempt to post comments to GitHub—just return the JSON array.
</constraints>
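
The output contract in this droid file lends itself to a mechanical check before the parent agent posts anything. The following is a minimal sketch of such a check, not code from this PR: `ReviewComment`, `checkComment`, and `parseFindings` are hypothetical names, and the rule that `startLine` may not exceed `line` is an assumption inferred from `line` being described as the end line of a multi-line comment.

```typescript
// Hypothetical types mirroring the field definitions in <output_format>.
type Side = "LEFT" | "RIGHT";

interface ReviewComment {
  path: string;
  body: string;
  line: number;
  startLine: number | null;
  side: Side;
}

// The body must open with one of the documented priority tags.
const PRIORITY_TAG = /^\[P[0-2]\] /;

// Returns a list of human-readable problems; an empty list means the comment passes.
function checkComment(c: ReviewComment, assignedPaths: Set<string>): string[] {
  const problems: string[] = [];
  if (!assignedPaths.has(c.path)) problems.push(`unassigned path: ${c.path}`);
  if (!PRIORITY_TAG.test(c.body)) problems.push("body must start with [P0], [P1], or [P2]");
  if (!Number.isInteger(c.line) || c.line < 0) problems.push("line must be a non-negative integer");
  // Assumption: a multi-line range starts at or before its end line.
  if (c.startLine !== null && (!Number.isInteger(c.startLine) || c.startLine > c.line)) {
    problems.push("startLine must be null or an integer no greater than line");
  }
  if (c.side !== "LEFT" && c.side !== "RIGHT") problems.push("side must be LEFT or RIGHT");
  return problems;
}

// Parses raw subagent output and keeps only the comments that pass every check.
function parseFindings(raw: string, assignedPaths: Set<string>): ReviewComment[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) throw new Error("expected a bare JSON array");
  return (parsed as ReviewComment[]).filter(
    (c) => typeof c === "object" && c !== null && checkComment(c, assignedPaths).length === 0,
  );
}
```

Dropping malformed entries rather than failing the whole batch is one possible design choice; a stricter parent agent could reject the entire array instead.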
96 changes: 72 additions & 24 deletions action.yml
@@ -67,6 +67,18 @@ inputs:
description: "Override reasoning effort for review flows (passed to Droid Exec as --reasoning-effort). If empty and review_model is also empty, the action defaults internally to gpt-5.2 at high reasoning."
required: false
default: ""
review_use_validator:
description: "Enable two-pass review: generate candidate comments to JSON, then validate and post only approved ones."
required: false
default: "true"
review_candidates_path:
description: "Path to write review candidates JSON (run #1 when review_use_validator=true)."
required: false
default: "${{ runner.temp }}/droid-prompts/review_candidates.json"
review_validated_path:
description: "Path to write review validated JSON (run #2 when review_use_validator=true)."
required: false
default: "${{ runner.temp }}/droid-prompts/review_validated.json"
fill_model:
description: "Override the model used for PR description fill (e.g., 'claude-sonnet-4-5-20250929', 'gpt-5.1-codex'). Only applies to fill flows."
required: false
@@ -137,6 +149,9 @@ runs:
AUTOMATIC_REVIEW: ${{ inputs.automatic_review }}
REVIEW_MODEL: ${{ inputs.review_model }}
REASONING_EFFORT: ${{ inputs.reasoning_effort }}
REVIEW_USE_VALIDATOR: ${{ inputs.review_use_validator }}
REVIEW_CANDIDATES_PATH: ${{ inputs.review_candidates_path }}
REVIEW_VALIDATED_PATH: ${{ inputs.review_validated_path }}
FILL_MODEL: ${{ inputs.fill_model }}
ADDITIONAL_PERMISSIONS: ${{ inputs.additional_permissions }}
DROID_ARGS: ${{ inputs.droid_args }}
@@ -169,6 +184,20 @@ runs:
DROID_DIR=$(dirname "${{ inputs.path_to_droid_executable }}")
echo "$DROID_DIR" >> "$GITHUB_PATH"

- name: Setup Custom Droids
if: steps.prepare.outputs.contains_trigger == 'true'
shell: bash
run: |
echo "Setting up custom droids..."
mkdir -p ~/.factory/droids
if [ -d "${GITHUB_ACTION_PATH}/.factory/droids" ]; then
cp -r ${GITHUB_ACTION_PATH}/.factory/droids/* ~/.factory/droids/
echo "Copied custom droids to ~/.factory/droids/"
ls -la ~/.factory/droids/
else
echo "No custom droids found in action"
fi

- name: Setup Network Restrictions
if: steps.prepare.outputs.contains_trigger == 'true' && inputs.experimental_allowed_domains != ''
shell: bash
@@ -178,18 +207,6 @@ runs:
env:
EXPERIMENTAL_ALLOWED_DOMAINS: ${{ inputs.experimental_allowed_domains }}

- name: Checkout PR branch for review
if: steps.prepare.outputs.contains_trigger == 'true' && steps.prepare.outputs.review_pr_number != ''
shell: bash
run: |
echo "Checking out PR #${{ steps.prepare.outputs.review_pr_number }} branch for full file access..."
# Reset any local changes from the merge commit to allow clean checkout
git reset --hard HEAD
gh pr checkout ${{ steps.prepare.outputs.review_pr_number }}
echo "Successfully checked out PR branch: $(git rev-parse --abbrev-ref HEAD)"
env:
GH_TOKEN: ${{ steps.prepare.outputs.github_token }}

- name: Run Droid Exec
id: droid
if: steps.prepare.outputs.contains_trigger == 'true'
@@ -216,6 +233,46 @@ runs:
DETAILED_PERMISSION_MESSAGES: "1"
FACTORY_API_KEY: ${{ inputs.factory_api_key }}

- name: Prepare validator
id: prepare_validator
if: steps.prepare.outputs.contains_trigger == 'true' && inputs.review_use_validator == 'true'
shell: bash
run: |
bun run ${GITHUB_ACTION_PATH}/src/entrypoints/prepare-validator.ts
env:
GITHUB_TOKEN: ${{ steps.prepare.outputs.github_token }}
REVIEW_USE_VALIDATOR: ${{ inputs.review_use_validator }}
REVIEW_VALIDATED_PATH: ${{ inputs.review_validated_path }}
REVIEW_CANDIDATES_PATH: ${{ inputs.review_candidates_path }}
DROID_COMMENT_ID: ${{ steps.prepare.outputs.droid_comment_id }}

- name: Run Droid Exec (validator)
id: droid_validator
if: steps.prepare.outputs.contains_trigger == 'true' && inputs.review_use_validator == 'true'
shell: bash
run: |

# Run the base-action
bun run ${GITHUB_ACTION_PATH}/base-action/src/index.ts
env:
# Base-action inputs
INPUT_PROMPT_FILE: ${{ runner.temp }}/droid-prompts/droid-prompt.txt
INPUT_SETTINGS: ${{ inputs.settings }}
INPUT_DROID_ARGS: ${{ steps.prepare_validator.outputs.droid_args }}
INPUT_MCP_TOOLS: ${{ steps.prepare_validator.outputs.mcp_tools }}
INPUT_EXPERIMENTAL_SLASH_COMMANDS_DIR: ${{ github.action_path }}/slash-commands
INPUT_ACTION_INPUTS_PRESENT: ${{ steps.prepare.outputs.action_inputs_present }}
INPUT_PATH_TO_DROID_EXECUTABLE: ${{ inputs.path_to_droid_executable }}
INPUT_PATH_TO_BUN_EXECUTABLE: ${{ inputs.path_to_bun_executable }}
INPUT_SHOW_FULL_OUTPUT: ${{ inputs.show_full_output }}

# Model configuration
GITHUB_TOKEN: ${{ steps.prepare.outputs.GITHUB_TOKEN }}
NODE_VERSION: ${{ env.NODE_VERSION }}
DETAILED_PERMISSION_MESSAGES: "1"
FACTORY_API_KEY: ${{ inputs.factory_api_key }}


- name: Update comment with job link
if: steps.prepare.outputs.contains_trigger == 'true' && steps.prepare.outputs.droid_comment_id && always()
shell: bash
@@ -230,7 +287,7 @@ runs:
GITHUB_EVENT_NAME: ${{ github.event_name }}
TRIGGER_COMMENT_ID: ${{ github.event.comment.id }}
IS_PR: ${{ github.event.issue.pull_request != null || github.event_name == 'pull_request_target' || github.event_name == 'pull_request_review_comment' }}
-DROID_SUCCESS: ${{ steps.droid.outputs.conclusion == 'success' }}
+DROID_SUCCESS: ${{ (inputs.review_use_validator == 'true' && steps.droid_validator.outputs.conclusion == 'success') || (inputs.review_use_validator != 'true' && steps.droid.outputs.conclusion == 'success') }}
TRIGGER_USERNAME: ${{ github.event.comment.user.login || github.event.issue.user.login || github.event.pull_request.user.login || github.event.sender.login || github.triggering_actor || github.actor || '' }}
PREPARE_SUCCESS: ${{ steps.prepare.outcome == 'success' }}
PREPARE_ERROR: ${{ steps.prepare.outputs.prepare_error || '' }}
@@ -247,16 +304,7 @@ runs:
~/.factory/logs/droid-log-single.log
~/.factory/logs/console.log
~/.factory/sessions/*
~/.factory/droids/*
${{ runner.temp }}/droid-prompts/**
if-no-files-found: ignore
retention-days: 7

- name: Revoke app token
if: always() && inputs.github_token == '' && steps.prepare.outputs.skipped_due_to_workflow_validation_mismatch != 'true'
shell: bash
run: |
curl -L \
-X DELETE \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer ${{ steps.prepare.outputs.GITHUB_TOKEN }}" \
-H "X-GitHub-Api-Version: 2022-11-28" \
${GITHUB_API_URL:-https://api.github.com}/installation/token
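
The reworked `DROID_SUCCESS` expression in this file picks which step's conclusion counts, depending on `review_use_validator`. The sketch below restates that logic as a plain function; `droidSucceeded` is a hypothetical helper for illustration, not something defined in this PR. It lowercases the flag on the assumption that GitHub Actions expressions compare strings case-insensitively.

```typescript
// Hypothetical helper mirroring the DROID_SUCCESS expression in action.yml.
function droidSucceeded(
  reviewUseValidator: string,
  droidConclusion: string | undefined,
  validatorConclusion: string | undefined,
): boolean {
  // Action inputs arrive as strings; only the value "true" enables the validator pass.
  const useValidator = reviewUseValidator.toLowerCase() === "true";
  // With the validator enabled, only the second pass decides success;
  // otherwise the single Droid Exec run does.
  return useValidator
    ? validatorConclusion === "success"
    : droidConclusion === "success";
}
```

Under this reading, a successful first pass followed by a failed validator run reports failure, and any flag value other than `true` falls back to the single-pass result.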
1 change: 1 addition & 0 deletions base-action/test/run-droid-mcp.test.ts
@@ -90,6 +90,7 @@ mock.module("child_process", () => ({
});
},
spawn: mockSpawn,
execSync: (_cmd: string) => "",
}));

type RunDroidModule = typeof import("../src/run-droid");
37 changes: 23 additions & 14 deletions src/create-prompt/index.ts
@@ -15,14 +15,20 @@ import {
isPullRequestReviewCommentEvent,
} from "../github/context";
import type { ParsedGitHubContext } from "../github/context";
-import type { CommonFields, PreparedContext, EventData } from "./types";
+import type {
+CommonFields,
+PreparedContext,
+EventData,
+ReviewArtifacts,
+} from "./types";

-export type { CommonFields, PreparedContext } from "./types";
+export type { CommonFields, PreparedContext, ReviewArtifacts } from "./types";

const BASE_ALLOWED_TOOLS = [
"Execute",
"Edit",
"Create",
"ApplyPatch",
"Read",
"Glob",
"Grep",
@@ -70,6 +76,7 @@ export function prepareContext(
baseBranch?: string,
droidBranch?: string,
prBranchData?: { headRefName: string; headRefOid: string },
reviewArtifacts?: ReviewArtifacts,
): PreparedContext {
const repository = context.repository.full_name;
const triggerPhrase = context.inputs.triggerPhrase || "@droid";
@@ -108,15 +115,12 @@ export function prepareContext(
commonFields.droidBranch = droidBranch;
}

-const eventData = buildEventData(
-context,
-{
-commentId,
-commentBody,
-baseBranch,
-droidBranch,
-},
-);
+const eventData = buildEventData(context, {
+commentId,
+commentBody,
+baseBranch,
+droidBranch,
+});

const result: PreparedContext = {
...commonFields,
@@ -128,6 +132,10 @@ export function prepareContext(
result.prBranchData = prBranchData;
}

if (reviewArtifacts) {
result.reviewArtifacts = reviewArtifacts;
}

return result;
}

@@ -282,9 +290,7 @@ function buildEventData(
}
}

-export type PromptGenerator = (
-context: PreparedContext,
-) => string;
+export type PromptGenerator = (context: PreparedContext) => string;

export type PromptCreationOptions = {
githubContext: ParsedGitHubContext;
@@ -296,6 +302,7 @@ export type PromptCreationOptions = {
allowedTools?: string[];
disallowedTools?: string[];
includeActionsTools?: boolean;
reviewArtifacts?: ReviewArtifacts;
};

export async function createPrompt({
@@ -308,6 +315,7 @@ export async function createPrompt({
allowedTools = [],
disallowedTools = [],
includeActionsTools = false,
reviewArtifacts,
}: PromptCreationOptions) {
try {
const droidCommentId = commentId.toString();
@@ -317,6 +325,7 @@ export async function createPrompt({
baseBranch,
droidBranch,
prBranchData,
reviewArtifacts,
);

await mkdir(`${process.env.RUNNER_TEMP || "/tmp"}/droid-prompts`, {
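
The hunks in this file thread an optional `reviewArtifacts` value from `createPrompt` through `prepareContext` into the returned `PreparedContext`. The sketch below isolates that pattern. The real `ReviewArtifacts` and `PreparedContext` types live in `./types` and are not shown in this diff, so the fields here (`diffPath`, `commentsPath`, `repository`) and the name `prepareContextSketch` are assumptions for illustration only.

```typescript
// Hypothetical, simplified stand-ins for the types in ./types.
interface ReviewArtifacts {
  diffPath: string; // assumed field, for illustration only
  commentsPath: string; // assumed field, for illustration only
}

interface PreparedContext {
  repository: string;
  reviewArtifacts?: ReviewArtifacts;
}

function prepareContextSketch(
  repository: string,
  reviewArtifacts?: ReviewArtifacts,
): PreparedContext {
  const result: PreparedContext = { repository };
  // Assign only when a value was passed, so the key is absent
  // (rather than present with the value undefined) otherwise.
  if (reviewArtifacts) {
    result.reviewArtifacts = reviewArtifacts;
  }
  return result;
}
```

Leaving the key off entirely keeps serialized contexts and `in` checks identical to the pre-change behavior for callers that never pass review artifacts.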