Hybrid mode docs #1454
1 issue found across 3 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="claude.md">
<violation number="1" location="claude.md:266">
P2: The hybrid mode description in the modes summary is inconsistent with the introduction paragraph. The intro correctly lists `(act, click, type, dragAndDrop)` but this line omits `act`. Consider adding `act` to maintain consistency.
(Based on your team's feedback about 'act' tool being included in hybrid mode.) [FEEDBACK_USED]</violation>
</file>
Greptile Summary
Adds comprehensive documentation for Hybrid mode in Stagehand Agent across all documentation files.
What changed:
Key points:
Confidence Score: 5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User
participant Stagehand
participant Agent
participant Browser
participant LLM
User->>Stagehand: new Stagehand({experimental: true})
User->>Stagehand: init()
User->>Agent: agent({mode: "hybrid", model: "google/gemini-3-flash-preview"})
Agent-->>User: AgentInstance
User->>Agent: execute({instruction, maxSteps})
loop Until task complete or maxSteps reached
Agent->>Browser: Take screenshot
Browser-->>Agent: Screenshot
Agent->>Browser: Get DOM structure
Browser-->>Agent: DOM data
Agent->>LLM: Process with both visual and DOM context
LLM-->>Agent: Decision (coordinate-based or DOM-based action)
alt DOM-based action (act, fillForm)
Agent->>Browser: DOM-based interaction
else Coordinate-based action (click, type, dragAndDrop)
Agent->>Browser: Coordinate-based interaction
end
Browser-->>Agent: Action result
end
Agent-->>User: AgentResult{success, message, actions, completed}
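The loop in the diagram can be sketched as a small, self-contained TypeScript function. This is only an illustration of the control flow, not Stagehand's implementation: the `Browser` interface, the `Decide` callback, and `runHybridLoop` are hypothetical names invented here, and the sketch is synchronous for brevity where a real agent would be asynchronous. Only the tool names (`act`, `fillForm`, `click`, `type`, `dragAndDrop`), `maxSteps`, and the `AgentResult` fields come from the diagram.

```typescript
// Hypothetical types that mirror the diagram; not Stagehand's real internals.
type DomAction = { kind: "dom"; tool: "act" | "fillForm"; instruction: string };
type CoordAction = {
  kind: "coordinate";
  tool: "click" | "type" | "dragAndDrop";
  x: number;
  y: number;
};
type Done = { kind: "done"; message: string };
type Decision = DomAction | CoordAction | Done;

// Field names taken from the AgentResult shown in the diagram.
interface AgentResult {
  success: boolean;
  message: string;
  actions: string[];
  completed: boolean;
}

// Hypothetical browser facade: one method per arrow in the diagram.
interface Browser {
  screenshot(): string;
  domSnapshot(): string;
  performDom(action: DomAction): string;
  performCoordinate(action: CoordAction): string;
}

// Stand-in for the LLM call: receives both visual and DOM context.
type Decide = (
  screenshot: string,
  dom: string,
  instruction: string,
  history: string[],
) => Decision;

function runHybridLoop(
  browser: Browser,
  decide: Decide,
  instruction: string,
  maxSteps: number,
): AgentResult {
  const actions: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    // Gather both kinds of context before every decision.
    const screenshot = browser.screenshot();
    const dom = browser.domSnapshot();
    const decision = decide(screenshot, dom, instruction, actions);
    if (decision.kind === "done") {
      return { success: true, message: decision.message, actions, completed: true };
    }
    // Dispatch on the action family chosen by the model.
    const result =
      decision.kind === "dom"
        ? browser.performDom(decision)
        : browser.performCoordinate(decision);
    actions.push(`${decision.tool}: ${result}`);
  }
  // Step budget exhausted before the model reported completion.
  return { success: false, message: "maxSteps reached", actions, completed: false };
}
```

The discriminated union on `kind` keeps the DOM-based and coordinate-based branches explicit, which matches the `alt`/`else` split in the diagram.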
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
why
Currently there is no documentation for the agent's hybrid mode.
what changed
Adds documentation for hybrid mode.
test plan
Summary by cubic
Adds Hybrid mode docs for Stagehand Agent, showing how to combine DOM and coordinate-based actions with supported models. Also documents the new agent mode config and updates feature availability across CUA, DOM, and Hybrid.
Written for commit 0756fb6. Summary will update automatically on new commits.