Commit b7185c9

v0.3.1: improvement + fix
2 parents: 8b09510 + ca4b483; commit b7185c9

64 files changed: 1794 additions and 1130 deletions

apps/docs/components/icons.tsx

Lines changed: 23 additions & 0 deletions
@@ -270,3 +270,26 @@ export const ResponseIcon = (props: SVGProps<SVGSVGElement>) => (
270270
<path d='m9 17-5-5 5-5' />
271271
</svg>
272272
)
273+
274+
export const StarterIcon = (props: SVGProps<SVGSVGElement>) => (
275+
<svg viewBox='0 0 24 24' fill='none' xmlns='http://www.w3.org/2000/svg' {...props}>
276+
<path d='M8 5v14l11-7z' fill='currentColor' />
277+
</svg>
278+
)
279+
280+
export const LoopIcon = (props: SVGProps<SVGSVGElement>) => (
281+
<svg viewBox='0 0 24 24' fill='none' xmlns='http://www.w3.org/2000/svg' {...props}>
282+
<path
283+
d='M4 12a8 8 0 018-8V2.5L16 6l-4 3.5V8a6 6 0 00-6 6 6 6 0 006 6 6 6 0 006-6h2a8 8 0 01-8 8 8 8 0 01-8-8z'
284+
fill='currentColor'
285+
/>
286+
</svg>
287+
)
288+
289+
export const ParallelIcon = (props: SVGProps<SVGSVGElement>) => (
290+
<svg viewBox='0 0 24 24' fill='none' xmlns='http://www.w3.org/2000/svg' {...props}>
291+
<rect x='3' y='3' width='18' height='6' rx='1' stroke='currentColor' strokeWidth='2' />
292+
<rect x='3' y='15' width='18' height='6' rx='1' stroke='currentColor' strokeWidth='2' />
293+
<path d='M12 9v6' stroke='currentColor' strokeWidth='2' strokeLinecap='round' />
294+
</svg>
295+
)

apps/docs/content/docs/blocks/agent.mdx

Lines changed: 181 additions & 84 deletions
@@ -8,47 +8,40 @@ import { Step, Steps } from 'fumadocs-ui/components/steps'
88
import { Tab, Tabs } from 'fumadocs-ui/components/tabs'
99
import { ThemeImage } from '@/components/ui/theme-image'
1010

11-
The Agent block is a fundamental component in Sim Studio that allows you to create powerful AI agents using various LLM providers. These agents can process inputs based on customizable system prompts and utilize integrated tools to enhance their capabilities.
11+
The Agent block serves as the interface between your workflow and Large Language Models (LLMs). It executes inference requests against various AI providers, processes natural language inputs according to defined instructions, and generates structured or unstructured outputs for downstream consumption.
1212

1313
<ThemeImage
1414
lightSrc="/static/light/agent-light.png"
1515
darkSrc="/static/dark/agent-dark.png"
16-
alt="Agent Block"
17-
width={300}
16+
alt="Agent Block Configuration"
17+
width={350}
1818
height={175}
1919
/>
2020

21-
<Callout type="info">
22-
Agent blocks serve as interfaces to Large Language Models, enabling your workflow to leverage
23-
state-of-the-art AI capabilities.
24-
</Callout>
25-
2621
## Overview
2722

28-
The Agent block serves as an interface to Large Language Models (LLMs), enabling you to create agents that can:
23+
The Agent block enables you to:
2924

3025
<Steps>
3126
<Step>
32-
<strong>Respond to user inputs</strong>: Generate natural language responses based on provided
33-
inputs
27+
<strong>Process natural language</strong>: Analyze user input and generate contextual responses
3428
</Step>
3529
<Step>
36-
<strong>Follow instructions</strong>: Adhere to specific instructions defined in the system
37-
prompt
30+
<strong>Execute AI-powered tasks</strong>: Perform content analysis, generation, and decision-making
3831
</Step>
3932
<Step>
40-
<strong>Use specialized tools</strong>: Interact with integrated tools to extend capabilities
33+
<strong>Call external tools</strong>: Access APIs, databases, and services during processing
4134
</Step>
4235
<Step>
43-
<strong>Structure output</strong>: Generate responses in structured formats when needed
36+
<strong>Generate structured output</strong>: Return JSON data that matches your schema requirements
4437
</Step>
45-
</Steps>
38+
</Steps>
4639

4740
## Configuration Options
4841

4942
### System Prompt
5043

51-
The system prompt defines the agent's behavior, capabilities, and limitations. It's the primary way to instruct the agent on how to respond to inputs.
44+
The system prompt establishes the agent's operational parameters and behavioral constraints. This configuration defines the agent's role, response methodology, and processing boundaries for all incoming requests.
5245

5346
```markdown
5447
You are a helpful assistant that specializes in financial analysis.
@@ -58,22 +51,25 @@ When responding to questions about investments, include risk disclaimers.
5851

5952
### User Prompt
6053

61-
The user prompt or context is the specific input or question that the agent should respond to. This can be:
54+
The user prompt represents the primary input data for inference processing. This parameter accepts natural language text or structured data that the agent will analyze and respond to. Input sources include:
6255

63-
- Directly provided in the block configuration
64-
- Connected from another block's output
65-
- Dynamically generated during workflow execution
56+
- **Static Configuration**: Direct text input specified in the block configuration
57+
- **Dynamic Input**: Data passed from upstream blocks through connection interfaces
58+
- **Runtime Generation**: Programmatically generated content during workflow execution
6659

6760
### Model Selection
6861

69-
Choose from a variety of LLM providers:
62+
The Agent block supports multiple LLM providers through a unified inference interface. Available models include:
63+
64+
**OpenAI Models**: GPT-4o, o1, o3, o4-mini, gpt-4.1 (API-based inference)
65+
**Anthropic Models**: Claude 3.7 Sonnet (API-based inference)
66+
**Google Models**: Gemini 2.5 Pro, Gemini 2.0 Flash (API-based inference)
67+
**Alternative Providers**: Groq, Cerebras, xAI, DeepSeek (API-based inference)
68+
**Local Deployment**: Ollama-compatible models (self-hosted inference)
7069

71-
- OpenAI (GPT-4o, o1, o3, o4-mini, gpt-4.1)
72-
- Anthropic (Claude 3.7 Sonnet)
73-
- Google (Gemini 2.5 Pro, Gemini 2.0 Flash)
74-
- Groq, Cerebras
75-
- Ollama Local Models
76-
- And more
70+
<div className="mx-auto w-3/5 overflow-hidden rounded-lg">
71+
<video autoPlay loop muted playsInline className="w-full -mb-2 rounded-lg" src="/models.mp4"></video>
72+
</div>
7773

7874
### Temperature
7975

@@ -104,103 +100,204 @@ Your API key for the selected LLM provider. This is securely stored and used for
104100

105101
### Tools
106102

107-
Integrate specialized tools to enhance the agent's capabilities. You can add tools to your agent by:
108-
109-
1. Clicking the Tools section in the Agent configuration
110-
2. Selecting from the tools dropdown menu
111-
3. Choosing an existing tool or creating a new one
103+
Tools extend the agent's capabilities through external API integrations and service connections. The tool system enables function calling, allowing the agent to execute operations beyond text generation.
112104

113-
<ThemeImage
114-
lightSrc="/static/light/tooldropdown-light.png"
115-
darkSrc="/static/dark/tooldropdown-dark.png"
116-
alt="Tools Dropdown"
117-
width={150}
118-
height={125}
119-
/>
105+
**Tool Integration Process**:
106+
1. Access the Tools configuration section within the Agent block
107+
2. Select from 60+ pre-built integrations or define custom functions
108+
3. Configure authentication parameters and operational constraints
120109

121-
Available tools include:
110+
<div className="mx-auto w-3/5 overflow-hidden rounded-lg">
111+
<video autoPlay loop muted playsInline className="w-full -mb-2 rounded-lg" src="/tools.mp4"></video>
112+
</div>
122113

123-
- **Confluence**: Access and query Confluence knowledge bases
124-
- **Evaluator**: Use evaluation metrics to assess content
125-
- **GitHub**: Interact with GitHub repositories and issues
126-
- **Gmail**: Process and respond to emails
127-
- **Firecrawl**: Web search and content retrieval
128-
- And many, many more pre-built integrations
114+
**Available Tool Categories**:
115+
- **Communication**: Gmail, Slack, Telegram, WhatsApp, Microsoft Teams
116+
- **Data Sources**: Notion, Google Sheets, Airtable, Supabase, Pinecone
117+
- **Web Services**: Firecrawl, Google Search, Exa AI, browser automation
118+
- **Development**: GitHub, Jira, Linear repository and issue management
119+
- **AI Services**: OpenAI, Perplexity, Hugging Face, ElevenLabs
129120

130-
You can also create custom tools to meet specific requirements for your agent's capabilities.
121+
**Tool Execution Control**:
122+
- **Auto**: Model determines tool invocation based on context and necessity
123+
- **Required**: Tool must be called during every inference request
124+
- **None**: Tool definition available but excluded from model context
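
One way to read the three modes above, sketched in JavaScript. The `usageControl` field, the tool objects, and `buildToolContext` are hypothetical names chosen for illustration; the diff does not show how Sim Studio implements this internally.

```javascript
// Hypothetical sketch: how the Auto / Required / None modes could shape
// the tool list sent to a model. None of these names are Sim Studio APIs.
function buildToolContext(tools) {
  // "none": the tool stays defined on the block but is left out of the model context.
  const visible = tools.filter((tool) => tool.usageControl !== 'none');

  // "required": the tool must be called on every inference request.
  const required = visible
    .filter((tool) => tool.usageControl === 'required')
    .map((tool) => tool.name);

  // "auto" tools are visible but not forced; the model decides whether to call them.
  return { visible: visible.map((tool) => tool.name), required };
}

const context = buildToolContext([
  { name: 'gmail', usageControl: 'auto' },
  { name: 'github', usageControl: 'required' },
  { name: 'slack', usageControl: 'none' },
]);
console.log(context);
```

Under these assumptions, `slack` never reaches the model, `github` is forced, and `gmail` is left to the model's judgment.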
131125

132-
<Callout type="info">
133-
Tools significantly expand what your agent can do, allowing it to access external systems,
134-
retrieve information, and take actions beyond simple text generation.
135-
</Callout>
126+
<div className="mx-auto w-3/5 overflow-hidden rounded-lg">
127+
<video autoPlay loop muted playsInline className="w-full -mb-2 rounded-lg" src="/granular-tool-control.mp4"></video>
128+
</div>
136129

137130
### Response Format
138131

139-
Define a structured format for the agent's response when needed, using JSON or other formats.
132+
The Response Format parameter enforces structured output generation through JSON Schema validation. This ensures consistent, machine-readable responses that conform to predefined data structures:
133+
134+
```json
135+
{
136+
"name": "user_analysis",
137+
"schema": {
138+
"type": "object",
139+
"properties": {
140+
"sentiment": {
141+
"type": "string",
142+
"enum": ["positive", "negative", "neutral"]
143+
},
144+
"confidence": {
145+
"type": "number",
146+
"minimum": 0,
147+
"maximum": 1
148+
}
149+
},
150+
"required": ["sentiment", "confidence"]
151+
}
152+
}
153+
```
154+
155+
This configuration constrains the model's output to comply with the specified schema, preventing free-form text responses and ensuring structured data generation.
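
To make the constraint concrete, here is a minimal sketch that checks a candidate response against the `sentiment` / `confidence` schema shown above. The `validate` helper is hand-rolled for illustration and handles only the keywords that schema uses; it is an assumption, not the validation Sim Studio or any provider actually performs.

```javascript
// Inner "schema" object from the Response Format example above.
const schema = {
  type: 'object',
  properties: {
    sentiment: { type: 'string', enum: ['positive', 'negative', 'neutral'] },
    confidence: { type: 'number', minimum: 0, maximum: 1 },
  },
  required: ['sentiment', 'confidence'],
};

// Illustrative validator: supports required, type, enum, minimum, maximum only.
function validate(value, schema) {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) {
    return ['value must be an object'];
  }
  const errors = [];
  for (const key of schema.required) {
    if (!(key in value)) errors.push(`missing required property: ${key}`);
  }
  for (const [key, rule] of Object.entries(schema.properties)) {
    if (!(key in value)) continue;
    const v = value[key];
    if (typeof v !== rule.type) {
      errors.push(`${key} must be of type ${rule.type}`);
      continue;
    }
    if (rule.enum && !rule.enum.includes(v)) {
      errors.push(`${key} must be one of: ${rule.enum.join(', ')}`);
    }
    if (rule.minimum !== undefined && v < rule.minimum) {
      errors.push(`${key} must be >= ${rule.minimum}`);
    }
    if (rule.maximum !== undefined && v > rule.maximum) {
      errors.push(`${key} must be <= ${rule.maximum}`);
    }
  }
  return errors;
}

const ok = validate({ sentiment: 'positive', confidence: 0.92 }, schema);
const bad = validate({ sentiment: 'angry', confidence: 1.5 }, schema);
console.log(ok, bad);
```

A production workflow would more likely rely on a full JSON Schema validator; the point here is only what "conforming to the schema" means for this example.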
156+
157+
### Accessing Results
158+
159+
After an agent completes, you can access its outputs:
160+
161+
- **`<agent.content>`**: The agent's response text or structured data
162+
- **`<agent.tokens>`**: Token usage statistics (prompt, completion, total)
163+
- **`<agent.tool_calls>`**: Details of any tools the agent used during execution
164+
- **`<agent.cost>`**: Estimated cost of the API call (if available)
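
As a rough sketch of consuming these outputs in a downstream Function block: the object below is a hypothetical stand-in shaped after the four fields listed above, and `summarize` is an illustrative helper. Whether `content` arrives as a JSON string or an already-parsed object is an assumption, so the sketch handles both.

```javascript
// Hypothetical agent output; values are made up for illustration.
const agent = {
  content: '{"sentiment":"positive","confidence":0.92}',
  tokens: { prompt: 120, completion: 30, total: 150 },
  tool_calls: [],
  cost: null, // cost is only present "if available"
};

function summarize(output) {
  // Structured output may be a JSON string; fall back to the raw text otherwise.
  let data = output.content;
  if (typeof data === 'string') {
    try {
      data = JSON.parse(data);
    } catch {
      // Plain-text response: keep the string as-is.
    }
  }
  return {
    data,
    totalTokens: output.tokens.total,
    toolsUsed: output.tool_calls.length,
    cost: output.cost ?? 'unavailable',
  };
}

const summary = summarize(agent);
console.log(summary);
```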
165+
166+
## Advanced Features
167+
168+
### Memory Integration
169+
170+
Agents can maintain context across interactions using the memory system:
171+
172+
```javascript
173+
// In a Function block before the agent
174+
const memory = {
175+
conversation_history: previousMessages,
176+
user_preferences: userProfile,
177+
session_data: currentSession
178+
};
179+
```
180+
181+
### Structured Output Validation
182+
183+
Use JSON Schema to ensure consistent, machine-readable responses:
184+
185+
```json
186+
{
187+
"type": "object",
188+
"properties": {
189+
"analysis": {"type": "string"},
190+
"confidence": {"type": "number", "minimum": 0, "maximum": 1},
191+
"categories": {"type": "array", "items": {"type": "string"}}
192+
},
193+
"required": ["analysis", "confidence"]
194+
}
195+
```
196+
197+
### Error Handling
198+
199+
Agents automatically handle common errors:
200+
- API rate limits with exponential backoff
201+
- Invalid tool calls with retry logic
202+
- Network failures with connection recovery
203+
- Schema validation errors with fallback responses
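
For readers unfamiliar with the first item, exponential backoff means the wait between retries doubles after each failed attempt, usually up to a cap. The sketch below only computes such a schedule; the base delay, cap, and function name are assumptions, since the diff does not state the values Sim Studio uses.

```javascript
// Illustrative backoff schedule: delay doubles per attempt and is capped at capMs.
// baseMs and capMs are assumed values, not Sim Studio's actual settings.
function backoffDelays(attempts, baseMs = 500, capMs = 8000) {
  return Array.from({ length: attempts }, (_, attempt) =>
    Math.min(capMs, baseMs * 2 ** attempt)
  );
}

const delays = backoffDelays(6);
console.log(delays); // [500, 1000, 2000, 4000, 8000, 8000]
```

Real implementations often add random jitter to each delay so that many clients do not retry at the same instant.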
140204

141205
## Inputs and Outputs
142206

143-
<Tabs items={['Inputs', 'Outputs']}>
207+
<Tabs items={['Configuration', 'Variables', 'Results']}>
144208
<Tab>
145209
<ul className="list-disc space-y-2 pl-6">
146210
<li>
147-
<strong>User Prompt</strong>: The user's query or context for the agent
211+
<strong>System Prompt</strong>: Instructions defining agent behavior and role
148212
</li>
149213
<li>
150-
<strong>System Prompt</strong>: Instructions for the agent (optional)
214+
<strong>User Prompt</strong>: Input text or data to process
151215
</li>
152216
<li>
153-
<strong>Tools</strong>: Optional tool connections that the agent can use
217+
<strong>Model</strong>: AI model selection (OpenAI, Anthropic, Google, etc.)
218+
</li>
219+
<li>
220+
<strong>Temperature</strong>: Response randomness control (0-2)
221+
</li>
222+
<li>
223+
<strong>Tools</strong>: Array of available tools for function calling
224+
</li>
225+
<li>
226+
<strong>Response Format</strong>: JSON Schema for structured output
154227
</li>
155228
</ul>
156229
</Tab>
157230
<Tab>
158231
<ul className="list-disc space-y-2 pl-6">
159232
<li>
160-
<strong>Content</strong>: The agent's response text
233+
<strong>agent.content</strong>: Agent's response text or structured data
234+
</li>
235+
<li>
236+
<strong>agent.tokens</strong>: Token usage statistics object
161237
</li>
162238
<li>
163-
<strong>Model</strong>: The model used for generation
239+
<strong>agent.tool_calls</strong>: Array of tool execution details
164240
</li>
165241
<li>
166-
<strong>Tokens</strong>: Usage statistics (prompt, completion, total)
242+
<strong>agent.cost</strong>: Estimated API call cost (if available)
167243
</li>
244+
</ul>
245+
</Tab>
246+
<Tab>
247+
<ul className="list-disc space-y-2 pl-6">
168248
<li>
169-
<strong>Tool Calls</strong>: Details of any tools used during processing
249+
<strong>Content</strong>: Primary response output from the agent
170250
</li>
171251
<li>
172-
<strong>Cost</strong>: Cost of the response
252+
<strong>Metadata</strong>: Usage statistics and execution details
173253
</li>
174254
<li>
175-
<strong>Usage</strong>: Usage statistics (prompt, completion, total)
255+
<strong>Access</strong>: Available in blocks after the agent
176256
</li>
177257
</ul>
178258
</Tab>
179259
</Tabs>
180260

181-
## Example Usage
182-
183-
Here's an example of how an Agent block might be configured for a customer support workflow:
184-
185-
```yaml
186-
# Example Agent Configuration
187-
systemPrompt: |
188-
You are a customer support agent for TechCorp.
189-
Always maintain a professional, friendly tone.
190-
If you don't know an answer, direct the customer to email [email protected].
191-
Never make up information about products or policies.
192-
193-
model: OpenAI/gpt-4
194-
temperature: 0.2
195-
tools:
196-
- ProductDatabase
197-
- OrderHistory
198-
- SupportTicketCreator
199-
```
261+
## Example Use Cases
262+
263+
### Customer Support Automation
264+
265+
<div className="mb-4 rounded-md border p-4">
266+
<h4 className="font-medium">Scenario: Handle customer inquiries with database access</h4>
267+
<ol className="list-decimal pl-5 text-sm">
268+
<li>User submits support ticket via API block</li>
269+
<li>Agent processes inquiry with product database tools</li>
270+
<li>Agent generates response and creates follow-up ticket</li>
271+
<li>Response block sends reply to customer</li>
272+
</ol>
273+
</div>
274+
275+
### Multi-Model Content Analysis
276+
277+
<div className="mb-4 rounded-md border p-4">
278+
<h4 className="font-medium">Scenario: Analyze content with different AI models</h4>
279+
<ol className="list-decimal pl-5 text-sm">
280+
<li>Function block processes uploaded document</li>
281+
<li>Agent with GPT-4o performs technical analysis</li>
282+
<li>Agent with Claude analyzes sentiment and tone</li>
283+
<li>Function block combines results for final report</li>
284+
</ol>
285+
</div>
286+
287+
### Tool-Powered Research Assistant
288+
289+
<div className="mb-4 rounded-md border p-4">
290+
<h4 className="font-medium">Scenario: Research assistant with web search and document access</h4>
291+
<ol className="list-decimal pl-5 text-sm">
292+
<li>User query received via input</li>
293+
<li>Agent searches web using Google Search tool</li>
294+
<li>Agent accesses Notion database for internal docs</li>
295+
<li>Agent compiles comprehensive research report</li>
296+
</ol>
297+
</div>
200298

201299
## Best Practices
202300

203301
- **Be specific in system prompts**: Clearly define the agent's role, tone, and limitations. The more specific your instructions are, the better the agent will be able to fulfill its intended purpose.
204302
- **Choose the right temperature setting**: Use lower temperature settings (0-0.3) when accuracy is important, or increase temperature (0.7-2.0) for more creative or varied responses
205-
- **Combine with Evaluator blocks**: Use Evaluator blocks to assess agent responses and ensure quality. This allows you to create feedback loops and implement quality control measures.
206-
- **Leverage tools effectively**: Integrate tools that complement the agent's purpose and enhance its capabilities. Be selective about which tools you provide to avoid overwhelming the agent.
303+
- **Leverage tools effectively**: Integrate tools that complement the agent's purpose and enhance its capabilities. Be selective about which tools you provide to avoid overwhelming the agent. For tasks with little overlap, use another Agent block for the best results.
