|
4 | 4 |
|
5 | 5 | In this guided visit, you will learn how to build an agent, how to run it, and how to customize it to make it work better for your use-case. |
6 | 6 |
|
| 7 | +## Choosing an agent type: CodeAgent or ToolCallingAgent |
| 8 | + |
| 9 | +`smolagents` comes with two agent classes: [`CodeAgent`] and [`ToolCallingAgent`], which represent two different paradigms for how agents interact with tools. |
| 10 | +The key difference lies in how actions are specified and executed: code generation vs structured tool calling. |
| 11 | + |
| 12 | +- [`CodeAgent`] generates tool calls as Python code snippets. |
| 13 | + - The code is executed either locally (potentially insecure) or in a secure sandbox. |
| 14 | + - Tools are exposed as Python functions (via bindings). |
| 15 | + - Example of tool call: |
| 16 | + ```py |
| 17 | + result = search_docs("What is the capital of France?") |
| 18 | + print(result) |
| 19 | + ``` |
| 20 | + - Strengths: |
| 21 | + - Highly expressive: Supports complex logic and control flow, and can combine tools, loop over results, and transform outputs. |
| 22 | + - Flexible: No need to predefine every possible action, can dynamically generate new actions/tools. |
| 23 | + - Emergent reasoning: Ideal for multi-step problems or dynamic logic. |
| 24 | + - Limitations: |
| 25 | + - Risk of errors: Must handle syntax errors and runtime exceptions. |
| 26 | + - Less predictable: More prone to unexpected or unsafe outputs. |
| 27 | + - Requires secure execution environment. |
| 28 | + |
| 29 | +- [`ToolCallingAgent`] writes tool calls as structured JSON. |
| 30 | + - This is the common format used in many frameworks (e.g., the OpenAI API), allowing for structured tool interactions without code execution. |
| 31 | + - Tools are defined with a JSON schema: name, description, parameter types, etc. |
| 32 | + - Example of tool call: |
| 33 | + ```json |
| 34 | + { |
| 35 | + "tool_call": { |
| 36 | + "name": "search_docs", |
| 37 | + "arguments": { |
| 38 | + "query": "What is the capital of France?" |
| 39 | + } |
| 40 | + } |
| 41 | + } |
| 42 | + ``` |
| 43 | + - Strengths: |
| 44 | + - Reliable: Less prone to hallucination, outputs are structured and validated. |
| 45 | + - Safe: Arguments are strictly validated, no risk of arbitrary code running. |
| 46 | + - Interoperable: Easy to map to external APIs or services. |
| 47 | + - Limitations: |
| 48 | + - Low expressivity: Can't easily combine or transform results dynamically, or perform complex logic or control flow. |
| 49 | + - Inflexible: Must define all possible actions in advance, limited to predefined tools. |
| 50 | + - No code synthesis: Limited to tool capabilities. |
| 51 | + |
| 52 | +When to use which agent type: |
| 53 | +- Use [`CodeAgent`] when: |
| 54 | + - You need reasoning, chaining, or dynamic composition. |
| 55 | + - Tools are functions that can be combined (e.g., parsing + math + querying). |
| 56 | + - Your agent is a problem solver or programmer. |
| 57 | + |
| 58 | +- Use [`ToolCallingAgent`] when: |
| 59 | + - You have simple, atomic tools (e.g., call an API, fetch a document). |
| 60 | + - You want high reliability and clear validation. |
| 61 | + - Your agent is like a dispatcher or controller. |
| 62 | + |
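To make the contrast between the two action styles concrete, here is a minimal, self-contained sketch. It does not use `smolagents` itself: `search_docs`, the `TOOLS` registry, and `dispatch_tool_call` are hypothetical names invented for illustration, and the plain `exec` call only stands in for whatever interpreter or sandbox actually runs the generated code.

```python
import json


# Hypothetical tool used only for illustration; not part of smolagents.
def search_docs(query: str) -> str:
    return f"results for: {query}"


# Hypothetical registry: tool name -> (callable, expected argument types).
TOOLS = {"search_docs": (search_docs, {"query": str})}


def dispatch_tool_call(raw: str) -> str:
    """Validate a JSON tool call against the registry, then run the tool."""
    call = json.loads(raw)["tool_call"]
    name, arguments = call["name"], call["arguments"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    func, schema = TOOLS[name]
    if set(arguments) != set(schema):
        raise ValueError(f"expected arguments {sorted(schema)}, got {sorted(arguments)}")
    for key, expected_type in schema.items():
        if not isinstance(arguments[key], expected_type):
            raise TypeError(f"argument {key!r} must be {expected_type.__name__}")
    return func(**arguments)


# Structured style: one validated call per action.
json_action = '{"tool_call": {"name": "search_docs", "arguments": {"query": "capital of France"}}}'
structured_result = dispatch_tool_call(json_action)

# Code style: the action is a snippet that can chain calls and post-process results.
code_action = 'result = search_docs("capital of France").upper()'
namespace = {"search_docs": search_docs}
exec(code_action, namespace)  # plain exec for illustration only; this is not a sandbox
code_result = namespace["result"]
```

The structured path rejects unknown tools or badly typed arguments before anything runs, while the code path can transform the tool output in the same step, which is the trade-off described in the lists above.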
| 63 | +## CodeAgent |
| 64 | + |
| 65 | +[`CodeAgent`] generates Python code snippets to perform actions and solve tasks. |
| 66 | + |
| 67 | +By default, the Python code execution is done in your local environment. |
| 68 | +This should be safe because the only functions that can be called are the tools you provided (especially if they are all tools from Hugging Face) and a set of predefined safe functions, such as `print` or functions from the `math` module, so what can be executed is already limited. |
| 69 | + |
| 70 | +The Python interpreter also doesn't allow imports by default outside of a safe list, so all the most obvious attacks shouldn't be an issue. |
| 71 | +You can authorize additional imports by passing the module names as a list of strings in the `additional_authorized_imports` argument when initializing your [`CodeAgent`]: |
| 72 | + |
| 73 | +```py |
| 74 | +model = InferenceClientModel() |
| 75 | +agent = CodeAgent(tools=[], model=model, additional_authorized_imports=['requests', 'bs4']) |
| 76 | +agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?") |
| 77 | +``` |
| 78 | + |
| 79 | +Additionally, as an extra security layer, access to submodules is forbidden by default, unless explicitly authorized within the import list. |
| 80 | +For instance, to access the `numpy.random` submodule, you need to add `'numpy.random'` to the `additional_authorized_imports` list. |
| 81 | +This could also be authorized by using `numpy.*`, which will allow `numpy` as well as any subpackage like `numpy.random` and its own subpackages. |
| 82 | + |
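The submodule rule can be pictured with a small matching function. This is only a simplified model of the behavior described above, not the check `smolagents` actually performs; `is_import_authorized` is a hypothetical helper written for this illustration.

```python
def is_import_authorized(module_name: str, authorized: list[str]) -> bool:
    """Return True if `module_name` is covered by an entry in `authorized`.

    Simplified model: an exact entry allows only that module, while an
    entry ending in ".*" allows the base package and all of its subpackages.
    """
    for pattern in authorized:
        if pattern.endswith(".*"):
            base = pattern[:-2]
            if module_name == base or module_name.startswith(base + "."):
                return True
        elif module_name == pattern:
            return True
    return False


exact_list = ["numpy", "numpy.random"]
wildcard_list = ["numpy.*"]

parent_only = is_import_authorized("numpy.random", ["numpy"])           # parent alone does not cover the submodule
listed_submodule = is_import_authorized("numpy.random", exact_list)     # explicitly listed
unlisted_submodule = is_import_authorized("numpy.linalg", exact_list)   # not listed
nested_wildcard = is_import_authorized("numpy.random.mtrand", wildcard_list)
base_wildcard = is_import_authorized("numpy", wildcard_list)
```

Under this model, authorizing `numpy` alone leaves `numpy.random` blocked, whereas `numpy.*` covers the base package and every nested subpackage.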
| 83 | +> [!WARNING] |
| 84 | +> The LLM can generate arbitrary code that will then be executed: do not add any unsafe imports! |
| 85 | +
|
| 86 | +Execution stops as soon as the generated code attempts an illegal operation or raises a regular Python error. |
| 87 | + |
| 88 | +You can also use the [E2B code executor](https://e2b.dev/docs#what-is-e2-b) or Docker instead of a local Python interpreter. For E2B, first [set the `E2B_API_KEY` environment variable](https://e2b.dev/dashboard?tab=keys) and then pass `executor_type="e2b"` when initializing the agent. For Docker, pass `executor_type="docker"` instead. |
| 89 | + |
| 90 | + |
| 91 | +> [!TIP] |
| 92 | +> Learn more about code execution [in this tutorial](tutorials/secure_code_execution). |
| 93 | +
|
| 94 | +## ToolCallingAgent |
| 95 | + |
| 96 | +[`ToolCallingAgent`] outputs JSON tool calls, the common format used in many frameworks (e.g., the OpenAI API), allowing for structured tool interactions without code execution. |
| 97 | + |
| 98 | +It works in much the same way as [`CodeAgent`], but without `additional_authorized_imports`, since it doesn't execute code: |
| 99 | + |
| 100 | +```py |
| 101 | +from smolagents import ToolCallingAgent |
| 102 | + |
| 103 | +agent = ToolCallingAgent(tools=[], model=model) |
| 104 | +agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?") |
| 105 | +``` |
| 106 | + |
7 | 107 | ## Building your agent |
8 | 108 |
|
9 | 109 | To initialize a minimal agent, you need at least these two arguments: |
@@ -260,46 +360,6 @@ This validation mechanism enables: |
260 | 360 | - Implementing domain-specific validation rules |
261 | 361 | - Creating more robust agents that validate their own outputs |
262 | 362 |
|
263 | | -## CodeAgent and ToolCallingAgent |
264 | | - |
265 | | -`smolagents` comes with two agent classes: [`CodeAgent`] and [`ToolCallingAgent`]. `CodeAgent` is the default and writes Python code snippets that are then executed, while `ToolCallingAgent` outputs JSON tool calls. Both share the same interface so you can pick whichever style you prefer. |
266 | | - |
267 | | -By default, the execution is done in your local environment. |
268 | | -This should be safe because the only functions that can be called are the tools you provided (especially if it's only tools by Hugging Face) and a set of predefined safe functions like `print` or functions from the `math` module, so you're already limited in what can be executed. |
269 | | - |
270 | | -The Python interpreter also doesn't allow imports by default outside of a safe list, so all the most obvious attacks shouldn't be an issue. |
271 | | -You can authorize additional imports by passing the authorized modules as a list of strings in argument `additional_authorized_imports` upon initialization of your [`CodeAgent`]: |
272 | | - |
273 | | -```py |
274 | | -model = InferenceClientModel() |
275 | | -agent = CodeAgent(tools=[], model=model, additional_authorized_imports=['requests', 'bs4']) |
276 | | -agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?") |
277 | | -``` |
278 | | - |
279 | | -Additionally, as an extra security layer, access to submodule is forbidden by default, unless explicitly authorized within the import list. |
280 | | -For instance, to access the `numpy.random` submodule, you need to add `'numpy.random'` to the `additional_authorized_imports` list. |
281 | | -This could also be authorized by using `numpy.*`, which will allow `numpy` as well as any subpackage like `numpy.random` and its own subpackages. |
282 | | - |
283 | | -> [!WARNING] |
284 | | -> The LLM can generate arbitrary code that will then be executed: do not add any unsafe imports! |
285 | | -
|
286 | | -The execution will stop at any code trying to perform an illegal operation or if there is a regular Python error with the code generated by the agent. |
287 | | - |
288 | | -You can also use [E2B code executor](https://e2b.dev/docs#what-is-e2-b) or Docker instead of a local Python interpreter. For E2B, first [set the `E2B_API_KEY` environment variable](https://e2b.dev/dashboard?tab=keys) and then pass `executor_type="e2b"` upon agent initialization. For Docker, pass `executor_type="docker"` during initialization. |
289 | | - |
290 | | - |
291 | | -> [!TIP] |
292 | | -> Learn more about code execution [in this tutorial](tutorials/secure_code_execution). |
293 | | -
|
294 | | -We also support the widely-used way of writing actions as JSON-like blobs: this is [`ToolCallingAgent`], it works much in the same way like [`CodeAgent`], of course without `additional_authorized_imports` since it doesn't execute code: |
295 | | - |
296 | | -```py |
297 | | -from smolagents import ToolCallingAgent |
298 | | - |
299 | | -agent = ToolCallingAgent(tools=[], model=model) |
300 | | -agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?") |
301 | | -``` |
302 | | - |
303 | 363 | ## Inspecting an agent run |
304 | 364 |
|
305 | 365 | Here are a few useful attributes to inspect what happened after a run: |
|