
fix: prevent tools from being discarded when response_model is set (#4697)#4699

Open
devin-ai-integration[bot] wants to merge 3 commits into main from
devin/1772627303-fix-tools-discarded-with-output-pydantic

Conversation


@devin-ai-integration devin-ai-integration bot commented Mar 4, 2026

fix: prevent tools from being discarded when response_model is set (#4697)

Summary

Fixes #4697. When both native function calling (tools) and output_pydantic (response_model) are enabled, tools were being silently discarded through three mechanisms:

  1. InternalInstructor early return: When response_model was set, InternalInstructor intercepted the LLM call and created its own completion that ignored the tools parameter entirely.
  2. response_model forwarded to litellm: Even after skipping InternalInstructor, the code still passed response_model to litellm.completion(**params). Litellm uses instructor internally when response_model is provided, which also overrides the tools parameter.
  3. Streaming handler dropped tool_calls: _handle_streaming_response lacked an explicit path for tool_calls present + available_functions absent, causing it to return an empty string instead of the tool_calls (unlike the non-streaming handler which already had this path).

Changes

  • Added a has_tools guard to the three response handlers (_handle_non_streaming_response, _ahandle_non_streaming_response, _handle_streaming_response). When tools are present, InternalInstructor is skipped.
  • When tools are present, response_model is not added to params before calling litellm.completion / litellm.acompletion, preventing litellm's own instructor from overriding tools.
  • Added an explicit if tool_calls and not available_functions: return tool_calls path in _handle_streaming_response, mirroring the equivalent path in _handle_non_streaming_response.
  • InternalInstructor is still used when no tools are present (backward compatible).
  • Eight new unit tests cover the fix across sync/async, tools/no-tools, and response_model-not-forwarded scenarios.
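The branching described above can be sketched as a standalone function. This is a minimal illustration, not the actual crewai handler: `route_structured_output` and the returned route names are hypothetical, while `has_tools` and the guard condition mirror the PR; `params`, `response_model`, and `is_litellm` stand in for the handler's real arguments and attributes.

```python
def route_structured_output(params, response_model, is_litellm):
    """Return which completion path a handler should take (illustrative)."""
    has_tools = bool(params.get("tools"))
    if response_model and is_litellm and not has_tools:
        # No tools: InternalInstructor can safely drive the call.
        return "internal_instructor"
    if response_model and has_tools:
        # Tools present: call litellm directly and withhold response_model,
        # otherwise litellm's own instructor overrides the tools parameter.
        params.pop("response_model", None)
        return "litellm_without_response_model"
    return "litellm"
```

The key point is that the decision is made per call from the configured `params`, which is also what the High Severity review finding below turns on.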

Known limitation (pre-existing, not addressed here)

_ahandle_streaming_response has no InternalInstructor / response_model handling at all — neither before nor after this PR. This is a separate inconsistency that should be addressed in a follow-up.

Review & Testing Checklist for Human

  • Verify end-to-end behavior: The unit tests mock litellm.completion entirely, so they validate the LLM layer's branching logic but not real litellm behavior. Manually verify that an agent with both tools and output_pydantic can (1) call its tools, and (2) still return structured output in the expected Pydantic format after tool execution completes.
  • Check structured output enforcement when LLM returns text with tools present: When tools exist but the LLM returns plain text (no tool_calls), both InternalInstructor and response_model forwarding are now bypassed. Confirm that CrewAgentExecutor or the converter layer still enforces structured output on the final answer in this path.
  • Streaming handler new early-return: The new if tool_calls and not available_functions: return tool_calls path (line ~973) runs before the existing if not tool_calls or not available_functions: block. Verify this doesn't break any caller that expects a string response from the streaming handler when tool_calls are present.

Notes

…4697)

When both native function calling (tools) and output_pydantic (response_model)
are enabled, InternalInstructor was intercepting the LLM call and creating its
own completion without passing tools. This caused agent tools to be silently
discarded.

The fix checks whether tools are present in params before routing to
InternalInstructor. When tools exist, the normal litellm.completion path is
used so the LLM can see and call the agent's tools. InternalInstructor is
still used when no tools are present (backward compatible).

Applied to all three response handlers:
- _handle_non_streaming_response
- _ahandle_non_streaming_response
- _handle_streaming_response

Co-Authored-By: João <joao@crewai.com>


🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

```python
# Only use InternalInstructor for structured output when there
# are no tools — otherwise tools would be silently discarded.
has_tools = bool(params.get("tools"))
if response_model and self.is_litellm and not has_tools:
```


I already tried this locally and it's not working

Contributor Author


Thanks for flagging this @Killian-fal. The initial commit only skipped InternalInstructor but still passed response_model to litellm.completion, which litellm uses internally with instructor — effectively still overriding the tools. I've pushed a follow-up commit that also prevents response_model from being forwarded to litellm when tools are present. Could you try the latest version of this branch?

…resent

Addresses review feedback: when tools are present, response_model must not
be passed to litellm.completion/acompletion either, because litellm uses
instructor internally which would override the tools parameter.

Added 2 more tests that explicitly verify response_model is NOT in the
kwargs sent to litellm when tools are present.

Co-Authored-By: João <joao@crewai.com>
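The core assertion of those two tests can be sketched in isolation. This is a self-contained illustration, not the actual test code: `call_llm` is a hypothetical stand-in for the fixed handler's litellm branch, and the `MagicMock` plays the role of `litellm.completion`.

```python
from unittest.mock import MagicMock

def call_llm(completion_fn, params, response_model=None):
    """Simplified stand-in for the fixed handler's litellm branch."""
    if response_model and not params.get("tools"):
        params["response_model"] = response_model
    return completion_fn(**params)

fake_completion = MagicMock(return_value="ok")
params = {"model": "gpt-4o", "tools": [{"type": "function"}]}
call_llm(fake_completion, params, response_model=object())

# tools survive; response_model is withheld from the completion kwargs
_, kwargs = fake_completion.call_args
assert "response_model" not in kwargs
assert kwargs["tools"]
```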
```python
# Only use InternalInstructor for structured output when there
# are no tools — otherwise tools would be silently discarded.
has_tools = bool(params.get("tools"))
if response_model and self.is_litellm and not has_tools:
```

Async streaming handler missing the has_tools guard for InternalInstructor

Medium Severity

The fix is applied to _handle_streaming_response, _handle_non_streaming_response, and _ahandle_non_streaming_response, but _ahandle_streaming_response (line ~1416) also accepts response_model as a parameter yet has no response_model/InternalInstructor handling at all — neither before nor after this PR. When async streaming is used with response_model set, structured output conversion is silently skipped regardless of whether tools are present. This is an inconsistency across the four handler methods that may cause unexpected behavior in the async streaming + response_model path.


Contributor Author


This is a pre-existing inconsistency — _ahandle_streaming_response has never had InternalInstructor/response_model handling, neither before nor after this PR. The lack of structured output support in async streaming is a separate gap that should be addressed in a follow-up PR to keep this fix scoped to the tools-being-discarded issue (#4697).

… None

Mirrors _handle_non_streaming_response's explicit path: when tool_calls
exist but available_functions is None, return the tool_calls to the caller
instead of returning an empty full_response string.

Co-Authored-By: João <joao@crewai.com>

@cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


```python
# Skip InternalInstructor when tools are present so the LLM can
# see and call the agent's tools before returning structured output.
has_tools = bool(params.get("tools"))
if response_model and self.is_litellm and not has_tools:
```

Structured output silently lost when tools present but unused

High Severity

The has_tools guard checks whether tools are configured in params, not whether the LLM actually returned tool_calls. When tools are present but the LLM responds with plain text (no tool_calls), both InternalInstructor and response_model passthrough to litellm are skipped, so response_model is silently ignored. The caller receives unstructured text instead of the expected Pydantic-formatted output. This affects every LLM call after tool execution completes — the final-answer call still has tools in params, so structured output conversion never happens. This likely explains the reviewer's report that it's "not working" in local testing.


```python
# so the caller (e.g., executor) can handle tool execution.
# This mirrors _handle_non_streaming_response's explicit path.
if tool_calls and not available_functions:
    return tool_calls
```

Streaming handler inconsistently prioritizes tool_calls over text response

Medium Severity

The new early return at lines 973–974 returns tool_calls when tool_calls and not available_functions, regardless of whether full_response contains text. The non-streaming handler (line 1254) does the opposite — it returns text_response first when truthy, and only returns tool_calls at line 1266 if text is empty. This creates inconsistent behavior between streaming and non-streaming modes for the same inputs.
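The ordering difference this finding describes can be shown with two tiny return policies. These are hypothetical reductions of the handlers, not the real methods; only the `full_response` / `tool_calls` / `available_functions` names follow the PR.

```python
def streaming_result(full_response, tool_calls, available_functions=None):
    # New streaming behavior: tool_calls win even when text exists.
    if tool_calls and not available_functions:
        return tool_calls
    return full_response

def non_streaming_result(text_response, tool_calls, available_functions=None):
    # Non-streaming behavior: text wins; tool_calls only when text is empty.
    if text_response:
        return text_response
    if tool_calls and not available_functions:
        return tool_calls
    return text_response

# Same inputs, different outputs: the inconsistency Bugbot flags.
assert streaming_result("partial text", ["call"]) == ["call"]
assert non_streaming_result("partial text", ["call"]) == "partial text"
```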



Development

Successfully merging this pull request may close these issues.

[BUG] When supports_function_calling() is True, only the output_pydantic model is sent to LiteLLM (the custom agent tools are discard)
