
Fix LLM callback isolation without serializing requests#4252

Open
VedantMadane wants to merge 3 commits into crewAIInc:main from VedantMadane:fix/llm-callbacks-no-global-mutation

Conversation


@VedantMadane VedantMadane commented Jan 19, 2026

This is a follow-up to #4218 (auto-closed by bot) addressing the same race in LLM callback handling without holding a global lock across the network call.

What changed

  • Stop mutating LiteLLM global callback lists for per-request callbacks.
  • Pass callbacks via the request params ("callbacks") and continue to invoke token usage callbacks from CrewAI response handlers.
  • Make test_llm_callback_replacement deterministic by mocking litellm.completion (removing the sleep-based timing and the resulting heisenbug).

Why

The approach in #4218 used a class-level lock held across the entire LLM request, which can serialize all concurrent agent calls. This change preserves concurrency while still ensuring callback isolation.
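The rejected lock-around-the-call pattern can be demonstrated with a small stub (names and latency are invented for illustration; this is not the #4218 code):

```python
import threading
import time

_callback_lock = threading.Lock()
GLOBAL_CALLBACKS: list = []  # stands in for litellm.success_callback

def fake_completion(**params):
    time.sleep(0.05)  # simulated network round-trip
    return {"callbacks_seen": list(GLOBAL_CALLBACKS)}

def call_with_lock(params, callbacks):
    # The lock is held for the entire "network" call, so two concurrent
    # agents cannot overlap their requests at all.
    with _callback_lock:
        GLOBAL_CALLBACKS[:] = callbacks
        try:
            return fake_completion(**params)
        finally:
            GLOBAL_CALLBACKS[:] = []
```

With this pattern, two threads calling `call_with_lock` take roughly the sum of both latencies instead of running in parallel, which is the serialization cost this PR avoids.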

Fixes #4214.


Note

Medium Risk
Touches the core LLM.call/LLM.acall execution path by changing how callbacks are wired into LiteLLM, which could affect observability/token-usage integrations and streaming/non-streaming parity under concurrency.

Overview
Stops mutating LiteLLM’s global callback state for per-request handlers, and instead computes effective_callbacks and passes them via params["callbacks"] on each completion/acompletion call to avoid cross-request races.

Updates tests to make callback isolation deterministic: rewrites test_llm_callback_replacement to mock litellm.completion (removing the sleep/flakiness) and adds a new threaded concurrency test asserting each request receives only its own callback and token usage is not mixed.
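The shape of such a threaded isolation test might look like the following, with `litellm.completion` replaced by a local stub (all names here are hypothetical, not the PR's actual test code):

```python
import threading

class RecordingCallback:
    def __init__(self) -> None:
        self.events: list = []

    def on_success(self, response) -> None:
        self.events.append(response)

def fake_completion(**params):
    # Stub for the mocked litellm.completion: it invokes only the
    # callbacks carried in this request's own params.
    response = {"id": params["request_id"]}
    for cb in params.get("callbacks", []):
        cb.on_success(response)
    return response

callbacks = [RecordingCallback() for _ in range(4)]
threads = [
    threading.Thread(
        target=fake_completion,
        kwargs={"request_id": i, "callbacks": [callbacks[i]]},
    )
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each callback saw exactly its own request: no cross-contamination.
assert all(cb.events == [{"id": i}] for i, cb in enumerate(callbacks))
```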

Written by Cursor Bugbot for commit 8dfc1f4.

@VedantMadane (Author)

Not covered in this PR description:

  1. The lock-scoping alternative (save the previous global callbacks, set the new ones, perform the request, then restore them) and why we avoided it.
  2. Context-local callback isolation using contextvars or thread-local dispatch.
  3. A true concurrency regression test (multi-threaded or async) that proves no cross-contamination under parallel calls.

If you prefer, I can add a follow-up commit that documents these options or adds a concurrency-focused test.

@VedantMadane VedantMadane force-pushed the fix/llm-callbacks-no-global-mutation branch from 31fdc55 to 35483b6 on February 10, 2026 at 11:02

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.



set_callbacks is now dead code, never called

Low Severity

The set_callbacks static method mutates LiteLLM's global callback lists, which is exactly the pattern this PR removes. Its only call site (self.set_callbacks(callbacks or []) in __init__) was deleted, and a grep confirms no other callers exist anywhere in the codebase or tests. Leaving it around risks someone re-introducing the global-mutation pattern unintentionally.




Development

Successfully merging this pull request may close these issues.

[BUG] Hidden race condition in LLM callback system causing test failures

1 participant