Describe the bug
_apply_tool_execution_decisions (in haystack/human_in_the_loop/strategies.py) is supposed to refuse to proceed when it can't unambiguously match a ToolExecutionDecision to a tool call — specifically, when multiple tool calls share the same tool name but have no tool_call_id. There's already a test for this (test_two_teds_same_name_no_ids), and it works only when every tool call/decision in the batch lacks an id.
The guard is:
decision_by_id = {d.tool_call_id: d for d in tool_execution_decisions if d.tool_call_id}
decision_by_name = {d.tool_name: d for d in tool_execution_decisions if d.tool_name}
if not decision_by_id and len(decision_by_name) < len(tool_execution_decisions):
raise ValueError(...)
The problem: not decision_by_id is checked across the entire batch. If any unrelated tool call elsewhere in the same batch happens to carry an id, decision_by_id becomes non-empty, and the whole guard is skipped — even though the other tool calls (the ones with duplicate names and no ids) are just as ambiguous as before. Once skipped, decision_by_name.get(tc.tool_name) returns whichever decision happens to be last in the dict, and that decision is silently applied to the wrong tool call. This is silent data corruption, not just a missed validation — there's no error, no warning, just a tool getting invoked with someone else's arguments.
Error message
No error is raised — that's the bug. Expected: ValueError: ToolExecutionDecisions are missing tool_call_id fields and there are multiple tool calls with the same name.... Actual: silent success with corrupted arguments.
Expected behavior
The ambiguity check should fire whenever there's a duplicate tool name among the calls/decisions that lack an id, regardless of whether other, unrelated tool calls in the same batch happen to have ids.
To Reproduce
from haystack.dataclasses import ChatMessage, ToolCall
from haystack.human_in_the_loop import ToolExecutionDecision
from haystack.human_in_the_loop.strategies import _apply_tool_execution_decisions
message_with_tool_calls = ChatMessage.from_assistant(
tool_calls=[
ToolCall(tool_name="add_database_tool", arguments={"name": "Malte"}), # no id
ToolCall(tool_name="add_database_tool", arguments={"name": "Milos"}), # no id, duplicate name
ToolCall(tool_name="other_tool", arguments={}, id="xyz"), # has id, unrelated
]
)
rejection_messages, new_tool_call_messages = _apply_tool_execution_decisions(
tool_call_messages=[message_with_tool_calls],
tool_execution_decisions=[
ToolExecutionDecision(tool_name="add_database_tool", execute=True, final_tool_params={"name": "Malte"}),
ToolExecutionDecision(tool_name="add_database_tool", execute=True, final_tool_params={"name": "Milos"}),
ToolExecutionDecision(tool_name="other_tool", execute=True, tool_call_id="xyz", final_tool_params={}),
],
)
for m in new_tool_call_messages:
print(m.tool_calls)
# Both `add_database_tool` calls end up with arguments {'name': 'Milos'}.
# No ValueError is raised. The "Malte" call is silently lost/overwritten.
Additional context
This is the same validation already covered by test_two_teds_same_name_no_ids in test/human_in_the_loop/test_strategies.py, but that test only has tool calls without ids at all. The bug only shows up with a mix of id-bearing and id-less calls in the same batch, which isn't currently tested.
I'd like to fix the guard to check ambiguity only among the id-less subset, and add a regression test for the mixed case, then submit a PR.
System:
- OS: Windows
- Haystack version: main
Describe the bug
_apply_tool_execution_decisions(inhaystack/human_in_the_loop/strategies.py) is supposed to refuse to proceed when it can't unambiguously match aToolExecutionDecisionto a tool call — specifically, when multiple tool calls share the same tool name but have notool_call_id. There's already a test for this (test_two_teds_same_name_no_ids), and it works only when every tool call/decision in the batch lacks an id.The guard is:
The problem:
not decision_by_idis checked across the entire batch. If any unrelated tool call elsewhere in the same batch happens to carry an id,decision_by_idbecomes non-empty, and the whole guard is skipped — even though the other tool calls (the ones with duplicate names and no ids) are just as ambiguous as before. Once skipped,decision_by_name.get(tc.tool_name)returns whichever decision happens to be last in the dict, and that decision is silently applied to the wrong tool call. This is silent data corruption, not just a missed validation — there's no error, no warning, just a tool getting invoked with someone else's arguments.Error message
No error is raised — that's the bug. Expected:
ValueError: ToolExecutionDecisions are missing tool_call_id fields and there are multiple tool calls with the same name.... Actual: silent success with corrupted arguments.Expected behavior
The ambiguity check should fire whenever there's a duplicate tool name among the calls/decisions that lack an id, regardless of whether other, unrelated tool calls in the same batch happen to have ids.
To Reproduce
Additional context
This is the same validation already covered by
test_two_teds_same_name_no_idsintest/human_in_the_loop/test_strategies.py, but that test only has tool calls without ids at all. The bug only shows up with a mix of id-bearing and id-less calls in the same batch, which isn't currently tested.I'd like to fix the guard to check ambiguity only among the id-less subset, and add a regression test for the mixed case, then submit a PR.
System: