
@lcfyi lcfyi commented Oct 8, 2025

Title

Fix parallel tool calls in the Anthropic passthrough adapter

Relevant issues

Fixes #15307

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit (will rely on CI)
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes

This PR adds support for parallel tool calling for models that can stream multiple back-to-back tool calls. We do this by adding a fall-through case to the _should_start_new_content_block check: if the current block is a tool_use block and the incoming delta carries a tool name, the delta must belong to a new tool call, so a new content block is started.

This matches what OpenAI and Anthropic return.

OpenAI's example:

[{"index": 0, "id": "call_DdmO9pD3xa9XTPNJ32zg2hcA", "function": {"arguments": "", "name": "get_weather"}, "type": "function"}]
[{"index": 0, "id": null, "function": {"arguments": "{\"", "name": null}, "type": null}]
[{"index": 0, "id": null, "function": {"arguments": "location", "name": null}, "type": null}]
[{"index": 0, "id": null, "function": {"arguments": "\":\"", "name": null}, "type": null}]
[{"index": 0, "id": null, "function": {"arguments": "Paris", "name": null}, "type": null}]
[{"index": 0, "id": null, "function": {"arguments": ",", "name": null}, "type": null}]
[{"index": 0, "id": null, "function": {"arguments": " France", "name": null}, "type": null}]
[{"index": 0, "id": null, "function": {"arguments": "\"}", "name": null}, "type": null}]

Several of these fields (id, function.name, and type) are set only on the first delta of each tool call; the following deltas carry null for them and only append to function.arguments.
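
To illustrate why the name is a usable boundary marker, here is a minimal sketch that groups OpenAI-style deltas into complete tool calls. The function name `group_tool_call_deltas` and the output shape are hypothetical and are not part of LiteLLM; the sketch deliberately keys on function.name rather than on index, mirroring the reasoning in this PR.

```python
import json
from typing import Any, Dict, List


def group_tool_call_deltas(deltas: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group streamed tool call deltas into complete tool calls.

    Hypothetical helper: a delta with a non-empty function.name is treated
    as the first delta of a new tool call; every other delta only appends
    to the arguments of the most recent call.
    """
    calls: List[Dict[str, Any]] = []
    for delta in deltas:
        function = delta.get("function") or {}
        if function.get("name"):
            # First delta of a tool call: id and name are set here only.
            calls.append(
                {"id": delta.get("id"), "name": function["name"], "arguments": ""}
            )
        if calls:
            # Argument fragments arrive as partial JSON strings.
            calls[-1]["arguments"] += function.get("arguments") or ""
    for call in calls:
        # Parse only once the full argument string has been assembled.
        call["input"] = json.loads(call["arguments"] or "{}")
    return calls
```

Feeding the deltas from the example above into this helper should yield a single get_weather call whose input is the location "Paris, France"; a second named delta on the same stream would open a second call.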

Anthropic's example:

event: content_block_start
data: {"type":"content_block_start","index":1,"content_block":{"type":"tool_use","id":"toolu_01T1x1fJ34qAmk2tNTrN7Up6","name":"get_weather","input":{}}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":""}}

So whenever we get a name, it must mark the start of a new tool call.
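
The check described above can be sketched roughly as follows. This is a standalone illustration under assumptions, not the actual `_should_start_new_content_block` from the adapter: the free-function form, the parameter names, and the "block type changed" rule are hypothetical stand-ins for whatever the existing method already does.

```python
from typing import Optional


def should_start_new_content_block(
    current_block_type: str,
    incoming_block_type: str,
    incoming_tool_name: Optional[str],
) -> bool:
    """Decide whether an incoming delta opens a new content block.

    Hypothetical signature for illustration only.
    """
    # Assumed pre-existing rule: a change of block type opens a new block.
    if incoming_block_type != current_block_type:
        return True
    # Fall-through added by this PR: inside a tool_use block, a delta that
    # carries a tool name can only be the first delta of another tool call.
    if current_block_type == "tool_use" and incoming_tool_name:
        return True
    # Otherwise the delta continues the current block.
    return False
```

With this rule, two get_weather calls streamed back to back produce two tool_use blocks, while argument-only deltas (name is null) stay in the block that is already open.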

Test plan

See the repro script and full configuration in the original issue. With the changes, we now get:

Starting streaming conversation with tool calls...

RawMessageStartEvent(message=Message(id='msg_eaa56e78-d5bd-40db-8f9a-32ca1274025d', content=[], model='converse/us.anthropic.claude-sonnet-4-20250514-v1:0', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(cache_creation=None, cache_creation_input_tokens=None, cache_read_input_tokens=None, input_tokens=0, output_tokens=0, server_tool_use=None, service_tier=None)), type='message_start')
RawContentBlockStartEvent(content_block=TextBlock(citations=None, text='', type='text'), index=0, type='content_block_start')

💬 Assistant response:
ContentBlockStopEvent(index=0, type='content_block_stop', content_block=TextBlock(citations=None, text='', type='text'))
RawContentBlockStartEvent(content_block=ToolUseBlock(id='tooluse_-ltZL7ykRqaZWM8m4_nZvA', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')

🔧 Starting tool call: get_weather
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='{"location', type='input_json_delta'), index=1, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='{"location', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='":', type='input_json_delta'), index=1, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='":', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json=' "San Franci', type='input_json_delta'), index=1, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json=' "San Franci', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='sco, CA', type='input_json_delta'), index=1, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='sco, CA', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='"}', type='input_json_delta'), index=1, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='"}', snapshot={'location': 'San Francisco, CA'})
ContentBlockStopEvent(index=1, type='content_block_stop', content_block=ToolUseBlock(id='tooluse_-ltZL7ykRqaZWM8m4_nZvA', input={'location': 'San Francisco, CA'}, name='get_weather', type='tool_use'))

  Input: {'location': 'San Francisco, CA'}
RawContentBlockStartEvent(content_block=ToolUseBlock(id='tooluse_E6hoCX6yR4GqYJjRr4v0zw', input={}, name='get_weather', type='tool_use'), index=2, type='content_block_start')

🔧 Starting tool call: get_weather
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='', type='input_json_delta'), index=2, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='', type='input_json_delta'), index=2, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='{"locat', type='input_json_delta'), index=2, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='{"locat', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='ion"', type='input_json_delta'), index=2, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='ion"', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json=': "Ne', type='input_json_delta'), index=2, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json=': "Ne', snapshot={})
RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='w York, NY"}', type='input_json_delta'), index=2, type='content_block_delta')
InputJsonEvent(type='input_json', partial_json='w York, NY"}', snapshot={'location': 'New York, NY'})
ContentBlockStopEvent(index=2, type='content_block_stop', content_block=ToolUseBlock(id='tooluse_E6hoCX6yR4GqYJjRr4v0zw', input={'location': 'New York, NY'}, name='get_weather', type='tool_use'))

  Input: {'location': 'New York, NY'}
RawMessageDeltaEvent(delta=Delta(stop_reason='tool_use', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(cache_creation_input_tokens=None, cache_read_input_tokens=None, input_tokens=426, output_tokens=95, server_tool_use=None))
MessageStopEvent(type='message_stop', message=Message(id='msg_eaa56e78-d5bd-40db-8f9a-32ca1274025d', content=[TextBlock(citations=None, text='', type='text'), ToolUseBlock(id='tooluse_-ltZL7ykRqaZWM8m4_nZvA', input={'location': 'San Francisco, CA'}, name='get_weather', type='tool_use'), ToolUseBlock(id='tooluse_E6hoCX6yR4GqYJjRr4v0zw', input={'location': 'New York, NY'}, name='get_weather', type='tool_use')], model='converse/us.anthropic.claude-sonnet-4-20250514-v1:0', role='assistant', stop_reason='tool_use', stop_sequence=None, type='message', usage=Usage(cache_creation=None, cache_creation_input_tokens=None, cache_read_input_tokens=None, input_tokens=426, output_tokens=95, server_tool_use=None, service_tier=None)))

So we're now parsing multiple back-to-back tool calls properly.
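
For anyone who wants to assert this shape rather than read the log, here is a small sketch of a checker. It works on plain dicts with the same type and index fields as the events above; the helper name `check_block_events` is hypothetical, and it is not the test added in this PR.

```python
from typing import Any, Dict, List, Optional


def check_block_events(events: List[Dict[str, Any]]) -> List[int]:
    """Return the content block indexes in order of appearance.

    Raises ValueError if blocks overlap, if indexes do not increase, or if
    a delta or stop event refers to a block that is not currently open.
    """
    open_index: Optional[int] = None
    seen: List[int] = []
    for event in events:
        kind = event["type"]
        if kind == "content_block_start":
            if open_index is not None:
                raise ValueError(f"block {open_index} was never stopped")
            if seen and event["index"] <= seen[-1]:
                raise ValueError(f"index {event['index']} is not increasing")
            open_index = event["index"]
            seen.append(open_index)
        elif kind in ("content_block_delta", "content_block_stop"):
            if event["index"] != open_index:
                raise ValueError(f"{kind} for block {event['index']} which is not open")
            if kind == "content_block_stop":
                open_index = None
        # Other event types (message_start, message_delta, ...) are ignored.
    if open_index is not None:
        raise ValueError(f"block {open_index} was never stopped")
    return seen
```

For the stream above, the expected result is three blocks (the empty text block at index 0, then one tool_use block per get_weather call), each started and stopped before the next one begins.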


vercel bot commented Oct 8, 2025

@lcfyi is attempting to deploy a commit to the CLERKIEAI Team on Vercel.

A member of the Team first needs to authorize it.

# TODO: for future contributors: if the initial content_block_start
# respects the upstream's starting chunk, the initial empty text block
# should be removed (and this test should be updated accordingly)
# ---------------------------------------------------------------------
Contributor Author

I was debating fixing this in this PR, but wanted to reduce the scope since this is technically harmless for the client (it just ends up being an empty text block).

Some clients may hard-fail on an empty text block, so it should be fixed at some point, but in a separate PR.

@krrishdholakia krrishdholakia merged commit e2e0cdd into BerriAI:main Oct 9, 2025
4 of 6 checks passed

Development

Successfully merging this pull request may close these issues.

[Bug]: Converse → /v1/messages streaming doesn't handle parallel tool calls with Claude models
