feat: Add Anthropic Thinking and Streaming Support #3070

damithsenanayake · 2025-10-02T03:25:18Z

Summary

This PR adds support for Anthropic's Extended Thinking feature and Interleaved Thinking (beta) for Claude models in the ADK Python SDK. The implementation enables developers to use Claude's reasoning capabilities with thinking block signatures and multi-step tool reasoning, while maintaining full compatibility with existing code.

Key Features

Extended Thinking Support

Converts ADK ThinkingConfig to Anthropic API format
Supports disabled (0) and explicit (>0) budget values
Raises ValueError for unlimited budget (-1) as it's not supported by Claude
Works in both streaming and non-streaming modes
Parses thinking blocks as Part(thought=True, thought_signature=...) in standard GenAI format
Preserves cryptographic signatures from thinking blocks
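
The budget rules above can be sketched as a small standalone function. This is an illustration only, not the PR's code: the helper name to_anthropic_thinking is hypothetical, and returning None for a disabled budget (rather than sending an explicit "disabled" value) is an assumption. The {"type": "enabled", "budget_tokens": N} shape is the format the Anthropic Messages API accepts for its thinking parameter.

```python
from typing import Optional


def to_anthropic_thinking(thinking_budget: Optional[int]) -> Optional[dict]:
    """Map an ADK-style thinking budget to an Anthropic `thinking` parameter.

    Hypothetical helper for illustration; the PR performs its conversion
    inside generate_content_async() and may differ in detail.
    """
    if thinking_budget is None or thinking_budget == 0:
        # No config, or thinking explicitly disabled: send no thinking parameter
        # (assumption: the caller omits the parameter when this returns None).
        return None
    if thinking_budget == -1:
        raise ValueError("Unlimited thinking budget (-1) is not supported by Claude.")
    if thinking_budget < 0:
        raise ValueError(f"Invalid thinking budget: {thinking_budget}")
    # The budget is passed through unchanged; the API enforces its own minimum.
    return {"type": "enabled", "budget_tokens": thinking_budget}
```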

Beta Features Support (e.g., Interleaved Thinking)

Users can pass beta headers directly via extra_headers parameter
Example: extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"}
Supports multiple beta features with comma-separated values
Enables interleaved thinking for multi-step tool reasoning
Thinking blocks automatically preserved in assistant message history
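
For callers that enable more than one beta feature, the comma-separated header value can be built with a small helper. This is a sketch: beta_headers is a hypothetical name and is not part of the PR, which expects the dict to be passed directly.

```python
def beta_headers(*features: str) -> dict[str, str]:
    """Build an extra_headers dict enabling one or more Anthropic beta features.

    Hypothetical convenience helper; the PR itself takes the dict as-is.
    """
    # Drop empty names so a stray "" does not produce a dangling comma.
    names = [name.strip() for name in features if name.strip()]
    if not names:
        return {}
    return {"anthropic-beta": ",".join(names)}
```

The result can be passed as extra_headers=beta_headers("interleaved-thinking-2025-05-14") when constructing the model.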

Improved Streaming

Migrated to AsyncAnthropicVertex for async streaming support
Implements thought suppression pattern to prevent UI fragmentation
Streaming controlled exclusively by stream parameter
Converts Anthropic streaming events to ADK LlmResponse format
Final message built from complete content blocks to preserve signatures
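
The thought suppression pattern can be sketched roughly as follows. The Delta and Chunk classes are stand-ins invented for this example (they are not Anthropic SDK or ADK types), and the exact flush points in the PR may differ; the idea shown is only that thinking deltas are buffered and emitted once, while text deltas pass straight through.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Delta:
    """Stand-in for a streaming delta event; kind is 'thinking' or 'text'."""
    kind: str
    text: str


@dataclass
class Chunk:
    """Stand-in for a partial response handed to the UI."""
    text: str
    thought: bool


def suppress_thought_fragments(deltas: Iterable[Delta]) -> Iterator[Chunk]:
    """Buffer thinking deltas and emit them as one chunk; pass text through."""
    buffered: list[str] = []
    for delta in deltas:
        if delta.kind == "thinking":
            buffered.append(delta.text)
            continue
        if buffered:
            # First non-thinking delta: flush the accumulated thought as one block.
            yield Chunk("".join(buffered), thought=True)
            buffered = []
        yield Chunk(delta.text, thought=False)
    if buffered:
        # The stream ended while still thinking; flush what was collected.
        yield Chunk("".join(buffered), thought=True)
```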

Backward Compatibility

100% backward compatible - no breaking changes
All existing tests pass
Non-streaming mode supports thinking blocks
Thinking blocks ordering preserved (must come first per Anthropic API requirement)
Beta features opt-in via extra_headers parameter

Implementation Details

Core Changes

File: src/google/adk/models/anthropic_llm.py

Import AsyncAnthropicVertex (line 33)
- Added async client for streaming support
Enhanced content_to_message_param() (lines 142-179)
- Separates thinking blocks from other content blocks
- Ensures thinking blocks come FIRST (Anthropic API requirement)
Enhanced content_block_to_part() (lines 182-219)
- Detects thinking blocks via thinking attribute or type='thinking'
- Extracts and preserves signature (base64-encoded bytes)
- Returns Part(thought=True, thought_signature=signature) for thinking content
- Logs signature presence for debugging
New streaming_event_to_llm_response() (lines 222-278)
- Converts Anthropic streaming events to ADK format
- Handles text_delta, thinking_delta, and message_delta events
- Returns LlmResponse with proper partial=True flag
Rewritten generate_content_async() (lines 366-508)
- Extracts and converts thinking config from ADK to Anthropic format
- Raises ValueError for budget=-1 (unlimited not supported)
- Applies budget directly (no minimum enforcement - handled by API)
- Uses self.extra_headers or NOT_GIVEN for API calls
- Streaming controlled only by stream parameter
- Accumulates thinking deltas and yields as single block (prevents UI fragmentation)
- Builds final response from final_message.content to preserve signatures
- Passes thinking and extra_headers parameters to both streaming and non-streaming API calls
Added extra_headers field (line 360)
- Optional dict for passing extra headers to Anthropic API
- Enables beta features like interleaved thinking
- Default: None (no extra headers)
Updated _anthropic_client property (lines 510-524)
- Changed from AnthropicVertex to AsyncAnthropicVertex
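
The reordering described for content_to_message_param() can be illustrated with plain dicts. This is a minimal sketch, assuming content blocks are dicts with a "type" key; order_thinking_first is a hypothetical name, and the PR's function works on ADK Part objects rather than dicts.

```python
def order_thinking_first(blocks: list[dict]) -> list[dict]:
    """Return blocks with thinking blocks first, keeping relative order otherwise.

    Hypothetical helper; illustrates the ordering rule only.
    """
    thinking = [block for block in blocks if block.get("type") == "thinking"]
    others = [block for block in blocks if block.get("type") != "thinking"]
    return thinking + others
```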

Test Coverage

Test Files (50 tests total):

test_anthropic_thinking.py (26 tests)
- Budget validation (raises error on -1, accepts 0 and positive values)
- Specific budget values (5000, and the 1024 minimum)
- Thinking block parsing with signatures (base64-encoded)
- Type-based thinking detection with signatures
- No-config baseline testing
- Streaming mode with thinking enabled
- NEW: Interleaved thinking tests (10 tests):
  - Beta header NOT sent by default
  - Beta header sent in streaming mode when enabled
  - Beta header sent in non-streaming mode when enabled
  - Beta header only sent when thinking config is active
  - Thinking blocks preserved in assistant message history
test_anthropic_streaming.py (8 tests)
- Event conversion (text_delta, thinking_delta, usage_delta)
- Start/stop event handling
- Event-to-response transformation
test_anthropic_llm.py (16 tests)
- Existing tests continue to pass
- No changes required

Code Quality

Style: Google Python Style Guide compliant
Formatting: Applied isort and pyink formatting
Documentation: Comprehensive inline comments explaining logic
Type Safety: Proper type hints and assertions
Signature Handling: Proper base64 encoding/decoding for cryptographic signatures
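
As an illustration of the signature handling item, the round trip between a base64 signature string and raw bytes might look like the sketch below. Both function names are hypothetical, and it is an assumption that the PR decodes the string into bytes for thought_signature and re-encodes it when sending history back; the description only states that signatures are base64-encoded.

```python
import base64


def signature_to_bytes(signature: str) -> bytes:
    """Decode a base64 signature string into raw bytes (assumed storage form)."""
    return base64.b64decode(signature)


def signature_to_str(raw: bytes) -> str:
    """Encode raw signature bytes back into the base64 string form."""
    return base64.b64encode(raw).decode("ascii")
```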

Testing Results

tests/unittests/models/test_anthropic_llm.py ............ 16/16 PASSED
tests/unittests/models/test_anthropic_thinking.py ...... 26/26 PASSED
tests/unittests/models/test_anthropic_streaming.py .....  8/8 PASSED

Total: 50/50 tests PASSED

No regressions - all existing tests continue to pass.

Breaking Changes

None. This PR is 100% backward compatible.

Usage Examples

Basic Extended Thinking

from google.genai import types
from google.adk.models.anthropic_llm import Claude
from google.adk.agents import Agent

# Enable extended thinking with explicit budget
agent = Agent(
    model=Claude(
        model="claude-opus-4-1@20250805",
        max_tokens=4096
    ),
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            include_thoughts=True,
            thinking_budget=2048  # Explicit budget (min 1024)
        )
    )
)

# Disable thinking
config=types.GenerateContentConfig(
    thinking_config=types.ThinkingConfig(
        include_thoughts=True,
        thinking_budget=0  # Disabled
    )
)

Beta Features with Tool Use (e.g., Interleaved Thinking)

# Enable beta features like interleaved thinking for multi-step tool reasoning
agent = Agent(
    model=Claude(
        model="claude-opus-4-1@20250805",
        max_tokens=4096,
        extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"}
    ),
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            include_thoughts=True,
            thinking_budget=2048
        )
    ),
    tools=[calculator_tool, database_tool]
)

# Multiple beta features
model=Claude(
    model="claude-opus-4-1@20250805",
    extra_headers={"anthropic-beta": "feature1,feature2"}
)

Accessing Thinking Blocks

# Streaming mode (explicit control)
async for response in model.generate_content_async(request, stream=True):
    if response.content:
        for part in response.content.parts:
            if part.thought:
                # Access thinking block with signature
                print(f"Thought: {part.text}")
                print(f"Signature: {part.thought_signature}")

Streaming Decision Logic

Streaming is controlled exclusively by the stream parameter:

stream=True: Uses streaming mode via messages.stream()
stream=False: Uses non-streaming mode via messages.create()

Both modes support thinking blocks when the thinking parameter is provided.
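
The branch on the stream flag can be sketched with a stub client. FakeMessages below is a stand-in written for this example, not the Anthropic SDK; it only mimics the shape assumed here (an awaitable create() and a stream() usable with "async with"). The generate function and its tuple output are likewise hypothetical.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeMessages:
    """Stub standing in for an async client's `messages` resource."""

    async def create(self, **kwargs) -> str:
        return "full response"

    @asynccontextmanager
    async def stream(self, **kwargs):
        async def events():
            for chunk in ("par", "tial"):
                yield chunk

        yield events()


async def generate(messages, stream: bool = False, **params):
    """Yield ('partial', event) items when streaming, else one ('final', response)."""
    if stream:
        async with messages.stream(**params) as events:
            async for event in events:
                yield ("partial", event)
    else:
        response = await messages.create(**params)
        yield ("final", response)


async def collect(stream: bool) -> list:
    """Drain generate() into a list, for demonstration."""
    return [item async for item in generate(FakeMessages(), stream=stream)]


if __name__ == "__main__":
    print(asyncio.run(collect(True)))
    print(asyncio.run(collect(False)))
```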

Beta Features Details

Interleaved Thinking

When to Use:

Multi-step problems requiring reasoning between tool calls
Complex tool chains where intermediate results inform next steps
Scenarios where Claude needs to "think about" tool results before proceeding

How It Works:

Non-interleaved (default): Claude thinks once, makes all tool decisions upfront
Interleaved (beta): Claude can reason about tool results before deciding what to do next
Thinking blocks are automatically preserved when passing assistant messages back to the API
Pass extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"} to enable

Requirements:

Claude 4 models (Opus 4, Sonnet 4, etc.)
Thinking config must be enabled (thinking_budget > 0)
Only supports automatic tool selection (tool_choice: auto)


google-cla · 2025-10-02T03:25:23Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

gemini-code-assist · 2025-10-02T03:25:35Z

Summary of Changes

Hello @damithsenanayake, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the ADK Python SDK's integration with Anthropic's Claude models by introducing support for the Extended Thinking feature and substantially improving streaming capabilities. The changes allow for more sophisticated reasoning processes within Claude models and optimize performance for long-running or thought-intensive requests through intelligent streaming management, all while maintaining full backward compatibility.

Highlights

Extended Thinking Support: Integrated Anthropic's Extended Thinking feature, allowing developers to configure thinking budgets (automatic, disabled, explicit) and parsing thought blocks into the standard GenAI format.
Improved Streaming Capabilities: Migrated to an asynchronous Anthropic client (AsyncAnthropicVertex) and implemented a smart streaming decision logic that automatically enables streaming for requests with extended thinking, large max_tokens (>= 8192), or an explicit stream=True flag.
Thought Suppression Pattern: Implemented logic to accumulate thinking deltas and yield them as a single block, preventing UI fragmentation during streaming.
Backward Compatibility: Ensured full backward compatibility with existing code, with all tests passing and non-streaming mode preserved for simple requests.
Comprehensive Test Coverage: Added 34 new unit tests across three new files to validate the new thinking configuration, streaming event conversion, and streaming decision logic.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

adk-bot · 2025-10-02T03:26:25Z

Response from ADK Triaging Agent

Hello @damithsenanayake, thank you for your contribution!

Before we can review this PR, you'll need to sign our Contributor License Agreement (CLA). You can find instructions on how to do that here: https://cla.developers.google.com/

Also, our contribution guidelines require that new features have an associated GitHub issue to track the work. Could you please create an issue for this feature and link it in the PR description?

These steps will help us move forward with the review process. Thanks!

gemini-code-assist

Code Review

This pull request introduces support for Anthropic's Extended Thinking and enhances streaming capabilities for Claude models. The changes are well-structured and include comprehensive tests for the new features. My review focuses on improving code clarity, consistency, and maintainability. I've identified a misleading comment and log message regarding the thinking budget, which should be corrected. Additionally, I've suggested several minor refactorings to simplify conditional logic and reduce redundant code, making the implementation cleaner and easier to follow.

src/google/adk/models/anthropic_llm.py

feat: include token streaming and thoughts rendering for anthropic mo…

8ab5c96

…dels.

adk-bot added the "bot triaged" ([Bot] This issue is triaged by ADK bot) and "models" ([Component] Issues related to model support) labels Oct 2, 2025

gemini-code-assist bot reviewed Oct 2, 2025

damithsenanayake added 5 commits October 2, 2025 13:32

fix: default value fix

ae8b49e

refactor: gemini code review

5b7b6ed

fix: test fix for default tokens

c77e5d5

wip: tool calling fixed

769e57f

tests: thinking config changes

c949900

damithsenanayake changed the title ~~Add Anthropic Extended Thinking and Streaming Support~~ Add Anthropic Thinking and Streaming Support Oct 2, 2025

damithsenanayake added 3 commits October 3, 2025 06:37

feat: interleaved thinking with tools

5f321fd

Merge remote-tracking branch 'upstream/main' into claude-thoughts-stream

82ccd80

refactor: pass header directly

bb1b86b

damithsenanayake mentioned this pull request Oct 2, 2025

Include reasoning tokens and token streaming for anthropic LLMs #3079

Open

Merge branch 'main' into claude-thoughts-stream

7fa0d5d

damithsenanayake changed the title ~~Add Anthropic Thinking and Streaming Support~~ feat: Add Anthropic Thinking and Streaming Support Oct 2, 2025

damithsenanayake added 2 commits October 3, 2025 13:26

Merge branch 'main' into claude-thoughts-stream

656e8bd

Merge branch 'main' into claude-thoughts-stream

7d26b15
