damithsenanayake commented Oct 2, 2025

Summary

This PR adds support for Anthropic's Extended Thinking feature and Interleaved Thinking (beta) for Claude models in the ADK Python SDK. The implementation enables developers to use Claude's reasoning capabilities with thinking block signatures and multi-step tool reasoning, while maintaining full compatibility with existing code.

Key Features

Extended Thinking Support

  • Converts ADK ThinkingConfig to Anthropic API format
  • Supports disabled (0) and explicit (>0) budget values
  • Raises ValueError for unlimited budget (-1) as it's not supported by Claude
  • Works in both streaming and non-streaming modes
  • Parses thinking blocks as Part(thought=True, thought_signature=...) in standard GenAI format
  • Preserves cryptographic signatures from thinking blocks
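
The budget rules above can be sketched as a small mapping function. This is a minimal illustration, not the code in the PR: the helper name `to_anthropic_thinking` is hypothetical, and rejecting negative values other than -1 is an assumption. The `{"type": "enabled", "budget_tokens": N}` and `{"type": "disabled"}` shapes follow the Anthropic Messages API `thinking` parameter.

```python
from typing import Optional


def to_anthropic_thinking(budget: Optional[int]) -> Optional[dict]:
    """Map an ADK-style thinking budget to an Anthropic `thinking` value.

    Hypothetical helper for illustration; not the function in the PR.
    """
    if budget is None:
        # No thinking config: leave the parameter unset.
        return None
    if budget == -1:
        raise ValueError("Unlimited thinking budget (-1) is not supported by Claude.")
    if budget == 0:
        return {"type": "disabled"}
    if budget > 0:
        # No minimum is enforced here; the API validates the value.
        return {"type": "enabled", "budget_tokens": budget}
    # Assumption: any other negative value is treated as invalid.
    raise ValueError(f"Invalid thinking budget: {budget}")
```

Returning `None` for a missing config lets the caller fall back to whatever "not given" sentinel the client expects.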

Beta Features Support (e.g., Interleaved Thinking)

  • Users can pass beta headers directly via extra_headers parameter
  • Example: extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"}
  • Supports multiple beta features with comma-separated values
  • Enables interleaved thinking for multi-step tool reasoning
  • Thinking blocks automatically preserved in assistant message history
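
As a rough sketch of the comma-separated form, the helper below joins several beta feature names into one `anthropic-beta` header value. The function name `beta_headers` is hypothetical and is not part of the PR or of the Anthropic SDK.

```python
def beta_headers(*features: str) -> dict:
    """Build an `extra_headers` dict from beta feature names.

    Hypothetical helper for illustration only.
    """
    # Drop empty entries and surrounding whitespace before joining.
    names = [name.strip() for name in features if name.strip()]
    if not names:
        return {}
    return {"anthropic-beta": ",".join(names)}
```

The result can be passed as the `extra_headers` argument shown in the example above.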

Improved Streaming

  • Migrated to AsyncAnthropicVertex for async streaming support
  • Implements thought suppression pattern to prevent UI fragmentation
  • Streaming controlled exclusively by stream parameter
  • Converts Anthropic streaming events to ADK LlmResponse format
  • Final message built from complete content blocks to preserve signatures
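
The thought suppression pattern can be illustrated with a simplified generator. This is a sketch under stated assumptions: `Delta` is a hypothetical stand-in for the Anthropic streaming events, and the output tuples stand in for ADK `LlmResponse` objects. Thinking deltas are buffered and emitted once, while text deltas pass through as they arrive.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Delta:
    """Hypothetical stand-in for a streaming delta event."""
    kind: str  # "thinking_delta" or "text_delta"
    text: str


def coalesce_thinking(deltas: Iterable[Delta]) -> Iterator[Tuple[str, str]]:
    """Yield one ("thought", ...) item per run of thinking deltas."""
    buffered: List[str] = []
    for delta in deltas:
        if delta.kind == "thinking_delta":
            # Hold thinking fragments back instead of emitting each one.
            buffered.append(delta.text)
            continue
        if buffered:
            # The thinking run ended: flush it as a single block.
            yield ("thought", "".join(buffered))
            buffered = []
        yield ("text", delta.text)
    if buffered:
        # Flush a thinking run that reaches the end of the stream.
        yield ("thought", "".join(buffered))
```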

Backward Compatibility

  • 100% backward compatible - no breaking changes
  • All existing tests pass
  • Non-streaming mode supports thinking blocks
  • Thinking blocks ordering preserved (must come first per Anthropic API requirement)
  • Beta features opt-in via extra_headers parameter

Implementation Details

Core Changes

File: src/google/adk/models/anthropic_llm.py

  1. Import AsyncAnthropicVertex (line 33)

    • Added async client for streaming support
  2. Enhanced content_to_message_param() (lines 142-179)

    • Separates thinking blocks from other content blocks
    • Ensures thinking blocks come FIRST (Anthropic API requirement)
  3. Enhanced content_block_to_part() (lines 182-219)

    • Detects thinking blocks via thinking attribute or type='thinking'
    • Extracts and preserves signature (base64-encoded bytes)
    • Returns Part(thought=True, thought_signature=signature) for thinking content
    • Logs signature presence for debugging
  4. New streaming_event_to_llm_response() (lines 222-278)

    • Converts Anthropic streaming events to ADK format
    • Handles text_delta, thinking_delta, and message_delta events
    • Returns LlmResponse with proper partial=True flag
  5. Rewritten generate_content_async() (lines 366-508)

    • Extracts and converts thinking config from ADK to Anthropic format
    • Raises ValueError for budget=-1 (unlimited not supported)
    • Applies budget directly (no minimum enforcement - handled by API)
    • Uses self.extra_headers or NOT_GIVEN for API calls
    • Streaming controlled only by stream parameter
    • Accumulates thinking deltas and yields as single block (prevents UI fragmentation)
    • Builds final response from final_message.content to preserve signatures
    • Passes thinking and extra_headers parameters to both streaming and non-streaming API calls
  6. Added extra_headers field (line 360)

    • Optional dict for passing extra headers to Anthropic API
    • Enables beta features like interleaved thinking
    • Default: None (no extra headers)
  7. Updated _anthropic_client property (lines 510-524)

    • Changed from AnthropicVertex to AsyncAnthropicVertex
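
The ordering rule in item 2 amounts to a stable partition of the content blocks. The sketch below uses plain dicts in the Anthropic content block shape; the function name `thinking_first` is hypothetical and the real `content_to_message_param()` may be structured differently.

```python
from typing import List


def thinking_first(blocks: List[dict]) -> List[dict]:
    """Return blocks with thinking blocks moved to the front.

    Relative order within each group is preserved.
    """
    thinking = [b for b in blocks if b.get("type") == "thinking"]
    others = [b for b in blocks if b.get("type") != "thinking"]
    return thinking + others
```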

Test Coverage

Test Files (50 tests total):

  1. test_anthropic_thinking.py (26 tests)

    • Budget validation (raises error on -1, accepts 0 and positive values)
    • Specific budget values (5000, and the 1024 minimum)
    • Thinking block parsing with signatures (base64-encoded)
    • Type-based thinking detection with signatures
    • No-config baseline testing
    • Streaming mode with thinking enabled
    • NEW: Interleaved thinking tests (10 tests):
      • Beta header NOT sent by default
      • Beta header sent in streaming mode when enabled
      • Beta header sent in non-streaming mode when enabled
      • Beta header only sent when thinking config is active
      • Thinking blocks preserved in assistant message history
  2. test_anthropic_streaming.py (8 tests)

    • Event conversion (text_delta, thinking_delta, usage_delta)
    • Start/stop event handling
    • Event-to-response transformation
  3. test_anthropic_llm.py (16 tests)

    • Existing tests continue to pass
    • No changes required

Code Quality

  • Style: Google Python Style Guide compliant
  • Formatting: Applied isort and pyink formatting
  • Documentation: Comprehensive inline comments explaining logic
  • Type Safety: Proper type hints and assertions
  • Signature Handling: Proper base64 encoding/decoding for cryptographic signatures
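
For the signature handling point, a minimal round-trip sketch follows. It assumes the API delivers the signature as a base64 string and that `thought_signature` holds raw bytes; the helper names are hypothetical and the PR may perform the conversion differently.

```python
import base64


def signature_to_bytes(signature: str) -> bytes:
    """Decode a base64 signature string into raw bytes."""
    return base64.b64decode(signature)


def signature_to_str(raw: bytes) -> str:
    """Encode raw signature bytes back into a base64 string."""
    return base64.b64encode(raw).decode("ascii")
```

A round trip through both helpers should return the original value unchanged, which is the property that matters when a thinking block is sent back to the API.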

Testing Results

tests/unittests/models/test_anthropic_llm.py ............ 16/16 PASSED
tests/unittests/models/test_anthropic_thinking.py ...... 26/26 PASSED
tests/unittests/models/test_anthropic_streaming.py .....  8/8 PASSED

Total: 50/50 tests PASSED

No regressions - all existing tests continue to pass.

Breaking Changes

None. This PR is 100% backward compatible.

Usage Examples

Basic Extended Thinking

from google.genai import types
from google.adk.models.anthropic_llm import Claude
from google.adk.agents import Agent

# Enable extended thinking with explicit budget
agent = Agent(
    model=Claude(
        model="claude-opus-4-1@20250805",
        max_tokens=4096
    ),
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            include_thoughts=True,
            thinking_budget=2048  # Explicit budget (min 1024)
        )
    )
)

# Disable thinking
config = types.GenerateContentConfig(
    thinking_config=types.ThinkingConfig(
        include_thoughts=True,
        thinking_budget=0  # Disabled
    )
)

Beta Features with Tool Use (e.g., Interleaved Thinking)

# Enable beta features like interleaved thinking for multi-step tool reasoning
agent = Agent(
    model=Claude(
        model="claude-opus-4-1@20250805",
        max_tokens=4096,
        extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"}
    ),
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            include_thoughts=True,
            thinking_budget=2048
        )
    ),
    tools=[calculator_tool, database_tool]
)

# Multiple beta features
model = Claude(
    model="claude-opus-4-1@20250805",
    extra_headers={"anthropic-beta": "feature1,feature2"}
)

Accessing Thinking Blocks

# Streaming mode (explicit control)
async for response in model.generate_content_async(request, stream=True):
    if response.content:
        for part in response.content.parts:
            if part.thought:
                # Access thinking block with signature
                print(f"Thought: {part.text}")
                print(f"Signature: {part.thought_signature}")

Streaming Decision Logic

Streaming is controlled exclusively by the stream parameter:

  • stream=True: Uses streaming mode via messages.stream()
  • stream=False: Uses non-streaming mode via messages.create()

Both modes support thinking blocks when the thinking parameter is provided.
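
The dispatch described above can be sketched with a fake client so that it runs without credentials. `FakeMessages` is a hypothetical stand-in for `client.messages`; the real SDK's `messages.stream()` is also used as an async context manager, but its event objects are richer than the strings used here.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeMessages:
    """Hypothetical stand-in for `client.messages`; records the path taken."""

    def __init__(self):
        self.calls = []

    async def create(self, **kwargs):
        self.calls.append("create")
        return "full response"

    @asynccontextmanager
    async def stream(self, **kwargs):
        self.calls.append("stream")

        async def events():
            for chunk in ("partial 1", "partial 2"):
                yield chunk

        yield events()


async def generate(messages, *, stream: bool, **kwargs):
    """Yield responses, choosing the API call from `stream` alone."""
    if stream:
        async with messages.stream(**kwargs) as events:
            async for event in events:
                yield event
    else:
        yield await messages.create(**kwargs)


async def collect(messages, *, stream: bool):
    return [item async for item in generate(messages, stream=stream)]


fake = FakeMessages()
streamed = asyncio.run(collect(fake, stream=True))
single = asyncio.run(collect(fake, stream=False))
print(fake.calls)  # ['stream', 'create']
```

No other input (token limits, thinking config) influences the choice in this sketch, which mirrors the "controlled exclusively by the stream parameter" statement.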

Beta Features Details

Interleaved Thinking

When to Use:

  • Multi-step problems requiring reasoning between tool calls
  • Complex tool chains where intermediate results inform next steps
  • Scenarios where Claude needs to "think about" tool results before proceeding

How It Works:

  • Non-interleaved (default): Claude thinks once, makes all tool decisions upfront
  • Interleaved (beta): Claude can reason about tool results before deciding what to do next
  • Thinking blocks are automatically preserved when passing assistant messages back to the API
  • Pass extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"} to enable

Requirements:

  • Claude 4 models (Opus 4, Sonnet 4, etc.)
  • Thinking config must be enabled (thinking_budget > 0)
  • Only supports automatic tool selection (tool_choice: auto)
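
The requirements above could be checked before sending a request. The following is a hypothetical pre-flight helper, not something the PR implements; it covers the header, budget, and tool choice conditions and leaves the model family unchecked.

```python
from typing import List, Optional

INTERLEAVED_BETA = "interleaved-thinking-2025-05-14"


def interleaved_thinking_problems(
    extra_headers: Optional[dict],
    thinking_budget: Optional[int],
    tool_choice: str = "auto",
) -> List[str]:
    """Return reasons why interleaved thinking would not apply.

    Hypothetical helper; an empty list means the listed conditions hold.
    """
    problems = []
    betas = (extra_headers or {}).get("anthropic-beta", "")
    if INTERLEAVED_BETA not in [b.strip() for b in betas.split(",")]:
        problems.append("anthropic-beta header does not enable interleaved thinking")
    if thinking_budget is None or thinking_budget <= 0:
        problems.append("thinking_budget must be greater than 0")
    if tool_choice != "auto":
        problems.append("tool_choice must be auto")
    return problems
```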

google-cla bot commented Oct 2, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Summary of Changes

Hello @damithsenanayake, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the ADK Python SDK's integration with Anthropic's Claude models by introducing support for the Extended Thinking feature and substantially improving streaming capabilities. The changes allow for more sophisticated reasoning processes within Claude models and optimize performance for long-running or thought-intensive requests through intelligent streaming management, all while maintaining full backward compatibility.

Highlights

  • Extended Thinking Support: Integrated Anthropic's Extended Thinking feature, allowing developers to configure thinking budgets (automatic, disabled, explicit) and parsing thought blocks into the standard GenAI format.
  • Improved Streaming Capabilities: Migrated to an asynchronous Anthropic client (AsyncAnthropicVertex) and implemented a smart streaming decision logic that automatically enables streaming for requests with extended thinking, large max_tokens (>= 8192), or an explicit stream=True flag.
  • Thought Suppression Pattern: Implemented logic to accumulate thinking deltas and yield them as a single block, preventing UI fragmentation during streaming.
  • Backward Compatibility: Ensured full backward compatibility with existing code, with all tests passing and non-streaming mode preserved for simple requests.
  • Comprehensive Test Coverage: Added 34 new unit tests across three new files to validate the new thinking configuration, streaming event conversion, and streaming decision logic.

adk-bot added the labels "bot triaged" ([Bot] This issue is triaged by ADK bot) and "models" ([Component] Issues related to model support) on Oct 2, 2025

adk-bot (Collaborator) commented Oct 2, 2025

Response from ADK Triaging Agent

Hello @damithsenanayake, thank you for your contribution!

Before we can review this PR, you'll need to sign our Contributor License Agreement (CLA). You can find instructions on how to do that here: https://cla.developers.google.com/

Also, our contribution guidelines require that new features have an associated GitHub issue to track the work. Could you please create an issue for this feature and link it in the PR description?

These steps will help us move forward with the review process. Thanks!

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for Anthropic's Extended Thinking and enhances streaming capabilities for Claude models. The changes are well-structured and include comprehensive tests for the new features. My review focuses on improving code clarity, consistency, and maintainability. I've identified a misleading comment and log message regarding the thinking budget, which should be corrected. Additionally, I've suggested several minor refactorings to simplify conditional logic and reduce redundant code, making the implementation cleaner and easier to follow.

damithsenanayake changed the title from "Add Anthropic Extended Thinking and Streaming Support" to "Add Anthropic Thinking and Streaming Support" on Oct 2, 2025
damithsenanayake changed the title from "Add Anthropic Thinking and Streaming Support" to "feat: Add Anthropic Thinking and Streaming Support" on Oct 2, 2025