
Usage tracking lost when streaming fails mid-request #1973

@habema

Summary

When a RunResultStreaming run raises an error (API error, connection drop, context window exceeded, etc.), result.context_wrapper.usage stays at 0 tokens even though tokens were consumed.

Root Cause

Usage is only accumulated when ResponseCompletedEvent arrives:

# src/agents/run.py, line 1245
async for event in model.stream_response(...):
    if isinstance(event, ResponseCompletedEvent):
        usage = Usage(...)
        context_wrapper.usage.add(usage)  # line 1263

If the model provider raises an exception before yielding ResponseCompletedEvent (which happens for context window errors, mid-stream connection failures, rate limits, etc.), the loop exits without ever updating usage.
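To make the failure mode concrete, here is a minimal, self-contained sketch that does not use the SDK at all. `Usage`, `TextDelta`, `Completed`, `failing_stream`, and `consume` are hypothetical stand-ins written for this illustration; only the control flow mirrors the loop quoted above.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Usage:
    # Stand-in for the SDK's usage object; only the fields needed here.
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, other: "Usage") -> None:
        self.input_tokens += other.input_tokens
        self.output_tokens += other.output_tokens


@dataclass
class TextDelta:
    # Hypothetical partial-output event.
    text: str


@dataclass
class Completed:
    # Hypothetical stand-in for ResponseCompletedEvent.
    usage: Usage


async def failing_stream():
    # Yields partial output, then fails before any Completed event.
    yield TextDelta("Hello")
    yield TextDelta(", wor")
    raise ConnectionError("stream dropped mid-response")


async def consume(stream, usage: Usage) -> None:
    # Same shape as the loop in the issue: usage is only
    # updated when the completed event arrives.
    async for event in stream:
        if isinstance(event, Completed):
            usage.add(event.usage)


async def demo() -> Usage:
    usage = Usage()
    try:
        await consume(failing_stream(), usage)
    except ConnectionError:
        pass  # the caller sees the error, but usage was never touched
    return usage


result = asyncio.run(demo())
print(result)  # Usage(input_tokens=0, output_tokens=0)
```

The exception propagates out of the `async for` before the `isinstance` branch can ever run, so the usage object keeps its initial zeros.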

Fix Ideas

  1. Estimate input tokens on error: When streaming fails, estimate input tokens from the request we sent (we have filtered.input + filtered.instructions) and mark the resulting usage as estimated via a flag. Output tokens are still lost, but at least what went in is tracked.
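A rough sketch of what this idea could look like, again with stand-in types rather than the SDK's own. The `estimated` flag, the `estimate_tokens` helper, and the 4-characters-per-token heuristic are all assumptions made for illustration; a real implementation would probably use a proper tokenizer and the actual `filtered.input` and `filtered.instructions` values.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Usage:
    # Stand-in usage object; `estimated` is a hypothetical flag.
    input_tokens: int = 0
    output_tokens: int = 0
    estimated: bool = False


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token, rounded up.
    return (len(text) + 3) // 4


async def failing_stream():
    # Simulates a provider that fails before the completed event.
    yield "partial output"
    raise ConnectionError("stream dropped mid-response")


async def run_streamed(instructions: str, user_input: str, usage: Usage) -> None:
    try:
        async for _event in failing_stream():
            pass  # a real loop would add exact usage on the completed event
    except Exception:
        # Fall back to an estimate of what was sent, then re-raise
        # so callers still see the original error.
        usage.input_tokens += estimate_tokens(instructions) + estimate_tokens(user_input)
        usage.estimated = True
        raise


async def demo() -> Usage:
    usage = Usage()
    try:
        await run_streamed("You are a helpful assistant.", "Summarize this document.", usage)
    except ConnectionError:
        pass
    return usage


usage_after_error = asyncio.run(demo())
print(usage_after_error)
```

Re-raising keeps the existing error behavior intact, and the flag lets downstream billing or logging code distinguish exact counts from estimates. Output tokens stay at 0 here because nothing in the failed stream reports them.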

I'd love to hear whether this fix idea is valid and welcome before I go ahead and implement it. If you can think of any temporary workarounds, that would also be great.
