feat(llmobs): track prompt caching for anthropic sdk #13757
Conversation
Bootstrap import analysis

Comparison of import times between this PR and base.

Summary
- The average import time from this PR is: 288 ± 3 ms.
- The average import time from base is: 290 ± 3 ms.
- The import time difference between this PR and base is: -1.9 ± 0.1 ms.

Import time breakdown

The following import paths have shrunk:
Benchmarks

Benchmark execution time: 2025-06-27 21:26:38
Comparing candidate commit 277d942 in PR branch.
Found 0 performance improvements and 3 performance regressions! Performance is the same for 558 metrics, 3 unstable metrics.
- scenario:iastaspects-replace_aspect
- scenario:iastaspectsospath-ospathsplit_aspect
- scenario:iastaspectssplit-split_aspect
…/dd-trace-py into evan.li/anthropic-prompt-caching
---
features:
  - |
    LLM Observability: This introduces the ability to track the number of tokens read and written to the cache for Anthropic prompt caching.
Suggested change:
- LLM Observability: This introduces the ability to track the number of tokens read and written to the cache for Anthropic prompt caching.
+ LLM Observability: Introduces capturing cache input read/write token counts for Anthropic prompt caching use cases.
How about "This introduces capturing the number of input tokens read and written to the cache for Anthropic prompt caching use cases."?
I am hesitant about "cache input read/write token counts"; I think it is a bit confusing.
"cache_write_input_tokens": 0, | ||
"cache_read_input_tokens": 0, |
From your implementation, aren't we only adding these metrics if they exist? Should we be expecting these to always be present?
This test contains
{"type": "text", "text": "only respond in all caps", "cache_control": {"type": "ephemeral"}},
which causes Anthropic to return the cache read/write token metrics as 0.
Since the other tests don't pass the cache_control argument, those metrics don't exist for them.
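For reference, a minimal sketch of the kind of request that triggers cache accounting; the client setup, model name, and message content here are illustrative, not taken from this PR's test cassettes:

```python
# Illustrative sketch: a cache_control block opts the request into prompt caching,
# so Anthropic includes cache token counts in the usage field (0 on a cold or
# too-small cache); without cache_control, those fields are simply absent.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # illustrative model name
    max_tokens=15,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "only respond in all caps", "cache_control": {"type": "ephemeral"}},
                {"type": "text", "text": "hello"},
            ],
        }
    ],
)

print(getattr(response.usage, "cache_creation_input_tokens", None))
print(getattr(response.usage, "cache_read_input_tokens", None))
```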
"cache_write_input_tokens": 2055, # Cache write tokens | ||
"cache_read_input_tokens": 0, # Should be 0 for first request |
I think the comments are unnecessary
"cache_write_input_tokens": 0, # Should be 0 for cache read | ||
"cache_read_input_tokens": 2055, # Cache read tokens |
Same here
"cache_write_input_tokens": 1031, # Cache write tokens | ||
"cache_read_input_tokens": 0, # Should be 0 for first request |
+1
"cache_write_input_tokens": 0, # Should be 0 for cache read | ||
"cache_read_input_tokens": 1031, # Cache read tokens |
+1
Tracks the number of tokens read from and written to the prompt cache for Anthropic.
https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
Anthropic returns cache_creation/read_input_tokens in their usage field. We map these to cache_write/read_input_tokens keys in our metrics field.

Testing is blocked on DataDog/dd-apm-test-agent#217
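A rough sketch of that mapping; the helper name and structure are illustrative, not the PR's actual code:

```python
# Illustrative sketch (not the PR's actual helper): copy Anthropic's cache usage
# fields into the cache_write/read_input_tokens keys used in our metrics field.
def extract_cache_metrics(usage):
    metrics = {}
    cache_write = getattr(usage, "cache_creation_input_tokens", None)
    cache_read = getattr(usage, "cache_read_input_tokens", None)
    # Only record the metrics when Anthropic actually reported them, since the
    # fields are absent for requests that don't use prompt caching.
    if cache_write is not None:
        metrics["cache_write_input_tokens"] = cache_write
    if cache_read is not None:
        metrics["cache_read_input_tokens"] = cache_read
    return metrics
```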
Implementation note

Right now, we are using get_llmobs_metrics_tags to set metrics for Anthropic, which depends on using set_metric and get_metric. We do not want to continue this pattern for prompt caching, so we instead extract it directly from the response.usage field. The caveat is that for the streamed case, the usage field is a dictionary that is manually constructed by us when parsing out streamed chunks.
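Because of that caveat, the extraction has to tolerate both shapes of usage; a small sketch of one way to do this, with illustrative names:

```python
# Illustrative sketch: read a usage field whether usage is the SDK's usage object
# (non-streamed responses) or the plain dict we assemble while parsing stream chunks.
def get_usage_field(usage, key):
    if isinstance(usage, dict):
        return usage.get(key)  # streamed case: manually constructed dict
    return getattr(usage, key, None)  # non-streamed case: SDK usage object
```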
Follow ups
- llmobs_events fixture
- metrics parsing from set/get metrics completely

Checklist
Reviewer Checklist