feat(llmobs): track prompt caching for anthropic sdk #13757
Conversation
Bootstrap import analysis

Comparison of import times between this PR and base.

Summary
- The average import time from this PR is: 288 ± 3 ms.
- The average import time from base is: 290 ± 3 ms.
- The import time difference between this PR and base is: -1.9 ± 0.1 ms.

Import time breakdown

The following import paths have shrunk:
Benchmarks

Benchmark execution time: 2025-06-27 21:26:38
Comparing candidate commit 277d942 in PR branch.
Found 0 performance improvements and 3 performance regressions! Performance is the same for 558 metrics, 3 unstable metrics.
- scenario:iastaspects-replace_aspect
- scenario:iastaspectsospath-ospathsplit_aspect
- scenario:iastaspectssplit-split_aspect
…/dd-trace-py into evan.li/anthropic-prompt-caching
---
features:
  - |
    LLM Observability: This introduces the ability to track the number of tokens read and written to the cache for Anthropic prompt caching.
Suggested change:
- LLM Observability: This introduces the ability to track the number of tokens read and written to the cache for Anthropic prompt caching.
+ LLM Observability: Introduces capturing cache input read/write token counts for Anthropic prompt caching use cases.
How about "This introduces capturing the number of input tokens read and written to the cache for Anthropic prompt caching use cases."?
I am hesitant about "cache input read/write token counts"; I think it is a bit confusing.
"cache_write_input_tokens": 0, | ||
"cache_read_input_tokens": 0, |
From your implementation, aren't we only adding these metrics if they exist? Should we be expecting these to always be present?
This test contains
{"type": "text", "text": "only respond in all caps", "cache_control": {"type": "ephemeral"}},
which causes Anthropic to return the cache read/write token metrics as 0.
Since the other tests don't pass the cache_control argument, those metrics don't exist for them.
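For reference, a minimal sketch of the kind of request that triggers cache accounting; the client setup, model name, and message content here are illustrative, not taken from this PR's test cassettes:

```python
# Illustrative sketch: a cache_control block opts the request into prompt caching,
# so Anthropic includes cache token counts in the usage field (0 on a cold or
# too-small cache); without cache_control, those fields are simply absent.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # illustrative model name
    max_tokens=15,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "only respond in all caps", "cache_control": {"type": "ephemeral"}},
                {"type": "text", "text": "hello"},
            ],
        }
    ],
)

print(getattr(response.usage, "cache_creation_input_tokens", None))
print(getattr(response.usage, "cache_read_input_tokens", None))
```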
"cache_write_input_tokens": 2055, # Cache write tokens | ||
"cache_read_input_tokens": 0, # Should be 0 for first request |
I think the comments are unnecessary
"cache_write_input_tokens": 0, # Should be 0 for cache read | ||
"cache_read_input_tokens": 2055, # Cache read tokens |
Same here
"cache_write_input_tokens": 1031, # Cache write tokens | ||
"cache_read_input_tokens": 0, # Should be 0 for first request |
+1
"cache_write_input_tokens": 0, # Should be 0 for cache read | ||
"cache_read_input_tokens": 1031, # Cache read tokens |
+1
Tracks the number of tokens read from and written to the prompt cache for Anthropic.
https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
Anthropic returns cache_creation/read_input_tokens in their usage field. We map these to cache_write/read_input_tokens keys in our metrics field.

Testing is blocked on DataDog/dd-apm-test-agent#217
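A rough sketch of that mapping; the helper name and structure are illustrative, not the PR's actual code:

```python
# Illustrative sketch (not the PR's actual helper): copy Anthropic's cache usage
# fields into the cache_write/read_input_tokens keys used in our metrics field.
def extract_cache_metrics(usage):
    metrics = {}
    cache_write = getattr(usage, "cache_creation_input_tokens", None)
    cache_read = getattr(usage, "cache_read_input_tokens", None)
    # Only record the metrics when Anthropic actually reported them, since the
    # fields are absent for requests that don't use prompt caching.
    if cache_write is not None:
        metrics["cache_write_input_tokens"] = cache_write
    if cache_read is not None:
        metrics["cache_read_input_tokens"] = cache_read
    return metrics
```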
Implementation note

Right now, we are using get_llmobs_metrics_tags to set metrics for Anthropic, which depends on using set_metric and get_metric. We do not want to continue this pattern for prompt caching, so we instead extract it directly from the response.usage field. The caveat is that for the streamed case, the usage field is a dictionary that is manually constructed by us when parsing out streamed chunks.
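Because of that caveat, the extraction has to tolerate both shapes of usage; a small sketch of one way to do this, with illustrative names:

```python
# Illustrative sketch: read a usage field whether usage is the SDK's usage object
# (non-streamed responses) or the plain dict we assemble while parsing stream chunks.
def get_usage_field(usage, key):
    if isinstance(usage, dict):
        return usage.get(key)  # streamed case: manually constructed dict
    return getattr(usage, key, None)  # non-streamed case: SDK usage object
```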
Follow ups
- llmobs_events fixture
- metrics parsing from set/get metrics completely

Checklist
Reviewer Checklist