[FEATURE]: support prompt_tokens_details parameter of the OpenAI API #4310

### Feature request

Report actual KV cache usage for each request in the response's `usage` object. The OpenAI API's `CompletionUsage` schema already provides a place for this: `prompt_tokens_details`, which includes a `cached_tokens` count.
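For illustration, here is a minimal sketch of the `usage` shape this request is asking for. The field names follow the OpenAI `CompletionUsage` schema; the `build_usage` helper and the token counts are hypothetical and are not Dynamo code.

```python
from dataclasses import dataclass, asdict


@dataclass
class PromptTokensDetails:
    # Number of prompt tokens that were served from the KV cache.
    cached_tokens: int = 0


@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    prompt_tokens_details: PromptTokensDetails


def build_usage(prompt_tokens: int, completion_tokens: int, cached_tokens: int) -> Usage:
    """Hypothetical helper: assemble a usage record for one request."""
    # Cached tokens are a subset of the prompt tokens.
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    return Usage(
        prompt_tokens=prompt_tokens,
        completion_tokens=completion_tokens,
        total_tokens=prompt_tokens + completion_tokens,
        prompt_tokens_details=PromptTokensDetails(cached_tokens=cached_tokens),
    )


# Made-up numbers: a 2048-token prompt of which 1536 tokens hit the cache.
usage = asdict(build_usage(prompt_tokens=2048, completion_tokens=128, cached_tokens=1536))
print(usage)
```

A client that bills or monitors on cached tokens would read `usage["prompt_tokens_details"]["cached_tokens"]` from each response.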

### Describe the problem you're encountering

Almost all providers offer discounts for cached tokens, so the lack of support may block launching a product on top of Dynamo.

In addition, actual per-request KV cache usage data would help compare different caching improvements (KV-aware-router, KVBM, ...).
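As a rough sketch of that comparison, assuming each response carried `prompt_tokens_details.cached_tokens`, an aggregate cache hit rate could be computed from collected `usage` objects. The `cache_hit_rate` function and the sample numbers below are illustrative only, not measurements.

```python
from typing import Iterable, Mapping


def cache_hit_rate(usages: Iterable[Mapping]) -> float:
    """Fraction of prompt tokens served from cache across a set of requests."""
    prompt_total = 0
    cached_total = 0
    for usage in usages:
        prompt_total += usage.get("prompt_tokens", 0)
        # Treat a missing or null prompt_tokens_details as zero cached tokens.
        details = usage.get("prompt_tokens_details") or {}
        cached_total += details.get("cached_tokens", 0)
    return cached_total / prompt_total if prompt_total else 0.0


# Illustrative usage records for two hypothetical configurations.
baseline = [
    {"prompt_tokens": 1000, "prompt_tokens_details": {"cached_tokens": 200}},
    {"prompt_tokens": 1000},
]
with_kv_routing = [
    {"prompt_tokens": 1000, "prompt_tokens_details": {"cached_tokens": 600}},
    {"prompt_tokens": 1000, "prompt_tokens_details": {"cached_tokens": 400}},
]

baseline_rate = cache_hit_rate(baseline)
routed_rate = cache_hit_rate(with_kv_routing)
print(f"baseline: {baseline_rate:.2f}, with KV-aware routing: {routed_rate:.2f}")
```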

### Describe alternatives you've tried

_No response_
