Enable ITL, TTFT, E2E latency computation using mean rather than median #22

Open
tdoublep opened this issue Jun 19, 2024 · 0 comments

@tdoublep
Collaborator

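Let's make the aggregation (mean or median) configurable in the parser, and perhaps even make mean the default.

The median is not meaningful with speculative decoding anyway: tokens can arrive in bursts, so many inter-token gaps are close to zero and the median no longer reflects the typical per-token latency.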
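
To make the proposal concrete, here is a minimal sketch of what a configurable aggregation could look like. It is not taken from this project's code: the flag name `--latency-stat`, the helpers `build_parser` and `summarize`, and the sample values are all hypothetical, and it assumes "the parser" can expose a command-line style option.

```python
import argparse
import statistics

# Hypothetical mapping from option value to aggregation function.
AGGREGATORS = {"mean": statistics.mean, "median": statistics.median}


def build_parser():
    """Build an argument parser with a hypothetical --latency-stat option."""
    parser = argparse.ArgumentParser(description="Latency summary (sketch)")
    parser.add_argument(
        "--latency-stat",
        choices=sorted(AGGREGATORS),
        default="mean",  # mean as the default, as suggested in this issue
        help="statistic used to summarize ITL, TTFT and E2E latency",
    )
    return parser


def summarize(samples, stat="mean"):
    """Reduce a list of latency samples to one number using the chosen statistic."""
    return AGGREGATORS[stat](samples)


# Made-up inter-token latencies (ms) imitating speculative decoding:
# four tokens are accepted at once, so three of every four gaps are zero.
itl_ms = [30.0, 0.0, 0.0, 0.0, 30.0, 0.0, 0.0, 0.0]

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"ITL {args.latency_stat}: {summarize(itl_ms, args.latency_stat)} ms")
```

With these made-up samples the median collapses to zero while the mean still reflects the time spent per token, which is the concern raised above.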

@tdoublep self-assigned this on Jun 19, 2024
@tdoublep changed the title from "Enable ITL, TTFT computation using mean rather than median" to "Enable ITL, TTFT, E2E latency computation using mean rather than median" on Jun 19, 2024