metrics: TTFT in streaming mode #203
✅ Deploy Preview for vllm-semantic-router ready!
/hold
Signed-off-by: Jintao Zhang <[email protected]>
…first body chunk (fixes streaming TTFT) Signed-off-by: Jintao Zhang <[email protected]>
Signed-off-by: Jintao Zhang <[email protected]>
797fe14 to a6542d2
👥 vLLM Semantic Team Notification: The following members have been identified for the changed files in this PR and have been automatically assigned: 📁
/unhold
/hold cancel
@tao12345666333 would you please come up with a PR to explain how to enable streaming?
@rootfs I've added some notes in the document: https://vllm-semantic-router.com/docs/api/router#streaming-sse-notes Do you want me to add a dedicated section describing how to enable streaming? In my current implementation, no special configuration is required; just keep the project's default settings. It automatically determines whether to use streaming based on the request headers.
@tao12345666333 gotcha, thanks for the info!
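For readers following the thread, here is a minimal sketch of the two ideas discussed above: deciding from the request headers whether a request is streaming, and recording TTFT when the first body chunk arrives. It is not the PR's code. The names `is_streaming_request` and `TTFTTracker` are hypothetical, and the use of `Accept: text/event-stream` as the streaming signal is an assumption, since the thread only says "request headers".

```python
import time
from typing import Callable, Mapping, Optional


def is_streaming_request(headers: Mapping[str, str]) -> bool:
    """Assumption: a client asks for SSE streaming via its Accept header."""
    for name, value in headers.items():
        if name.lower() == "accept" and "text/event-stream" in value.lower():
            return True
    return False


class TTFTTracker:
    """Hypothetical helper: measures time from request start to first body chunk."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._start: Optional[float] = None
        self._ttft: Optional[float] = None

    def on_request(self) -> None:
        # Mark the moment the request is received and reset any earlier result.
        self._start = self._clock()
        self._ttft = None

    def on_body_chunk(self, chunk: bytes) -> Optional[float]:
        # Record TTFT once, on the first non-empty chunk; later chunks keep it unchanged.
        if self._ttft is None and self._start is not None and chunk:
            self._ttft = self._clock() - self._start
        return self._ttft
```

In a streaming response, `on_body_chunk` would be called for every chunk, but only the first non-empty one sets the value, so the metric reflects when the client starts receiving output instead of when the full response completes.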
What type of PR is this?
metrics: TTFT in streaming mode
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #128
Release Notes: Yes/No