[Frontend] Separate pooling APIs in offline inference #11129
Since spec decode isn't applicable to pooling models, I have removed `spec_decode_worker_metrics` from `PoolerOutput`. The type annotation that `model_output` is a list of `SamplerOutput`s is actually incorrect here (it can be a list of `PoolerOutput`), but I haven't bothered to fix it since we will probably rework this in V1 anyway.
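
For illustration only, here is a minimal sketch of what a wider annotation could look like. The two dataclasses are hypothetical stand-ins, not the real vLLM classes; their fields (`token_ids`, `embeddings`) and the helper `count_pooled` are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class SamplerOutput:
    """Hypothetical stand-in for a generation-model output."""
    token_ids: List[int]
    # Spec decode metrics only make sense for generation models.
    spec_decode_worker_metrics: Optional[dict] = None


@dataclass
class PoolerOutput:
    """Hypothetical stand-in for a pooling-model output.

    It deliberately has no spec_decode_worker_metrics field.
    """
    embeddings: List[List[float]]


# Assumed alias: an annotation wide enough to cover both model kinds.
ModelOutput = Union[SamplerOutput, PoolerOutput]


def count_pooled(model_output: List[ModelOutput]) -> int:
    """Count the entries that came from a pooling model."""
    return sum(isinstance(o, PoolerOutput) for o in model_output)
```

With an annotation like `List[ModelOutput]`, a type checker would accept either kind of output, and callers would need an `isinstance` check before touching fields that exist on only one of them.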