[Misc] Split up pooling tasks #10820

DarkLight1337 · 2024-12-02T04:46:24Z

As part of #10674, split up the embedding task into the following tasks for pooling models:

- Embedding (embed), with default pooler pooling_type=LAST, normalize=True
  - Usually one output per prompt
- Classification (classify), with default pooler pooling_type=LAST, softmax=True
  - Usually one output per prompt
  - The scoring module will be automatically loaded from sentence-transformers.
  - Other models should register a separate architecture so that the model contains the correct modules.
- Scoring (scoring)
  - Technically an alias for Classification, used for cross-encoder models
- Reward Modeling (reward), with default pooler pooling_type=ALL
  - Usually one output per token per prompt
  - If the reward model has any additional modules compared to the text generation model, a separate architecture should be registered so that the model contains the correct modules.
  - To avoid peiyi9979/math-shepherd-mistral-7b-prm conflicting with other LlamaForCausalLM pooling models, we will remove the pooler default from the text generation model. Since users already have to override step_tag_id etc. anyway to use that model, this should not be a major breaking change.
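
To make the per-task defaults above concrete, here is a minimal pure-Python sketch. It is not vLLM's implementation: `PoolerDefaults`, `DEFAULT_POOLERS`, and `pool` are hypothetical names introduced for illustration, and any flag the description does not mention is assumed to be off. Scoring is left out because the description only says it is an alias for Classification.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolerDefaults:
    # Hypothetical container; field names mirror the options in the list above.
    pooling_type: str
    normalize: bool = False  # assumed off unless the description says otherwise
    softmax: bool = False    # assumed off unless the description says otherwise


# Per-task defaults as stated in the PR description.
DEFAULT_POOLERS = {
    "embed": PoolerDefaults("LAST", normalize=True),
    "classify": PoolerDefaults("LAST", softmax=True),
    "reward": PoolerDefaults("ALL"),
}


def pool(hidden_states, cfg):
    """Pool per-token vectors (a list of lists of floats) according to cfg."""
    if cfg.pooling_type == "ALL":
        # One output per token.
        return [list(vec) for vec in hidden_states]
    if cfg.pooling_type != "LAST":
        raise ValueError(f"unsupported pooling type: {cfg.pooling_type}")
    out = list(hidden_states[-1])  # one output per prompt: the last token
    if cfg.normalize:
        # L2-normalize; guard against an all-zero vector.
        norm = math.sqrt(sum(x * x for x in out)) or 1.0
        out = [x / norm for x in out]
    if cfg.softmax:
        peak = max(out)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in out]
        total = sum(exps)
        out = [e / total for e in exps]
    return out
```

With two token vectors as input, the `embed` and `classify` defaults each yield a single vector for the prompt (unit-length and summing to 1, respectively), while the `reward` default yields one vector per token.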

Backwards compatibility

- embedding remains available as an alias to embed.
- If --task embedding is passed but the requested pooling model is not an embedding model, a deprecation warning is emitted.
- Later, we will make --task embedding exclusive to embedding models.
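
The alias behaviour described above could be sketched as follows. This is an illustration only: `resolve_task` and its `supported_tasks` argument are hypothetical, and falling back to the first supported task when the model is not an embedding model is an assumption, since the description only says that a deprecation warning is emitted in that case.

```python
import warnings


def resolve_task(requested, supported_tasks):
    """Map the value of --task to a concrete pooling task.

    `supported_tasks` is a hypothetical non-empty list of the pooling tasks
    that the loaded model supports, e.g. ["embed"] or ["reward"].
    """
    if requested != "embedding":
        return requested
    if "embed" in supported_tasks:
        # Plain alias: embedding -> embed.
        return "embed"
    # The model is not an embedding model: warn for now instead of failing.
    warnings.warn(
        "--task embedding was passed for a model that is not an embedding "
        "model; this will be rejected in a future release.",
        DeprecationWarning,
        stacklevel=2,
    )
    # Assumption: fall back to the first task the model supports.
    return supported_tasks[0]
```

Once `--task embedding` becomes exclusive to embedding models, the warning branch would presumably turn into an error.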

Signed-off-by: DarkLight1337 <[email protected]>

github-actions · 2024-12-02T04:46:34Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

- Add ready label to the PR
- Enable auto-merge.

🚀

DarkLight1337 · 2024-12-06T15:28:30Z

cc @russellb see if these new doc additions look good to you!

DarkLight1337 · 2024-12-09T16:01:33Z

@Isotr0py would be great if you could review this as well!

Isotr0py

Overall LGTM. Just a comment about the score task.

vllm/config.py

Isotr0py

LGTM now!

DarkLight1337 added 2 commits December 2, 2024 04:40

Split up pooling tasks

8e71af9

Signed-off-by: DarkLight1337 <[email protected]>

Add docs

be5621a

Signed-off-by: DarkLight1337 <[email protected]>

mergify bot added the documentation and frontend labels Dec 2, 2024

DarkLight1337 mentioned this pull request Dec 2, 2024

[RFC]: Make any vLLM model a pooling model #10674

Closed

7 tasks

DarkLight1337 added 11 commits December 2, 2024 09:30

Create a new "Usage" section in the docs

7710e7f

Signed-off-by: DarkLight1337 <[email protected]>

Move models up

6b70f06

Signed-off-by: DarkLight1337 <[email protected]>

Streamline titles

75f366f

Signed-off-by: DarkLight1337 <[email protected]>

Revamp Using VLMs

c5da5fe

Signed-off-by: DarkLight1337 <[email protected]>

Improve link

d4e3eb5

Signed-off-by: DarkLight1337 <[email protected]>

Split up the code blocks

449eef6

Signed-off-by: DarkLight1337 <[email protected]>

Reword

89bd92e

Signed-off-by: DarkLight1337 <[email protected]>

Merge branch 'usage-docs' into split-pooling-tasks

963504b

New doc pages for generative and pooling models

1f4455e

Signed-off-by: DarkLight1337 <[email protected]>

Fix heading

291ae79

Signed-off-by: DarkLight1337 <[email protected]>

Various fixes

aef7899

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 force-pushed the split-pooling-tasks branch 2 times, most recently from b4964ad to 5d3a629 Compare December 2, 2024 16:44

Update

b9dd634

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 force-pushed the split-pooling-tasks branch 2 times, most recently from 3627461 to 39f7d7c Compare December 2, 2024 16:46

Place them under Models for now

11fbad1

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 force-pushed the split-pooling-tasks branch from 39f7d7c to 11fbad1 Compare December 2, 2024 17:02

DarkLight1337 added 6 commits December 2, 2024 17:05

Update

dcac4f2

Signed-off-by: DarkLight1337 <[email protected]>

Update

287c2ba

Signed-off-by: DarkLight1337 <[email protected]>

Reorganize Supported Models

6056ac3

Signed-off-by: DarkLight1337 <[email protected]>

Improve organization

a536fc8

Signed-off-by: DarkLight1337 <[email protected]>

Make score a separate task

3d149e7

Signed-off-by: DarkLight1337 <[email protected]>

Fix

9c4c5fe

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 requested a review from youkaichao as a code owner December 5, 2024 03:31

DarkLight1337 requested review from russellb and removed request for comaniac, njhill, zhuohan123, WoosukKwon and alexm-neuralmagic December 5, 2024 03:31

Merge branch 'main' into split-pooling-tasks

e9cf357

DarkLight1337 requested a review from Isotr0py December 9, 2024 16:01

Isotr0py reviewed Dec 9, 2024

vllm/config.py (outdated, resolved)

Isotr0py reviewed Dec 9, 2024

vllm/config.py (outdated, resolved)

DarkLight1337 added 2 commits December 10, 2024 04:35

Merge branch 'main' into split-pooling-tasks

b730f14

Address comment

3440eb9

Signed-off-by: DarkLight1337 <[email protected]>

Isotr0py approved these changes Dec 10, 2024

DarkLight1337 enabled auto-merge (squash) December 11, 2024 03:39

github-actions bot added the ready label Dec 11, 2024

DarkLight1337 added 2 commits December 11, 2024 05:10

Merge branch 'main' into split-pooling-tasks

3e0b7d1

Fix test failures

4e24920

Signed-off-by: DarkLight1337 <[email protected]>

youkaichao disabled auto-merge December 11, 2024 09:27

youkaichao merged commit 8f10d5e into vllm-project:main Dec 11, 2024
55 of 58 checks passed

DarkLight1337 deleted the split-pooling-tasks branch December 11, 2024 09:28

DarkLight1337 mentioned this pull request Dec 11, 2024

[Doc] Update docs to refer to pooling models #11093

Merged

llsj14 pushed a commit to llsj14/vllm that referenced this pull request Dec 11, 2024

[Misc] Split up pooling tasks (vllm-project#10820)

795ec7e

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 mentioned this pull request Dec 12, 2024

[Frontend] Separate pooling APIs in offline inference #11129

Merged

Akshat-Tripathi pushed a commit to krai/vllm that referenced this pull request Dec 12, 2024

[Misc] Split up pooling tasks (vllm-project#10820)

013f210

Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: Akshat Tripathi <[email protected]>

pooyadavoodi mentioned this pull request Dec 12, 2024

[Bugfix] Use runner_type instead of task in GritLM #11144

Merged

sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024

[Misc] Split up pooling tasks (vllm-project#10820)

8737cde

Signed-off-by: DarkLight1337 <[email protected]>

passaglia mentioned this pull request Dec 24, 2024

[Bug]: Qwen2.5-Math-RM-72B Online Inference Fails #11446

Closed

1 task
