[Misc]Reduce BNB static variable #9987

jeejeelee · 2024-11-04T08:35:22Z

Obtain weight sharding information directly from the model rather than through an extra static variable, which simplifies adding support for BNB (bitsandbytes) quantization.
With this PR, models that support BNB quantization (such as Qwen2) can now directly support TP (tensor parallelism).
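
To illustrate the idea in the description above, here is a minimal sketch of deriving sharding information by inspecting a model's layers instead of reading a hand-maintained static list. Everything in it is an assumption made for illustration: `ColumnShardedLinear`, `RowShardedLinear`, `ToyModel`, and `collect_sharding_info` are hypothetical stand-ins defined locally, not vLLM's classes or the code changed in this PR.

```python
# Hypothetical stand-ins for illustration only; these are NOT vLLM's classes.
class Linear:
    """Base class for a toy linear layer."""


class ColumnShardedLinear(Linear):
    """Toy layer whose weight is split along the output dimension."""


class RowShardedLinear(Linear):
    """Toy layer whose weight is split along the input dimension."""


class ToyModel:
    """Toy model exposing its layers through a named_modules()-style iterator."""

    def __init__(self):
        self.layers = {
            "attn.qkv_proj": ColumnShardedLinear(),
            "attn.o_proj": RowShardedLinear(),
            "mlp.gate_up_proj": ColumnShardedLinear(),
            "mlp.down_proj": RowShardedLinear(),
        }

    def named_modules(self):
        # Yield (name, layer) pairs in insertion order.
        yield from self.layers.items()


def collect_sharding_info(model):
    """Group layer names by sharding style, based on each layer's type."""
    info = {"column": [], "row": []}
    for name, module in model.named_modules():
        if isinstance(module, RowShardedLinear):
            info["row"].append(name)
        elif isinstance(module, ColumnShardedLinear):
            info["column"].append(name)
    return info


sharding = collect_sharding_info(ToyModel())
print(sharding)
```

Because the grouping is computed from the layer types, a new model in this sketch would not need its own per-model list of sharded module names; whether the real loader works this way in detail should be checked against the PR diff.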

ping @mgoin @chenqianfzh

Signed-off-by: Jee Jee Li <[email protected]>

github-actions · 2024-11-04T08:35:34Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

- Add the ready label to the PR
- Enable auto-merge

🚀

mgoin

Much much nicer to do it this way, nice work.

vllm/model_executor/models/gemma.py

Reduce BNB static var

516fb22

Signed-off-by: Jee Jee Li <[email protected]>

DarkLight1337 requested a review from mgoin November 4, 2024 09:44

mgoin approved these changes Nov 4, 2024

Review thread on vllm/model_executor/models/gemma.py (outdated, resolved)

mgoin added the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 4, 2024

Delete newlines

1ff4b0c

Signed-off-by: Jee Jee Li <[email protected]>

jeejeelee requested a review from mgoin November 4, 2024 15:37

mgoin enabled auto-merge (squash) November 4, 2024 15:47

mgoin merged commit fb2716d into vllm-project:main Nov 4, 2024
60 checks passed

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Nov 4, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

7bb6d4e

Signed-off-by: Jee Jee Li <[email protected]>

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Nov 4, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

6fbb2a5

Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

richardsliu pushed a commit to richardsliu/vllm that referenced this pull request Nov 4, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

a2a024e

Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Richard Liu <[email protected]>

jeejeelee deleted the reduce-bnb-static-var branch November 5, 2024 01:25

bigPYJ1151 pushed a commit to bigPYJ1151/vllm that referenced this pull request Nov 5, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

d1f60d9

Signed-off-by: Jee Jee Li <[email protected]>

DarkLight1337 pushed a commit that referenced this pull request Nov 5, 2024

[Misc]Reduce BNB static variable (#9987)

4aa525d

Signed-off-by: Jee Jee Li <[email protected]>

JC1DA pushed a commit to JC1DA/vllm that referenced this pull request Nov 11, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

696d1cf

Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Loc Huynh <[email protected]>

sumitd2 pushed a commit to sumitd2/vllm that referenced this pull request Nov 14, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

5c3e21a

Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Sumit Dubey <[email protected]>

KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

4e67e75

Signed-off-by: Jee Jee Li <[email protected]>

mfournioux pushed a commit to mfournioux/vllm that referenced this pull request Nov 20, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

d6550ee

Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Maxime Fournioux <[email protected]>

tlrmchlsmth pushed a commit to neuralmagic/vllm that referenced this pull request Nov 23, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

27fdf0b

Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Tyler Michael Smith <[email protected]>

sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024

[Misc]Reduce BNB static variable (vllm-project#9987)

980a6a6

Signed-off-by: Jee Jee Li <[email protected]>
