[v1] fix compilation cache #11598

youkaichao · 2024-12-29T07:52:24Z

The compilation cache is not working due to a bug: self.to_be_compiled_sizes should be self.compile_sizes.copy() rather than self.compile_sizes.union(self.capture_sizes).
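A minimal sketch of how the two expressions differ, using made-up sizes. It assumes (this is not stated in the PR description) that a size leaves the pending set only once it has been compiled; the helper name pending_after_compiling is hypothetical.

```python
# Illustrative values only; the real sets come from vLLM's compilation config.
compile_sizes = {1, 2}
capture_sizes = {1, 2, 4, 8}


def pending_after_compiling(pending: set, compiled: set) -> set:
    """Hypothetical helper: sizes still pending once `compiled` are done."""
    return pending - compiled


# Before the fix: capture-only sizes (4 and 8) are tracked as pending too.
before = compile_sizes.union(capture_sizes)
# After the fix: only the sizes that are actually compiled are tracked.
after = compile_sizes.copy()

# Assumption: only sizes in compile_sizes ever get compiled.
left_before = pending_after_compiling(before, compile_sizes)
left_after = pending_after_compiling(after, compile_sizes)

print(sorted(left_before))  # sizes that never drain under the old expression
print(sorted(left_after))   # empty once every compile size is done
```

Under that assumption, any logic that waits for the pending set to become empty would never fire with the old expression, because 4 and 8 are captured but never compiled.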

This should be merged after #11596. Our compilation cache does not consider the code change in the model's code.

Before this PR:

Compiling a graph for general shape takes 12.69 s

After this PR:

Compiling a graph for general shape takes 2.27 s

cc @tlrmchlsmth how can we take all of this code into consideration, e.g. rope / activation etc.?

Signed-off-by: youkaichao <[email protected]>

github-actions · 2024-12-29T07:52:37Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

tlrmchlsmth · 2024-12-29T18:22:38Z

our compilation cache does not consider the code change in the model's code.

I thought this would be handled by adding the version number here. Maybe we need to explicitly add the git hash as well?

vllm/vllm/config.py

Line 3000 in faef77c

vllm_factors.append(__version__)

tlrmchlsmth · 2024-12-29T18:23:49Z

vllm/config.py

+        else:
+            vllm_factors.append("None")


Why do we need to append "None" now?

To be safer. Without the placeholder, these two cases could not be distinguished:

config 1: None, config 2: hash abc

config 1: hash abc, config 2: None
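The collision described above can be reproduced with a small sketch. The helper names and the use of SHA-256 over the string form of the list are assumptions for illustration, not vLLM's actual hashing code.

```python
import hashlib
from typing import List, Optional


def factors_without_placeholder(configs: List[Optional[str]]) -> List[str]:
    # Hypothetical: a config that is None contributes nothing.
    return [c for c in configs if c is not None]


def factors_with_placeholder(configs: List[Optional[str]]) -> List[str]:
    # Hypothetical: a config that is None contributes the string "None".
    return [c if c is not None else "None" for c in configs]


def digest(factors: List[str]) -> str:
    # Assumed hashing scheme, for illustration only.
    return hashlib.sha256(str(factors).encode()).hexdigest()


case_a = [None, "abc"]  # config 1: None, config 2: hash abc
case_b = ["abc", None]  # config 1: hash abc, config 2: None

collide = digest(factors_without_placeholder(case_a)) == digest(
    factors_without_placeholder(case_b))
distinct = digest(factors_with_placeholder(case_a)) != digest(
    factors_with_placeholder(case_b))

print(collide)   # both cases reduce to ["abc"] without the placeholder
print(distinct)  # the placeholder keeps the position of each config
```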

youkaichao · 2024-12-30T00:38:07Z

I thought this would be handled by adding the version number here. Maybe we need to explicitly add the git hash as well?

That would be kind of overkill. For developers (like me), the git hash changes from time to time, which would mean I cannot reuse any compilation cache.
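One possible middle ground between the version number and the git hash, sketched here purely as an assumption and not as what this PR implements, is to hash the contents of only the source files that matter for compilation. The function name and the file names (borrowed from the rope / activation question above) are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List


def source_files_factor(paths: List[Path]) -> str:
    """Hash file names and contents into one digest (hypothetical helper)."""
    h = hashlib.sha256()
    for path in sorted(paths):  # sort so the argument order does not matter
        h.update(path.name.encode())
        h.update(path.read_bytes())
    return h.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    rope = Path(tmp) / "rope.py"
    act = Path(tmp) / "activation.py"
    rope.write_text("def rope(x): return x\n")
    act.write_text("def act(x): return x\n")

    first = source_files_factor([rope, act])
    unchanged = source_files_factor([act, rope])

    # Editing one tracked file changes the factor and would invalidate the cache.
    act.write_text("def act(x): return 2 * x\n")
    changed = source_files_factor([rope, act])

print(first == unchanged)
print(first != changed)
```

With a factor like this, unrelated commits leave the cache reusable, while edits to the tracked files do not.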

fix compilation cache for v1

da05c8a

youkaichao added 2 commits December 29, 2024 17:07

Merge branch 'main' into fix_v1_compile_cache

8a45be8

add additional config

ec8cde1

youkaichao requested a review from tlrmchlsmth December 29, 2024 09:44

tlrmchlsmth reviewed Dec 29, 2024

tlrmchlsmth approved these changes Dec 29, 2024

youkaichao enabled auto-merge (squash) December 30, 2024 00:38

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Dec 30, 2024

fix ranks

3a5aa88

youkaichao requested review from WoosukKwon, robertgshaw2-neuralmagic, njhill, ywang96, comaniac and alexm-neuralmagic as code owners December 30, 2024 02:27

youkaichao merged commit 3682e33 into vllm-project:main Dec 30, 2024
52 checks passed

youkaichao deleted the fix_v1_compile_cache branch December 30, 2024 04:38

BKitor pushed a commit to BKitor/vllm that referenced this pull request Dec 30, 2024

[v1] fix compilation cache (vllm-project#11598)

0b6d538

Signed-off-by: youkaichao <[email protected]>

xcnick pushed a commit to xcnick/vllm that referenced this pull request Dec 31, 2024

[v1] fix compilation cache (vllm-project#11598)

e455973

Signed-off-by: youkaichao <[email protected]> Signed-off-by: xcnick <[email protected]>
