[v1] fix compilation cache #11598
Signed-off-by: youkaichao <[email protected]>
I thought this would be handled by adding the version number here. Maybe we need to explicitly add the git hash as well? Line 3000 in faef77c
    else:
        vllm_factors.append("None")
Why do we need to append "None" now?
To be safer. It distinguishes these two cases:
case A: config 1: None, config 2: hash abc
case B: config 1: hash abc, config 2: None
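A minimal sketch of why the placeholder matters, for illustration only. The helper names `collect_factors` and `cache_key` are hypothetical and are not vLLM's actual code; the only thing taken from the diff is the idea of appending the string "None" when a config has no hash.

```python
import hashlib


def collect_factors(config_hashes, use_placeholder):
    """Build the list of factors that feeds the cache key (hypothetical helper)."""
    factors = []
    for config_hash in config_hashes:
        if config_hash is not None:
            factors.append(config_hash)
        elif use_placeholder:
            # Mirrors the diff: keep a slot for configs that have no hash.
            factors.append("None")
    return factors


def cache_key(factors):
    """Hash the factor list into one key (assumed scheme, not vLLM's)."""
    return hashlib.sha256(str(factors).encode()).hexdigest()


case_a = [None, "abc"]  # config 1: None, config 2: hash abc
case_b = ["abc", None]  # config 1: hash abc, config 2: None

# Without the placeholder both cases collapse to ["abc"] and share a key.
collides_without = (
    cache_key(collect_factors(case_a, False))
    == cache_key(collect_factors(case_b, False))
)
# With the placeholder the position of the missing config is preserved.
collides_with = (
    cache_key(collect_factors(case_a, True))
    == cache_key(collect_factors(case_b, True))
)
print(collides_without, collides_with)  # True False
```

With the placeholder, a missing config still occupies a position in the list, so two different configurations cannot end up with the same key just because one of them is None.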
Adding the git hash would be overkill. For developers (like me), the git hash changes from time to time, which would mean I could never reuse any compilation cache.
Signed-off-by: youkaichao <[email protected]> Signed-off-by: xcnick <[email protected]>
The compilation cache is not working due to a bug: `self.to_be_compiled_sizes` should be `self.compile_sizes.copy()` rather than `self.compile_sizes.union(self.capture_sizes)`.

This should be merged after #11596. Our compilation cache does not take changes to the model's code into account.
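The `to_be_compiled_sizes` fix above can be illustrated with plain Python sets. This is a sketch under an assumption that is not stated in this PR: that the cache is only written once the set of pending sizes becomes empty. The function `pending_after_compiling` and the concrete sizes are hypothetical.

```python
def pending_after_compiling(pending, compiled):
    """Return the sizes still pending after `compiled` sizes are done."""
    remaining = set(pending)  # work on a copy, leave the input untouched
    for size in compiled:
        remaining.discard(size)
    return remaining


compile_sizes = {1, 2}          # sizes that are actually compiled
capture_sizes = {1, 2, 4, 8}    # sizes that are only captured

# Old behavior: capture-only sizes leak into the pending set.
buggy_pending = compile_sizes.union(capture_sizes)
# Fixed behavior: only the sizes that will really be compiled are pending.
fixed_pending = compile_sizes.copy()

buggy_left = pending_after_compiling(buggy_pending, compile_sizes)
fixed_left = pending_after_compiling(fixed_pending, compile_sizes)

print(sorted(buggy_left))  # [4, 8]
print(sorted(fixed_left))  # []
```

Under that assumption, the union leaves sizes 4 and 8 pending forever, so a "compilation finished" condition based on an empty set would never trigger. Using `.copy()` also keeps later mutations of the pending set from changing `compile_sizes` itself.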
Before this PR:
After this PR:
cc @tlrmchlsmth: how can we take all of this code into consideration, e.g. rope, activation, etc.?
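One possible direction, sketched here purely as an illustration and not as what this PR implements: hash the contents of every source file that contributes to the compiled graph and add that hash to the cache key. The function `hash_source_files` and the file name `rope.py` are hypothetical; how to collect the list of relevant files is left open.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_source_files(paths):
    """Hash file names and contents in a stable order (hypothetical helper)."""
    hasher = hashlib.sha256()
    for path in sorted(str(p) for p in paths):
        hasher.update(path.encode())
        hasher.update(Path(path).read_bytes())
    return hasher.hexdigest()


# Demo with a temporary stand-in for a layer implementation file.
with tempfile.TemporaryDirectory() as tmp:
    rope = Path(tmp) / "rope.py"
    rope.write_text("def rotate(x):\n    return x\n")
    before = hash_source_files([rope])
    unchanged = hash_source_files([rope])

    # Editing the file changes the hash, which would invalidate the cache.
    rope.write_text("def rotate(x):\n    return -x\n")
    after = hash_source_files([rope])

print(before == unchanged, before == after)  # True False
```

Unlike the git hash, this only changes when one of the listed files changes, so unrelated commits would not invalidate a developer's cache.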