Fix CUDA graph compilation #627

Merged
tgaddair merged 6 commits into main from fix-compile
Oct 2, 2024
Conversation

@tgaddair (Contributor) commented Oct 2, 2024

CUDA graph compilation has been broken since we added FlashInfer and prefix caching support. This fixes the issues and adds some flexibility to how it works.

@tgaddair requested a review from noyoshi October 2, 2024 22:16
@tgaddair merged commit e3f7d6e into main Oct 2, 2024
1 check passed
@tgaddair deleted the fix-compile branch October 2, 2024 22:32