[Kernel] Use call_jax to simplify the gmm pallas kernel wrapper #9180

Merged
yaochengji merged 2 commits into master from chengji/gmm-use-call_jax
May 16, 2025

yaochengji
Collaborator

No description provided.

@bhavya01 bhavya01 requested a review from miladm May 16, 2025 00:23
@yaochengji
Collaborator Author

For background: PR vllm-project/vllm#18025 tries to enable the torch_xla gmm kernel but ran into a correctness issue.

Simplifying the wrapper to use call_jax can fix it.
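
For anyone checking correctness of a gmm wrapper, here is a minimal pure-Python sketch of what a grouped matrix multiplication computes. It is not the code from this PR and does not use call_jax or Pallas. The function name `gmm_reference` and the shape convention (`lhs` of shape m x k, one k x n weight matrix per group, `group_sizes` summing to m) are assumptions for illustration only.

```python
def gmm_reference(lhs, rhs, group_sizes):
    """Reference grouped matmul on nested lists (hypothetical helper).

    lhs:         m x k matrix; rows are laid out group by group.
    rhs:         list of num_groups matrices, each k x n.
    group_sizes: number of lhs rows belonging to each group; must sum to m.
    Returns an m x n matrix where each row of lhs is multiplied by the
    weight matrix of the group that row belongs to.
    """
    if len(group_sizes) != len(rhs):
        raise ValueError("need one weight matrix per group")
    if sum(group_sizes) != len(lhs):
        raise ValueError("group_sizes must sum to the number of lhs rows")

    out = []
    row = 0
    for group, size in enumerate(group_sizes):
        weights = rhs[group]
        k, n = len(weights), len(weights[0])
        for r in range(row, row + size):
            # Plain row-vector times matrix product for this row's group.
            out.append(
                [sum(lhs[r][i] * weights[i][j] for i in range(k)) for j in range(n)]
            )
        row += size
    return out


# Row 0 belongs to group 0 (identity), rows 1-2 to group 1 (scale by 2).
lhs = [[1, 2], [3, 4], [5, 6]]
rhs = [[[1, 0], [0, 1]], [[2, 0], [0, 2]]]
result = gmm_reference(lhs, rhs, [1, 2])
```

Comparing a kernel's output against a slow reference like this on small inputs, including empty groups, is one way to narrow down a correctness issue.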

@yaochengji
Collaborator Author

Also cc @bythew3i for visibility; it seems I cannot add you as a reviewer.

@yaochengji yaochengji force-pushed the chengji/gmm-use-call_jax branch from b796514 to 27a4c0a Compare May 16, 2025 02:19
Collaborator

@vanbasten23 vanbasten23 left a comment

Thanks Chengji. LGTM with one comment.

@yaochengji yaochengji merged commit f39434a into master May 16, 2025
27 of 29 checks passed
2 participants