Update heuristic for Cutlass BF16 Grouped GEMM #4138
This pull request was exported from Phabricator. Differential Revision: D74836650
Summary: X-link: facebookresearch/FBGEMM#1220 This diff updates the heuristic used for Cutlass BF16 grouped gemm, improving performance in some important shapes. Reviewed By: jianyuh Differential Revision: D74836650
Summary: Pull Request resolved: pytorch#4138 X-link: facebookresearch/FBGEMM#1220 This diff updates the heuristic used for Cutlass BF16 grouped gemm, improving performance in some important shapes. Reviewed By: jianyuh Differential Revision: D74836650
Summary: We plan to make some changes to the heuristics. As a first step, refactor a bit to parallelize kernel compilation, as is done for FP8 rowwise. Differential Revision: D74760416
This pull request has been merged in 841c22c.
- Disable GenAI builds against CUDA 11.8, since GenAI builds against CUDA 11.8.0 can no longer be supported as of pytorch#4138
Summary: X-link: facebookresearch/FBGEMM#1255 - Disable GenAI builds against CUDA 11.8, since GenAI builds against CUDA 11.8.0 can no longer be supported as of #4138 Pull Request resolved: #4173 Reviewed By: jiawenliu64 Differential Revision: D75229752 Pulled By: q10 fbshipit-source-id: e9626799d371ee2671f9062df1933d3caea65087
Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/1220
This diff updates the heuristic used for Cutlass BF16 grouped gemm, improving performance in some important shapes.
Differential Revision: D74836650