iRoPE varseq flag for pre-calculated kv qparams #4160

Aya-ZIbra · 2025-05-20T18:57:04Z

This flag helps avoid the amax calculation in RoPE for decode and partial-prefill batch lanes.

Kernel2 can also rely on it to return early and avoid unnecessary quantization.
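To illustrate the idea, here is a minimal pure-Python sketch of a per-lane flag that skips the amax reduction. It is not the FBGEMM kernel code: the names `select_scales`, `lane_has_qparams`, `lane_scale`, and the `QMAX` constant are hypothetical, and the assumption that flagged lanes reuse a scale already stored for that lane is my reading of the description above.

```python
from typing import List, Optional, Sequence, Tuple

QMAX = 127.0  # hypothetical quantized range; the real kernel's format may differ


def row_amax(row: Sequence[float]) -> float:
    """Absolute maximum of one K/V row (the reduction the flag lets us skip)."""
    return max(abs(x) for x in row)


def select_scales(
    rows: Sequence[Sequence[float]],
    lane_of_row: Sequence[int],
    lane_has_qparams: Sequence[bool],
    lane_scale: Sequence[Optional[float]],
) -> Tuple[List[float], int]:
    """Pick a quantization scale per row.

    Rows whose batch lane is flagged reuse the lane's pre-calculated scale;
    all other rows pay for an amax reduction. Returns the scales and the
    number of amax reductions that were actually performed.
    """
    scales: List[float] = []
    amax_calls = 0
    for row, lane in zip(rows, lane_of_row):
        if lane_has_qparams[lane]:
            # Flag set: qparams already exist for this lane, skip amax.
            scales.append(lane_scale[lane])
        else:
            # Flag not set: derive the scale from this row's amax.
            amax_calls += 1
            scales.append(max(row_amax(row), 1e-8) / QMAX)
    return scales, amax_calls


# Two rows in lane 0 (no pre-calculated qparams), one row in lane 1 (flagged).
rows = [[1.0, -2.0], [0.5, 0.25], [-4.0, 3.0]]
scales, amax_calls = select_scales(
    rows,
    lane_of_row=[0, 0, 1],
    lane_has_qparams=[False, True],
    lane_scale=[None, 0.05],
)
```

In this sketch only the two rows of the unflagged lane trigger an amax reduction; the flagged lane reuses its stored scale. A later stage could check the same flag to exit early for flagged lanes.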

Reviewed By: y-sq

Differential Revision: D73478483

netlify · 2025-05-20T18:57:12Z

Name	Link
🔨 Latest commit	`163169c`
🔍 Latest deploy log	https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/682cdc7011053d0008d9b76f
😎 Deploy Preview	https://deploy-preview-4160--pytorch-fbgemm-docs.netlify.app

facebook-github-bot · 2025-05-20T18:57:18Z

This pull request was exported from Phabricator. Differential Revision: D73478483

Summary:

X-link: facebookresearch/FBGEMM#1240

As title. This is needed to handle this case: https://www.internalfb.com/diff/D73833204?dst_version_fbid=9500286030082255&transaction_fbid=676020828512263

This flag helps avoid the amax calculation in RoPE for decode and partial-prefill batch lanes.

Kernel2 can also rely on it to return early and avoid unnecessary quantization.

Reviewed By: y-sq

Differential Revision: D73478483

facebook-github-bot · 2025-05-20T19:39:19Z

This pull request was exported from Phabricator. Differential Revision: D73478483

Summary:

Pull Request resolved: pytorch#4160

X-link: facebookresearch/FBGEMM#1240

As title. This is needed to handle this case: https://www.internalfb.com/diff/D73833204?dst_version_fbid=9500286030082255&transaction_fbid=676020828512263

This flag helps avoid the amax calculation in RoPE for decode and partial-prefill batch lanes.

Kernel2 can also rely on it to return early and avoid unnecessary quantization.

Reviewed By: y-sq

Differential Revision: D73478483

facebook-github-bot · 2025-05-20T19:47:57Z

This pull request was exported from Phabricator. Differential Revision: D73478483

facebook-github-bot · 2025-05-21T07:39:26Z

This pull request has been merged in c85ea56.

facebook-github-bot added the cla signed label May 20, 2025

facebook-github-bot added the fb-exported label May 20, 2025

Aya-ZIbra force-pushed the export-D73478483 branch from 88e480c to 705f27b Compare May 20, 2025 19:35

Aya-ZIbra force-pushed the export-D73478483 branch from 705f27b to 3a04a01 Compare May 20, 2025 19:36

Aya-ZIbra force-pushed the export-D73478483 branch from 3a04a01 to ee77a32 Compare May 20, 2025 19:39

Aya-ZIbra force-pushed the export-D73478483 branch from ee77a32 to 163169c Compare May 20, 2025 19:47

facebook-github-bot closed this in c85ea56 May 21, 2025

facebook-github-bot added the Merged label May 21, 2025
