Fix QKV dtype in the bwd of FP8+CP #1134
Signed-off-by: Xiaowei Ren <[email protected]>
Can we test both the E4M3 and HYBRID FP8 recipes, please? Can we add that to the unit tests?
This is easy: we only need to change a flag in the FP8 recipe. But I don't think it would add anything to the test. I am comparing the results of CP>1 vs. CP=1. E4M3 and HYBRID share the same code, and I think the HYBRID test covers everything in E4M3. For example, the bug in this PR affects HYBRID only; E4M3 should work because its fwd and bwd dtypes are the same. Maybe I am misunderstanding your point. Is there anything specific to E4M3 that you think the HYBRID test cannot cover?
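To make the argument above concrete, here is a minimal sketch of why a dtype mix-up can only show up under HYBRID. It relies on the fact that HYBRID uses E4M3 in the forward pass and E5M2 in the backward pass, while the E4M3 recipe uses E4M3 for both. The names `Fp8Dtype`, `RECIPE_DTYPES`, `saved_qkv_dtype`, `grad_dtype`, and `mislabel_possible` are hypothetical and are not Transformer Engine APIs; the assumption that the bug amounted to treating forward-saved QKV as if it had the backward dtype is an inference from the PR title, not something the PR states.

```python
from enum import Enum


class Fp8Dtype(Enum):
    E4M3 = "e4m3"  # 4 exponent bits, 3 mantissa bits
    E5M2 = "e5m2"  # 5 exponent bits, 2 mantissa bits


# Hypothetical lookup table: recipe name -> (forward dtype, backward dtype).
RECIPE_DTYPES = {
    "E4M3": (Fp8Dtype.E4M3, Fp8Dtype.E4M3),
    "HYBRID": (Fp8Dtype.E4M3, Fp8Dtype.E5M2),
}


def saved_qkv_dtype(recipe: str) -> Fp8Dtype:
    """QKV tensors saved during fwd keep the forward dtype, even when read in bwd."""
    return RECIPE_DTYPES[recipe][0]


def grad_dtype(recipe: str) -> Fp8Dtype:
    """Gradients produced in bwd use the backward dtype."""
    return RECIPE_DTYPES[recipe][1]


def mislabel_possible(recipe: str) -> bool:
    """True if tagging saved QKV with the bwd dtype would be wrong (assumed bug)."""
    return saved_qkv_dtype(recipe) != grad_dtype(recipe)


for name in RECIPE_DTYPES:
    print(name, mislabel_possible(name))
```

Under this model, only HYBRID can expose the mismatch, which is consistent with the claim that a HYBRID test also covers the E4M3 path.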
/te-ci pytorch
/te-ci pytorch
/te-ci pytorch
LGTM
* fix qkv_dtype of FP8+CP
  Signed-off-by: Xiaowei Ren <[email protected]>
* config cp correction dtype of FP8+CP
  Signed-off-by: Xiaowei Ren <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* code style change
  Signed-off-by: Xiaowei Ren <[email protected]>
* always do FP8 CP correction in FP32
  Signed-off-by: Xiaowei Ren <[email protected]>
---------
Signed-off-by: Xiaowei Ren <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Charlene Yang <[email protected]>
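The last commit keeps the FP8 CP correction in FP32. As a rough illustration of what such a correction step involves, and assuming "CP correction" refers to the usual log-sum-exp rescaling that merges per-step partial attention outputs (the PR does not spell this out), the sketch below merges two KV chunks for a single query row. The helpers `attention_chunk` and `merge_chunks` are hypothetical, and plain Python floats stand in for the higher-precision accumulator; this is not the code changed in this PR.

```python
import math


def attention_chunk(scores, values):
    """Softmax-weighted sum of `values` for one query row over one KV chunk.

    Returns (out, lse), where lse is the log-sum-exp of the chunk's scores.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps)
    dim = len(values[0])
    out = [sum(e * v[i] for e, v in zip(exps, values)) / denom for i in range(dim)]
    return out, m + math.log(denom)


def merge_chunks(partials):
    """Combine per-chunk (out, lse) pairs into the full-attention result."""
    m = max(lse for _, lse in partials)
    lse = m + math.log(sum(math.exp(l - m) for _, l in partials))
    dim = len(partials[0][0])
    out = [0.0] * dim
    for chunk_out, chunk_lse in partials:
        weight = math.exp(chunk_lse - lse)  # this chunk's share of the softmax mass
        for i in range(dim):
            out[i] += weight * chunk_out[i]
    return out, lse


scores = [0.5, -1.0, 2.0, 0.1]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 3.0]]
full_out, full_lse = attention_chunk(scores, values)
merged_out, merged_lse = merge_chunks([
    attention_chunk(scores[:2], values[:2]),
    attention_chunk(scores[2:], values[2:]),
])
```

The merged result should match attention over the whole sequence up to rounding error. Because each chunk is rescaled by an exponential of an LSE difference, doing this arithmetic in a low-precision format would compound rounding error across CP steps, which is one plausible motivation for fixing the correction dtype to FP32.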
Description
Fix the QKV dtype in the bwd of FP8+CP.