[PyTorch] Normalization ops #1033
This seems a little backwards. If `dtype` is supposed to be the new argument name, then why is it in `kwargs`? Both `params_dtype` and `dtype` should be regular parameters; there should be a deprecation warning when somebody uses `params_dtype`, and also the check for duplicate assignment like the one you have here.

Also, similar treatment should be done for `hidden_size` and `sequence_parallel` (especially the last one, which seems to be gone completely, so there should be some explanation that it was unused before, or something similar).
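The pattern described here could be sketched roughly as follows. This is a hypothetical signature for illustration only, not the actual module's API: both `dtype` and the deprecated `params_dtype` appear as regular parameters, with a deprecation warning and a duplicate-assignment check.

```python
import warnings

class LayerNorm:
    """Illustrative sketch only; parameter names and defaults are assumptions."""

    def __init__(self, hidden_size, eps=1e-5, dtype=None, params_dtype=None):
        if params_dtype is not None:
            # Warn whenever the deprecated name is used.
            warnings.warn(
                "params_dtype is deprecated, use dtype instead",
                DeprecationWarning,
            )
            # Reject duplicate assignment of the same option.
            if dtype is not None:
                raise RuntimeError(
                    "Both dtype and params_dtype were specified; "
                    "they are aliases for the same option"
                )
            dtype = params_dtype
        self.hidden_size = hidden_size
        self.eps = eps
        self.dtype = dtype
```

With this shape, `LayerNorm(1024, params_dtype=x)` still works (with a warning), `LayerNorm(1024, dtype=x)` is the new spelling, and passing both fails loudly.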
My thinking is that we should forward kwargs directly to `te.ops.LayerNorm` as much as possible, so that we only have to change the API in one place if we ever make changes in the future. We include the deprecated options as explicit kwargs since they are specific to the module.

This function signature also maintains backward compatibility for users who pass in the options as positional args, e.g.:
TransformerEngine/tests/pytorch/test_onnx_export.py, lines 676 to 678 in 0ee5ccd
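The forwarding approach described above could be sketched roughly as follows. Names, parameter positions, and the inner-op class are illustrative assumptions, not the actual TransformerEngine signatures: deprecated module-specific options stay as explicit parameters in their original positional slots (so old positional calls keep working), while everything else passes straight through to the inner op.

```python
import warnings

class _InnerLayerNormOp:
    """Stand-in for te.ops.LayerNorm; for illustration only."""

    def __init__(self, hidden_size, eps=1e-5, dtype=None):
        self.hidden_size = hidden_size
        self.eps = eps
        self.dtype = dtype

class LayerNormModule:
    """Module wrapper that forwards unrecognized kwargs to the inner op."""

    def __init__(self, hidden_size, eps=1e-5, sequence_parallel=None,
                 params_dtype=None, **kwargs):
        # Deprecated options are explicit (and keep their old positional
        # slots for backward compatibility); everything else is forwarded
        # untouched, so a future change to the op's API needs no change here.
        if params_dtype is not None:
            warnings.warn("params_dtype is deprecated, use dtype instead",
                          DeprecationWarning)
            if "dtype" in kwargs:
                raise RuntimeError(
                    "Both dtype and params_dtype were specified"
                )
            kwargs["dtype"] = params_dtype
        if sequence_parallel is not None:
            warnings.warn("sequence_parallel is deprecated and has no effect",
                          DeprecationWarning)
        self.op = _InnerLayerNormOp(hidden_size, eps=eps, **kwargs)
```

An old-style positional call such as `LayerNormModule(1024, 1e-5, False, some_dtype)` then still resolves `some_dtype` to the inner op's `dtype`, while new code passes `dtype=` and any future op options by keyword.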