Add float8 datatype to XPU OPs #1393
base: main
Conversation
kernels related: `_local_scalar_dense`, `foreach_tensor_copy`, `cat_xpu`, `where_xpu`
@yucai-intel: What is this comment trying to say? Could you please update the PR description to highlight which XPU ops will be available for the FP8 datatype after this PR merges, and note whether any ops will remain unavailable with FP8, so we know to expect further PRs bringing support for them?
AT_DISPATCH_SWITCH(                  \
    TYPE,                            \
    NAME,                            \
    AT_PRIVATE_CASE_TYPE_USING_HINT( \
Why is the indentation below so strange? If I read the definition correctly, it expands to the regular `case <id>: { ... }` form, so each item can be placed directly below the previous one. Did you just copy this from https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu#L288? I suggest fixing the indentation here for the SYCL kernel.
#define AT_PRIVATE_CASE_TYPE_USING_HINT(enum_type, HINT, ...) \
case enum_type: { \
AT_PRIVATE_CHECK_SELECTIVE_BUILD(enum_type); \
using HINT [[maybe_unused]] = c10::impl::ScalarTypeToCPPTypeT<enum_type>; \
return __VA_ARGS__(); \
}
});

// AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND3( |
What's the reason for leaving this commented out? Either remove it or add an in-source comment explaining why it might be needed.
kernels related:
- `_local_scalar_dense`
- `foreach_tensor_copy`
- `cat_xpu`
- `where_xpu`