Top k softmax #175

Open
wants to merge 15 commits into
base: sycl-develop

Collaborator

@t4c1 commented Dec 16, 2024

Implements Top K Softmax for PVC. This includes extending xe_epilogue to call the EVT interfaces required for this operation, and fixing a bug in the generic version of the Top K Softmax epilogue.
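For readers unfamiliar with the operation, here is a minimal host-side reference sketch of top-k softmax on a single row. It assumes the usual semantics: the k largest entries of the row are kept, softmax is computed over those entries only, and every other entry is set to zero. The function name `topk_softmax_row` and the tie-breaking rule (earlier indices win) are illustrative assumptions, not code from this PR or from the epilogue it changes.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical reference helper (not part of the PR): top-k softmax of one row.
// Entries outside the k largest become 0; the kept entries sum to 1.
std::vector<float> topk_softmax_row(const std::vector<float>& row, std::size_t k) {
    std::vector<float> out(row.size(), 0.0f);
    if (row.empty() || k == 0) {
        return out;
    }
    k = std::min(k, row.size());

    // Find the k-th largest value; it acts as the selection threshold.
    std::vector<float> scratch(row);
    std::nth_element(scratch.begin(), scratch.begin() + (k - 1), scratch.end(),
                     std::greater<float>());
    const float threshold = scratch[k - 1];

    // The row maximum is always among the kept entries; subtracting it
    // keeps the exponentials numerically stable.
    const float row_max = *std::max_element(row.begin(), row.end());

    // Values equal to the threshold may repeat, so cap how many ties are kept
    // to select exactly k entries (assumption: earlier indices win).
    const std::size_t strictly_greater = static_cast<std::size_t>(
        std::count_if(row.begin(), row.end(),
                      [threshold](float v) { return v > threshold; }));
    std::size_t ties_allowed = k - strictly_greater;

    float sum = 0.0f;
    for (std::size_t i = 0; i < row.size(); ++i) {
        bool keep = row[i] > threshold;
        if (!keep && row[i] == threshold && ties_allowed > 0) {
            keep = true;
            --ties_allowed;
        }
        if (keep) {
            out[i] = std::exp(row[i] - row_max);
            sum += out[i];
        }
    }

    // sum >= 1 because the maximum contributes exp(0) = 1.
    for (float& v : out) {
        v /= sum;
    }
    return out;
}
```

A reference like this can be handy for checking a fused GEMM epilogue against a plain host computation, row by row, within a floating-point tolerance.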

cutlass_example_add_executable(
61_hopper_gemm_with_topk_and_softmax
61_hopper_gemm_with_topk_and_softmax.cu
)
else()
Collaborator

elseif (SYCL_NVIDIA_TARGET)

Collaborator Author

This else branch is for Intel, not NVIDIA GPUs (actually, it should be generic).

@@ -193,6 +193,7 @@ struct Sm90TmaWarpSpecializedBiasElementwise {
#if defined (SYCL_INTEL_TARGET)
struct IntelPVCEpilogue {
static constexpr int SubgroupSize = 16;
static constexpr int FragmentSize = 8;
Collaborator

What is FragmentSize? Is it always 8 or does it depend on the MMATiled definition?

Collaborator Author

It is the number of elements processed by one call to the epilogue. In xe_epilogue it is defined to depend on the MMA atom size.
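To illustrate how such a value could follow from the atom size rather than being hard-coded, here is a small compile-time sketch. It assumes the atom's output tile is 8 x 16 and that the tile is split evenly across a 16-wide subgroup; both the names (`ExampleAtomShape`, `kSubgroupSize`, `fragment_size`) and the formula are hypothetical stand-ins, not the actual definition in xe_epilogue.

```cpp
// Hypothetical stand-in for an MMA atom's M x N output tile (assumed 8 x 16).
struct ExampleAtomShape {
    static constexpr int M = 8;
    static constexpr int N = 16;
};

// Assumed subgroup width, matching the SubgroupSize shown in the diff.
constexpr int kSubgroupSize = 16;

// Elements each work-item would hold if the atom's output tile were divided
// evenly across the subgroup (an assumption made for this sketch).
template <class AtomShape, int SubgroupSz>
constexpr int fragment_size() {
    static_assert((AtomShape::M * AtomShape::N) % SubgroupSz == 0,
                  "atom tile must divide evenly across the subgroup");
    return (AtomShape::M * AtomShape::N) / SubgroupSz;
}

// Under these assumptions the result matches the hard-coded value in the diff.
static_assert(fragment_size<ExampleAtomShape, kSubgroupSize>() == 8,
              "8 * 16 / 16 == 8");
```

If the struct in the diff derived its value this way, a change of atom shape would update the fragment size automatically. Whether that is practical here depends on what IntelPVCEpilogue has access to at that point.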
