Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Online softmax #164

Merged
merged 19 commits into from
Dec 11, 2024
Merged

Conversation

t4c1
Copy link
Collaborator

@t4c1 t4c1 commented Dec 3, 2024

Add cutlass 3 implementation of online softmax that works on sm80 to example 35.

Copy link
Collaborator

@joeatodd joeatodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work @t4c1 👍

include/cutlass/gpu_generics.h Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/gemm_softmax_adapter.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/gemm_softmax_adapter.hpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@FMarno FMarno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just nits from me

include/cutlass/gpu_generics.h Outdated Show resolved Hide resolved
examples/35_gemm_softmax/gemm_online_softmax.cpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/gemm_online_softmax.cpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_finalize.hpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@aacostadiaz aacostadiaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job!!!!

I left some minor comments but in general it is looking pretty good.

examples/35_gemm_softmax/gemm_online_softmax.cpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/gemm_online_softmax.cpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/gemm_softmax_adapter.hpp Outdated Show resolved Hide resolved
examples/35_gemm_softmax/softmax_epilogue.hpp Outdated Show resolved Hide resolved
include/cutlass/fast_math.h Outdated Show resolved Hide resolved
include/cutlass/gpu_generics.h Outdated Show resolved Hide resolved
Copy link
Collaborator

@aacostadiaz aacostadiaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!!!

@joeatodd joeatodd self-requested a review December 11, 2024 11:36
Copy link
Collaborator

@joeatodd joeatodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM nice work 👍

@aacostadiaz aacostadiaz merged commit c76cf17 into codeplaysoftware:sycl-develop Dec 11, 2024
5 checks passed
syncthreads();

// assumption for reductions: size<0>(sC) == block size
assert(size<0>(sC) == BlockDimX() * BlockDimy() * BlockDimZ());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a typo here @t4c1 "BlockDimy -> BlockDimY" ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, opened a new PR to fix it: #176

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants