@sanchitintel sanchitintel commented Sep 13, 2025

Background

In a GEMM, the canonical shape of the B matrix is [K, N], so a RowMajor B usually means a contiguous [K, N] array.

In the current GEMM kernel and MMA collective examples, such as those based on 00_bmg_gemm.cpp, the code always treats B as shaped [N, K] (perhaps to conform to the CuTe convention that B is always shaped [N, K]?). B is tagged row-major, but as an [N, K] matrix it is not contiguous along K: the physical layout in memory is [K, N] with stride [N, 1], so the [N, K] view is effectively a transpose of the stored data.

We could also support a column-major B, in which case the data in memory really is laid out as [N, K].
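To make the two layouts concrete, here is a minimal index-math sketch in plain C++. It does not use CUTLASS or CuTe; `View2D` and the two factory functions are hypothetical helpers written only for illustration. The [1, N] strides of the existing case are derived from the physical [K, N] layout with stride [N, 1] described above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a CUTLASS/CuTe type): a 2-D view described by
// its shape and its strides, both counted in elements.
struct View2D {
  std::size_t shape0, shape1;
  std::size_t stride0, stride1;

  // Linear offset in memory of the logical coordinate (i, j).
  std::size_t offset(std::size_t i, std::size_t j) const {
    assert(i < shape0 && j < shape1);
    return i * stride0 + j * stride1;
  }
};

// Existing case: memory holds a contiguous [K, N] array (stride [N, 1]).
// Addressed as an [N, K] view, coordinate (n, k) therefore has strides [1, N].
inline View2D b_row_major_as_nk(std::size_t N, std::size_t K) {
  return View2D{N, K, 1, N};
}

// Added case: memory holds a contiguous [N, K] array, so the [N, K] view
// has strides [K, 1].
inline View2D b_column_major_as_nk(std::size_t N, std::size_t K) {
  return View2D{N, K, K, 1};
}
```

With N = 4 and K = 3, the coordinate (n, k) = (2, 1) lands at a different offset in each case, which is why the two layouts need different handling when B is loaded.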

Summary of changes

This PR also adds a case in which B, shaped [N, K], is contiguous with a stride of [K, 1], because this layout is common in LLMs. Supporting it requires two code changes:

  1. the layout of B becomes cutlass::layout::ColumnMajor, and
  2. a different copy atom is used for B.

00_bmg_gemm.cpp now runs both of these cases.
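As a rough illustration of what running both cases means numerically, the following host-side sketch computes a reference GEMM with B addressed through an [N, K] view and explicit strides. It is not the code from 00_bmg_gemm.cpp and uses no CUTLASS API; `reference_gemm` and `repack_kn_to_nk` are hypothetical names, and A is assumed to be a contiguous [M, K] array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C[m, n] = sum over k of A[m, k] * B(n, k), where B is addressed as an
// [N, K] view with strides (stride_n, stride_k), counted in elements.
// Hypothetical helper for illustration only.
std::vector<float> reference_gemm(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  std::size_t M, std::size_t N, std::size_t K,
                                  std::size_t stride_n, std::size_t stride_k) {
  assert(A.size() == M * K && B.size() == N * K);
  std::vector<float> C(M * N, 0.0f);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < K; ++k) {
        acc += A[m * K + k] * B[n * stride_n + k * stride_k];
      }
      C[m * N + n] = acc;
    }
  }
  return C;
}

// Repack a contiguous [K, N] buffer into a contiguous [N, K] buffer, so the
// same logical B can be fed to both cases.
std::vector<float> repack_kn_to_nk(const std::vector<float>& B_kn,
                                   std::size_t N, std::size_t K) {
  assert(B_kn.size() == N * K);
  std::vector<float> B_nk(N * K);
  for (std::size_t k = 0; k < K; ++k) {
    for (std::size_t n = 0; n < N; ++n) {
      B_nk[n * K + k] = B_kn[k * N + n];
    }
  }
  return B_nk;
}
```

Calling `reference_gemm` with strides (1, N) on the [K, N] buffer and with strides (K, 1) on the repacked [N, K] buffer should produce the same C, since both describe the same logical B.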

Thanks!

@sanchitintel changed the title from "Also use contiguous B matrix in 00_bmg_gemm.cpp" to "Also use column-major B matrix in 00_bmg_gemm.cpp" on Sep 13, 2025
remove copy-paste remnant
@sanchitintel changed the title from "Also use column-major B matrix in 00_bmg_gemm.cpp" to "Also use column-major B matrix in the example 00_bmg_gemm.cpp" on Sep 16, 2025