[Kernel] Remove if-else with identical branches in marlin 2:4 (vllm-project#10687)

Signed-off-by: Tyler Michael Smith <[email protected]>
Signed-off-by: Andrew Feldman <[email protected]>
tlrmchlsmth authored and afeldman-nm committed Dec 2, 2024
1 parent a809ee1 commit 57485ba
Showing 1 changed file with 3 additions and 7 deletions.
10 changes: 3 additions & 7 deletions csrc/quantization/marlin/sparse/marlin_24_cuda_kernel.cu
@@ -296,13 +296,9 @@ __global__ void Marlin_24(
   // We use a different scale layout for grouped and column-wise quantization as
   // we scale a `half2` tile in column-major layout in the former and in
   // row-major in the latter case.
-  if (group_blocks != -1) {
-    s_sh_rd = 8 * ((threadIdx.x / 32) % (thread_n_blocks / 4)) +
-              (threadIdx.x % 32) / 4;
-  } else {
-    s_sh_rd = 8 * ((threadIdx.x / 32) % (thread_n_blocks / 4)) +
-              (threadIdx.x % 32) / 4;
-  }
+  s_sh_rd = 8 * ((threadIdx.x / 32) % (thread_n_blocks / 4)) +
+            (threadIdx.x % 32) / 4;  // Note that in the original Marlin kernel
+                                     // this is (threadIdx.x % 32) / 4
 
   // Precompute which thread should not read memory in which iterations; this is
   // needed if there are more threads than required for a certain tilesize or
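
As a quick sanity check on the hunk above, the following host-side C++ sketch mirrors the removed if/else and the single remaining expression, then confirms that they produce the same `s_sh_rd` for every thread index. The function names, the `tid` parameter (standing in for `threadIdx.x`), and the sample values for `thread_n_blocks` and `group_blocks` are assumptions made for illustration; none of them come from the kernel itself.

```cpp
#include <cassert>

// Hypothetical host-side mirror of the expression kept by the commit.
// `tid` stands in for threadIdx.x. Assumes thread_n_blocks >= 4 so that
// (thread_n_blocks / 4) is non-zero and the modulo is well defined.
int s_sh_rd_new(int tid, int thread_n_blocks) {
  return 8 * ((tid / 32) % (thread_n_blocks / 4)) + (tid % 32) / 4;
}

// Hypothetical mirror of the removed code: both branches are copied from
// the deleted lines and compute the same value.
int s_sh_rd_old(int tid, int thread_n_blocks, int group_blocks) {
  int s_sh_rd;
  if (group_blocks != -1) {
    s_sh_rd = 8 * ((tid / 32) % (thread_n_blocks / 4)) + (tid % 32) / 4;
  } else {
    s_sh_rd = 8 * ((tid / 32) % (thread_n_blocks / 4)) + (tid % 32) / 4;
  }
  return s_sh_rd;
}

// Compares old and new for every thread index and for both a column-wise
// (-1) and a grouped (illustrative value 4) setting of group_blocks.
bool branches_agree(int num_threads, int thread_n_blocks) {
  const int group_block_samples[] = {-1, 4};
  for (int group_blocks : group_block_samples) {
    for (int tid = 0; tid < num_threads; ++tid) {
      if (s_sh_rd_old(tid, thread_n_blocks, group_blocks) !=
          s_sh_rd_new(tid, thread_n_blocks)) {
        return false;
      }
    }
  }
  return true;
}
```

Because the index never depended on `group_blocks`, dropping the conditional leaves the kernel's behavior unchanged; the check only makes that explicit.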