What's the purpose of CUB_SUBSCRIPTION_FACTOR? #1060
-
I'm really confused about CUB_SUBSCRIPTION_FACTOR (https://github.com/NVIDIA/cub/blob/b2e8bccb8c0cd15279974fe4b9b8d6fcd1842b57/cub/device/dispatch/dispatch_reduce.cuh#L753). Can you explain to me why
-
The partitioning of work across CTAs in our reduce is static: it is a function of problem size and architecture, with no work stealing. The subscription factor is intended to improve load balancing. Imagine a GPU with 2 SMs, each holding only one CTA at a time. If we launch only 2 CTAs and one of them gets all the cheap work, it finishes early and its SM sits idle, leaving the GPU underutilized. If instead we launch more CTAs than can be resident at once, a pending CTA takes the place of the finished one.
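To make the 2-SM example concrete, here is a minimal host-side sketch in plain C++. It is a toy model, not CUB's actual dispatch logic. The function `simulate_makespan`, its parameters, and the per-item cost model are all hypothetical and exist only for illustration. It assumes each SM runs one CTA at a time, the work is split evenly and contiguously across CTAs, and pending CTAs are handed out in order to whichever SM frees up first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <queue>
#include <vector>

// Hypothetical helper (not part of CUB): returns the time at which the last
// CTA finishes when `item_cost` is split statically into
// num_sms * subscription_factor contiguous chunks, one chunk per CTA.
// Assumption: each SM holds exactly one CTA at a time.
long simulate_makespan(const std::vector<long>& item_cost,
                       std::size_t num_sms,
                       std::size_t subscription_factor)
{
    assert(num_sms > 0 && subscription_factor > 0);

    const std::size_t num_ctas = num_sms * subscription_factor;
    const std::size_t n = item_cost.size();

    // Min-heap of the times at which each SM becomes free.
    std::priority_queue<long, std::vector<long>, std::greater<long>> sm_free_at;
    for (std::size_t sm = 0; sm < num_sms; ++sm) {
        sm_free_at.push(0);
    }

    long makespan = 0;
    for (std::size_t cta = 0; cta < num_ctas; ++cta) {
        // Static even split: this CTA owns items [begin, end).
        const auto begin = static_cast<std::ptrdiff_t>(cta * n / num_ctas);
        const auto end = static_cast<std::ptrdiff_t>((cta + 1) * n / num_ctas);
        const long cta_cost = std::accumulate(item_cost.begin() + begin,
                                              item_cost.begin() + end, 0L);

        // The next pending CTA starts on the SM that frees up earliest.
        const long start = sm_free_at.top();
        sm_free_at.pop();
        const long finish = start + cta_cost;
        sm_free_at.push(finish);
        makespan = std::max(makespan, finish);
    }
    return makespan;
}
```

With item costs `{1, 1, 1, 1, 3, 3, 3, 3}` on 2 SMs, a factor of 1 gives chunks costing 4 and 12, so the run takes 12 time units while one SM idles for 8 of them. A factor of 2 gives chunks costing 2, 2, 6, 6, and the run takes 8 time units, which matches the ideal of 16 / 2 for this input.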