Add allreduce runtime support for nvshmem reduction on-stream api #21973

Tixxx · 2025-01-28T18:57:58Z

This PR adds runtime support for the nvshmem reduction on-stream API.
It adds a new backend config that instructs the emitter to lower to nvshmem, with corresponding collective thunks that call the nvshmem reduction API.

google-cla · 2025-01-28T18:58:03Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

jprabhas · 2025-01-29T20:58:50Z

xla/service/gpu/gpu_memory_space_assignment.h

+    is_nvshmem_collective = backend_config.backend() == CollectiveBackendConfig::NVSHMEM;
+  }
+  return (alias->instruction()->opcode() == HloOpcode::kCustomCall &&
+      alias->instruction()->custom_call_target() == "mosaic_gpu") || is_nvshmem_collective;


why are mosaic_gpu custom calls in the same category as nvshmem ops?

Tixxx · 2025-01-29

This is from the allocator branch. We use the nvshmem allocator for mosaic_gpu custom calls and for native nvshmem collectives. I think the Pallas kernel will use a more specific custom call name instead of mosaic_gpu, so this check will change once that lands in the other PR and this branch is rebased on it.
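For readers following the thread, the condition being discussed can be sketched in isolation. This is only an illustration under stated assumptions: `Instruction`, `CollectiveBackend`, and `UsesNvshmemAllocator` are hypothetical stand-ins invented here, not the XLA types or the code in this PR. Only the `"mosaic_gpu"` target string and the idea of an NVSHMEM backend setting come from the diff above.

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT the XLA types.
enum class CollectiveBackend { kDefault, kNvshmem };

struct Instruction {
  bool is_custom_call = false;
  std::string custom_call_target;  // only meaningful for custom calls
  bool is_collective = false;
  CollectiveBackend backend = CollectiveBackend::kDefault;  // only meaningful for collectives
};

// Returns true when the buffer behind `instr` would be placed with the
// nvshmem allocator, mirroring the two cases named in the reply:
// mosaic_gpu custom calls, or collectives configured for the NVSHMEM backend.
bool UsesNvshmemAllocator(const Instruction& instr) {
  const bool is_mosaic_gpu_call =
      instr.is_custom_call && instr.custom_call_target == "mosaic_gpu";
  const bool is_nvshmem_collective =
      instr.is_collective && instr.backend == CollectiveBackend::kNvshmem;
  return is_mosaic_gpu_call || is_nvshmem_collective;
}
```

Keeping the two cases as separately named booleans would make it straightforward to swap the `"mosaic_gpu"` match for a more specific custom call name later, which is the change the reply anticipates.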

Added more fixes on top of base branch, needs one last rebase

Tixxx force-pushed the tixxx/nvshmem_ar branch 2 times, most recently from 11f965b to 74f118c Compare January 28, 2025 19:15

jprabhas reviewed Jan 29, 2025

Tixxx force-pushed the tixxx/nvshmem_ar branch 4 times, most recently from 36e5e7d to 52701d0 Compare February 8, 2025 00:37

Tixxx force-pushed the tixxx/nvshmem_ar branch from 52701d0 to e4b7e73 Compare February 12, 2025 06:42

commit nvshmem allreduce runtime and lowering

e4b7e73

