Introduce multi-node SPMD initialization for Neuron #8046

Draft
wants to merge 1 commit into
base: master
from

Conversation

rpsilva-aws
Contributor

This PR adds a new initialization path that supports multi-node SPMD on Neuron. To keep the change small, we retain the xla.init() API and introduce a reinitialization step, for PJRT alone, that runs once SPMD is enabled. Because SPMD is enabled after the initial Neuron initialization, the environment must be reconfigured at that point whenever the user did not explicitly set XLA_USE_SPMD (via is_spmd(), as is currently recommended). Under the hood, both APIs guarantee that the environment is correctly configured when SPMD is enabled.
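As a rough illustration of the ordering described above, here is a minimal, self-contained sketch. It does not use the real torch_xla or Neuron code: the class `NeuronInitSketch`, its methods, and the `"per-process"` / `"spmd"` labels are hypothetical names chosen for this example, and the assumption that `XLA_USE_SPMD=1` marks an explicit opt-in is mine.

```python
class NeuronInitSketch:
    """Toy model of the init / reinit ordering; not the actual plugin code."""

    def __init__(self, env):
        # `env` stands in for the process environment (a plain dict here).
        self.env = env
        self.spmd_configured = False
        self.pjrt_configs = []  # records every PJRT (re)configuration

    def _configure_pjrt(self, spmd):
        # Hypothetical placeholder for setting up the PJRT environment.
        self.pjrt_configs.append("spmd" if spmd else "per-process")
        self.spmd_configured = spmd

    def init(self):
        # Initial path: SPMD is only visible here if the user set
        # XLA_USE_SPMD explicitly before initialization (assumed value "1").
        self._configure_pjrt(self.env.get("XLA_USE_SPMD") == "1")

    def enable_spmd(self):
        # SPMD is enabled after init(); reconfigure PJRT only if the
        # initial configuration did not already account for it.
        self.env["XLA_USE_SPMD"] = "1"
        if not self.spmd_configured:
            self._configure_pjrt(True)


late = NeuronInitSketch(env={})
late.init()
late.enable_spmd()

early = NeuronInitSketch(env={"XLA_USE_SPMD": "1"})
early.init()
early.enable_spmd()
```

In this sketch, `late` is configured twice (once at init, once when SPMD is enabled), while `early` is configured only once because the flag was already set before initialization.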

@rpsilva-aws rpsilva-aws force-pushed the rpsilva-aws_neuron_multi_node_spmd branch from 92be233 to fd39924 Compare September 20, 2024 01:34