Change sharding annotation for activation_embed_and_logits_batch #1015

Merged
merged 1 commit into from
Nov 6, 2024

Conversation

khatwanimohit
Collaborator

No description provided.

jonb377
jonb377 previously approved these changes Nov 6, 2024
@@ -214,7 +214,7 @@ logical_axis_rules: [
# For pipeline parallelism the pre and post decoder layer tensors' batch dimension is sharded by stages.
# Microbatches are sharded by stage, so moving out of and into this sharding should be a local reshape.
# The "stage" needs to be listed first since the microbatch dimension is first before the reshape.
Collaborator

Oh I didn't see this comment, I'll defer to @gobbleturk.

Collaborator

This should be fine; perhaps the order here needs to match the order of the mesh axes.
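
To illustrate why the axis order in the rule can matter, here is a minimal pure-Python sketch (not MaxText code). It assumes the usual convention that, when one tensor dimension is sharded over several mesh axes, the first-listed axis is the major (slowest-varying) one. The helper names `shard_owner` and `stages_per_microbatch`, the mesh sizes, and the batch size are all hypothetical and chosen only for illustration.

```python
def shard_owner(index, dim_size, axes, mesh_shape):
    """Return the mesh coordinates that own `index` of a dimension sharded over `axes`.

    Assumption: the first-listed mesh axis is the major (slowest-varying) one.
    """
    num_shards = 1
    for name in axes:
        num_shards *= mesh_shape[name]
    assert dim_size % num_shards == 0, "dimension must divide evenly across shards"
    shard = index // (dim_size // num_shards)
    coords = {}
    # Peel off the minor (last-listed) axis first, then move toward the major axis.
    for name in reversed(axes):
        shard, coords[name] = divmod(shard, mesh_shape[name])
    return coords


def stages_per_microbatch(axes, mesh_shape, batch, num_microbatches):
    """For each contiguous microbatch, list the pipeline stages holding its rows."""
    mb_size = batch // num_microbatches
    result = []
    for m in range(num_microbatches):
        rows = range(m * mb_size, (m + 1) * mb_size)
        stages = {shard_owner(i, batch, axes, mesh_shape)["stage"] for i in rows}
        result.append(sorted(stages))
    return result


# Hypothetical 2x2 mesh: two pipeline stages, two data-parallel replicas.
mesh_shape = {"stage": 2, "data": 2}

stage_first = stages_per_microbatch(("stage", "data"), mesh_shape, batch=8, num_microbatches=2)
data_first = stages_per_microbatch(("data", "stage"), mesh_shape, batch=8, num_microbatches=2)

print(stage_first)  # [[0], [1]]
print(data_first)   # [[0, 1], [0, 1]]
```

With "stage" listed first, each microbatch lives entirely on one stage, so splitting the batch dimension into a leading microbatch dimension needs no cross-stage data movement. With "data" first, every microbatch is spread across both stages, so the same reshape would require communication.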

@copybara-service copybara-service bot merged commit dd2726c into main Nov 6, 2024
19 checks passed
@copybara-service copybara-service bot deleted the mohit/sharding_change branch November 6, 2024 01:46
3 participants