Skip to content

fix managed pg allreduce #249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

tushar00jain
Copy link
Contributor

@tushar00jain tushar00jain commented Jul 29, 2025

Summary:
managed pg allreduce should just call manager's allreduce


Stack created with Sapling. Best reviewed with ReviewStack.

Copy link
Member

@d4l3k d4l3k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for putting this up! I think we need a hybrid as discussed below

@tushar00jain tushar00jain force-pushed the pr249 branch 5 times, most recently from 1afe947 to 78e1bea Compare July 30, 2025 21:58
@tushar00jain tushar00jain force-pushed the pr249 branch 18 times, most recently from a983b4d to f4a81c7 Compare August 5, 2025 22:50
This was referenced Aug 5, 2025
@tushar00jain tushar00jain requested a review from d4l3k August 6, 2025 06:05
@tushar00jain tushar00jain force-pushed the pr249 branch 3 times, most recently from b9f938d to 4033168 Compare August 6, 2025 20:16
@tushar00jain tushar00jain mentioned this pull request Aug 6, 2025
Summary:
- improve abstraction and api contract offered by  managed work - users can change future return types, ensures users use correct cuda streams
- added a managed future that is lazily attaches callbacks - based on when users call wait on the work  or the future
Summary:
managed pg allreduce should just call manager's allreduce
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants