
[example] add a split placement tutorial #43

Merged 2 commits into main on Dec 11, 2024
Conversation

PeterSH6 (Collaborator)

  • Add a split placement tutorial by monkey-patching the fit function in ray_trainer.
  • Place the actor/rollout/ref models on one set of GPUs while mapping the critic and reward model to the remaining GPUs.
  • Currently, for simplicity, we only parallelize the execution of actor.update_actor and critic.update_critic. The operations in the experience-preparation stage could be parallelized further.
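A minimal sketch of the monkey-patch idea described above. The class and method names here are illustrative stand-ins, not verl's actual API (verl dispatches to Ray worker groups on disjoint GPU sets rather than threads); the point is only to show how replacing `fit` lets the actor and critic updates run concurrently instead of back-to-back:

```python
from concurrent.futures import ThreadPoolExecutor

class Trainer:
    """Hypothetical stand-in for the PPO trainer in ray_trainer."""

    def update_actor(self, batch):
        # In verl this would be dispatched to the actor's GPU group.
        return f"actor updated on {batch}"

    def update_critic(self, batch):
        # In verl this would be dispatched to the critic/RM GPU group.
        return f"critic updated on {batch}"

    def fit(self, batch):
        # Original (sequential) behavior: critic waits for the actor.
        return [self.update_actor(batch), self.update_critic(batch)]

def split_placement_fit(self, batch):
    """Monkey-patched fit: because the actor and critic live on disjoint
    GPU sets under the split placement, their updates can be launched
    concurrently and joined afterwards."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        actor_future = pool.submit(self.update_actor, batch)
        critic_future = pool.submit(self.update_critic, batch)
        return [actor_future.result(), critic_future.result()]

# The monkey-patch: swap in the parallel fit without touching the class body.
Trainer.fit = split_placement_fit
```

After the patch, `Trainer().fit("batch0")` launches both updates at once and returns their results, which is the same interface the sequential `fit` exposed.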

@vermouth1992 (Collaborator)

Actually, the critic/RM group can also host rollout, and that placement is the most efficient in most cases, but it requires changing the code.

@PeterSH6 PeterSH6 merged commit 6e8667b into main Dec 11, 2024
2 checks passed
@PeterSH6 PeterSH6 deleted the gm/placement branch December 11, 2024 14:41