
[example] add a split placement tutorial #43

Merged 2 commits into main on Dec 11, 2024
Conversation

PeterSH6 (Collaborator)

  • Add a split placement tutorial by monkey-patching the fit function in ray_trainer.
  • Place the actor/rollout/ref models on one set of GPUs while mapping the critic and reward model to the remaining GPUs.
  • Currently, for simplicity, we only parallelize the execution of actor.update_actor and critic.update_critic. The operations in the experience-preparation stage could be parallelized further.
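A minimal sketch of the monkey-patch idea described above. The class and method names here are illustrative stand-ins, not verl's actual API (verl dispatches to Ray worker groups on disjoint GPU sets rather than threads); the point is only to show how replacing `fit` lets the actor and critic updates run concurrently instead of back-to-back:

```python
from concurrent.futures import ThreadPoolExecutor

class Trainer:
    """Hypothetical stand-in for the PPO trainer in ray_trainer."""

    def update_actor(self, batch):
        # In verl this would be dispatched to the actor's GPU group.
        return f"actor updated on {batch}"

    def update_critic(self, batch):
        # In verl this would be dispatched to the critic/RM GPU group.
        return f"critic updated on {batch}"

    def fit(self, batch):
        # Original (sequential) behavior: critic waits for the actor.
        return [self.update_actor(batch), self.update_critic(batch)]

def split_placement_fit(self, batch):
    """Monkey-patched fit: because the actor and critic live on disjoint
    GPU sets under the split placement, their updates can be launched
    concurrently and joined afterwards."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        actor_future = pool.submit(self.update_actor, batch)
        critic_future = pool.submit(self.update_critic, batch)
        return [actor_future.result(), critic_future.result()]

# The monkey-patch: swap in the parallel fit without touching the class body.
Trainer.fit = split_placement_fit
```

After the patch, `Trainer().fit("batch0")` launches both updates at once and returns their results, which is the same interface the sequential `fit` exposed.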

@vermouth1992 (Collaborator)

Actually, the critic/RM group can also host rollout, and that placement is the most efficient in most cases, but it requires changing the code.

@PeterSH6 PeterSH6 merged commit 6e8667b into main Dec 11, 2024
2 checks passed
@PeterSH6 PeterSH6 deleted the gm/placement branch December 11, 2024 14:41