-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BREAKING][refact]: move actor/critic/hybrid_engine/reward_model/rollout/workers out of ppo directory for reuse #52
Conversation
@@ -38,7 +38,7 @@ | |||
from verl.utils.megatron.pipeline_parallel import (compute_transformers_input_shapes, make_batch_generator) | |||
from verl import DataProto | |||
from verl.trainer.ppo import core_algos | |||
from verl.trainer.ppo.actor import BasePPOActor | |||
from verl.trainer.workers.actor import BasePPOActor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may also need to rename
- BasePPOActor -> BaseActor
- DataParallelPPOActor -> DataParallelActor
Other classes are similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can simply delete base class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from verl.trainer.workers.critic import DataParallelPPOCritic
from verl.trainer.workers.actor.megatron_actor import MegatronPPOActor
DDP & Megatron class have the same issue, I think we still need to delete PPO
. Shall we just make the PPO
implementation a base class? So that other algorithms can simply override the functions.
… out of verl/trainer/ppo directory to verl/trainer directory for reusage
verl.workers
: store all the files related to SPMD distributed computationverl.trainer
: store different RL algorithms, SFT, and generation scripts