api: rename tracking logger to wandb logger type
eric-haibin-lin committed Dec 14, 2024
1 parent 0deed9a commit ed8ac39
Showing 19 changed files with 37 additions and 27 deletions.
5 changes: 2 additions & 3 deletions docs/examples/config.rst
@@ -307,7 +307,7 @@ Trainer
total_epochs: 30
project_name: verl_examples
experiment_name: gsm8k
- logger: ['console', 'tracking']
+ logger: ['console', 'wandb']
nnodes: 1
n_gpus_per_node: 8
save_freq: -1
@@ -319,8 +319,7 @@ Trainer
- ``trainer.total_epochs``: Number of epochs in training.
- ``trainer.project_name``: For wandb
- ``trainer.experiment_name``: For wandb
- - ``trainer.logger``: Support console and tracking. For tracking, we
-   will initialize a wandb
+ - ``trainer.logger``: Support console and wandb
- ``trainer.nnodes``: Number of nodes used in the training.
- ``trainer.n_gpus_per_node``: Number of GPUs per node.
- ``trainer.save_freq``: The frequency (by iteration) to save checkpoint
4 changes: 2 additions & 2 deletions docs/examples/gsm8k_example.rst
@@ -91,7 +91,7 @@ We also provide various training scripts for SFT on GSM8K dataset in `gsm8k sft
trainer.project_name=gsm8k-sft \
trainer.experiment_name=gsm8k-sft-deepseek-coder-6.7b-instruct \
trainer.total_epochs=4 \
- trainer.logger=['console','tracking']
+ trainer.logger=['console','wandb']
Step 4: Perform PPO training with your model on GSM8K Dataset
-------------------------------------------------------------
@@ -156,7 +156,7 @@ The script of run_deepseek7b_llm.sh
critic.model.fsdp_config.optimizer_offload=False \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_example_gsm8k' \
trainer.experiment_name='deepseek_llm_7b_function_rm' \
trainer.n_gpus_per_node=8 \
4 changes: 2 additions & 2 deletions docs/start/quickstart.rst
@@ -97,7 +97,7 @@ We also provide various training scripts for SFT on GSM8K dataset in `gsm8k sft
trainer.project_name=gsm8k-sft \
trainer.experiment_name=gsm8k-sft-deepseek-coder-6.7b-instruct \
trainer.total_epochs=4 \
- trainer.logger=['console','tracking']
+ trainer.logger=['console','wandb']
Step 4: Perform PPO training with your model on GSM8K Dataset
-------------------------------------------------------------
@@ -163,7 +163,7 @@ The script of `run_deepseek7b_llm.sh`
critic.model.fsdp_config.optimizer_offload=False \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_example_gsm8k' \
trainer.experiment_name='deepseek_llm_7b_function_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion examples/ppo_trainer/run_deepseek7b_llm.sh
@@ -29,7 +29,7 @@ python3 -m verl.trainer.main_ppo \
critic.model.fsdp_config.optimizer_offload=False \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_example_gsm8k' \
trainer.experiment_name='deepseek_llm_7b_function_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion examples/ppo_trainer/run_deepseek_full_hh_rlhf.sh
@@ -31,7 +31,7 @@ python3 -m verl.trainer.main_ppo --config-path=./config --config-name='ppo_megat
reward_model.param_offload=False \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_megatron_full_hh_rlhf_examples' \
trainer.experiment_name='deepseek_llm_7b_model_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion examples/ppo_trainer/run_deepseek_math_gsm8k_megatron.sh
@@ -30,7 +30,7 @@ python3 -m verl.trainer.main_ppo --config-path=./config --config-name='ppo_megat
critic.ppo_micro_batch_size=32 \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_megatron_math_gsm8k_examples' \
trainer.experiment_name='deepseek_llm_7b_function_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion examples/ppo_trainer/run_deepseek_megatron.sh
@@ -22,7 +22,7 @@ python3 -m verl.trainer.main_ppo --config-path=./config --config-name='ppo_megat
critic.ppo_micro_batch_size=64 \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_megatron_gsm8k_examples' \
trainer.experiment_name='deepseek_llm_7b_function_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion examples/ppo_trainer/run_gemma.sh
@@ -29,7 +29,7 @@ python3 -m verl.trainer.main_ppo \
critic.model.fsdp_config.optimizer_offload=False \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_example' \
trainer.experiment_name='gemma2b_function_rm' \
trainer.n_gpus_per_node=2 \
2 changes: 1 addition & 1 deletion examples/ppo_trainer/run_qwen2-7b.sh
@@ -37,7 +37,7 @@ python3 -m verl.trainer.main_ppo \
critic.model.fsdp_config.optimizer_offload=False \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_example' \
trainer.experiment_name='Qwen2-7B-Instruct_function_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion examples/ppo_trainer/run_qwen2-7b_rm.sh
@@ -44,7 +44,7 @@ python3 -m verl.trainer.main_ppo \
reward_model.micro_batch_size=16 \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_example' \
trainer.experiment_name='Qwen2-7B-Instruct_hybrid_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion examples/ppo_trainer/run_qwen2.5-32b.sh
@@ -38,7 +38,7 @@ python3 -m verl.trainer.main_ppo \
critic.model.fsdp_config.optimizer_offload=False \
algorithm.kl_ctrl.kl_coef=0.0001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_example' \
trainer.experiment_name='Qwen2.5-32B-Instruct_function_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion examples/sft/gsm8k/run_deepseek_6b7.sh
@@ -16,4 +16,4 @@ torchrun --standalone --nnodes=1 --nproc_per_node=$nproc_per_node \
trainer.project_name=gsm8k-sft \
trainer.experiment_name=gsm8k-sft-deepseek-coder-6.7b-instruct \
trainer.total_epochs=4 \
- trainer.logger=['console','tracking']
+ trainer.logger=['console','wandb']
11 changes: 9 additions & 2 deletions examples/sft/gsm8k/run_gemma_2b.sh
@@ -2,9 +2,16 @@

set -x

- hdfs_path=hdfs://user/verl/experiments/gsm8k/gemma-2b-it/ # replace to your own hdfs/local path
+ if [ "$#" -lt 2 ]; then
+     echo "Usage: run_gemma_2b.sh <nproc_per_node> <save_path> [other_configs...]"
+     exit 1
+ fi

nproc_per_node=$1
+ hdfs_path=$2

+ # Shift the arguments so $@ refers to the rest
+ shift 2

torchrun --standalone --nnodes=1 --nproc_per_node=$nproc_per_node \
-m verl.trainer.fsdp_sft_trainer \
@@ -18,4 +25,4 @@ torchrun --standalone --nnodes=1 --nproc_per_node=$nproc_per_node \
trainer.project_name=gsm8k-sft \
trainer.experiment_name=gsm8k-sft-gemma-2b-it \
trainer.total_epochs=3 \
- trainer.logger=['console','tracking']
+ trainer.logger=['console','wandb'] $@
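
With this change the SFT script takes its process count and save path as positional arguments, as shown in its usage message above: for example, sh run_gemma_2b.sh 8 <save_path> trainer.total_epochs=2 (the trailing override is illustrative). The first two arguments are consumed by the script and anything after them is forwarded to the trainer via $@.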
2 changes: 1 addition & 1 deletion examples/sft/gsm8k/run_gemma_7b.sh
@@ -16,4 +16,4 @@ torchrun --standalone --nnodes=1 --nproc_per_node=$nproc_per_node \
trainer.project_name=gsm8k-sft \
trainer.experiment_name=gsm8k-sft-gemma-1.1-7b-it \
trainer.total_epochs=4 \
- trainer.logger=['console','tracking']
+ trainer.logger=['console','wandb']
2 changes: 1 addition & 1 deletion examples/split_placement/config/ppo_trainer_split.yaml
@@ -121,7 +121,7 @@ trainer:
total_epochs: 30
project_name: verl_examples
experiment_name: gsm8k
- logger: ['console', 'tracking']
+ logger: ['console', 'wandb']
nnodes: 1
n_gpus_per_node: 8
save_freq: -1
2 changes: 1 addition & 1 deletion examples/split_placement/run_deepseek7b_llm.sh
@@ -29,7 +29,7 @@ python3 main_ppo_split.py \
critic.model.fsdp_config.optimizer_offload=False \
algorithm.kl_ctrl.kl_coef=0.001 \
trainer.critic_warmup=0 \
- trainer.logger=['console','tracking'] \
+ trainer.logger=['console','wandb'] \
trainer.project_name='verl_example_gsm8k' \
trainer.experiment_name='deepseek_llm_7b_function_rm' \
trainer.n_gpus_per_node=8 \
2 changes: 1 addition & 1 deletion verl/trainer/config/ppo_megatron_trainer.yaml
@@ -135,7 +135,7 @@ trainer:
total_epochs: 30
project_name: verl_examples
experiment_name: gsm8k
- logger: ['console', 'tracking']
+ logger: ['console', 'wandb']
nnodes: 1
n_gpus_per_node: 8
save_freq: -1
2 changes: 1 addition & 1 deletion verl/trainer/config/ppo_trainer.yaml
@@ -121,7 +121,7 @@ trainer:
total_epochs: 30
project_name: verl_examples
experiment_name: gsm8k
- logger: ['console', 'tracking']
+ logger: ['console', 'wandb']
nnodes: 1
n_gpus_per_node: 8
save_freq: -1
12 changes: 8 additions & 4 deletions verl/utils/tracking.py
@@ -19,20 +19,24 @@


class Tracking(object):
- supported_backend = ['tracking', 'console']
+ supported_backend = ['wandb', 'console']

def __init__(self, project_name, experiment_name, default_backend: Union[str, List[str]] = 'console', config=None):
if isinstance(default_backend, str):
default_backend = [default_backend]
for backend in default_backend:
- assert backend in self.supported_backend, f'{backend} is not supported'
+ if backend == 'tracking':
+     import warnings
+     warnings.warn("`tracking` logger is deprecated. use `wandb` instead.", DeprecationWarning)
+ else:
+     assert backend in self.supported_backend, f'{backend} is not supported'

self.logger = {}

- if 'tracking' in default_backend:
+ if 'tracking' in default_backend or 'wandb' in default_backend:
import wandb
wandb.init(project=project_name, name=experiment_name, config=config)
- self.logger['tracking'] = wandb
+ self.logger['wandb'] = wandb

if 'console' in default_backend:
from verl.utils.logger.aggregate_logger import LocalLogger
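
For reference, a minimal usage sketch of the updated Tracking class, based only on the constructor and attributes visible in this diff; the project name, experiment name, and config dict are illustrative values, not taken from the repository.

# Minimal sketch; argument values below are illustrative.
from verl.utils.tracking import Tracking

tracking = Tracking(
    project_name='verl_examples',
    experiment_name='gsm8k',
    default_backend=['console', 'wandb'],  # 'tracking' is still accepted but now emits a DeprecationWarning
    config={'total_epochs': 30},
)

# The wandb module handle is now stored under the 'wandb' key instead of 'tracking'.
wandb_backend = tracking.logger['wandb']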
