add reward training script (#217)
* add reward trainer

* push changes

* format

* bug fix

* only do visualize in main process

* quick fix

* black

* update docs even more

* formatting

* fix eval logging

* add ds config

* maybe use ai2 wandb entity

* quick change

* remove unused print

* make the dataset column configurable; remove eos from prompt token;

* add a generate collator
vwxyzjn authored Aug 9, 2024
1 parent 8c67635 commit b382041
Showing 16 changed files with 1,852 additions and 2 deletions.
21 changes: 21 additions & 0 deletions configs/ds_configs/deepspeed_zero2.yaml
@@ -0,0 +1,21 @@
compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  deepspeed_multinode_launcher: standard
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: 'bf16'
num_machines: 1
num_processes: 8
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
22 changes: 22 additions & 0 deletions configs/ds_configs/deepspeed_zero3.yaml
@@ -0,0 +1,22 @@
compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  deepspeed_multinode_launcher: standard
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: true
  zero3_save_16bit_model: true
  zero_stage: 3
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 8
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
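
The two configs above are nearly identical, so it may help to see exactly where they differ. The following is a minimal sketch, not part of the commit: the dictionaries are hand-copied from the `deepspeed_config` keys shown in the two files, and `config_diff` is a hypothetical helper introduced only for illustration.

```python
# Hand-copied from the `deepspeed_config` section of each file in this commit.
# These dicts are an illustration; the real files are YAML read by the launcher.
ZERO2 = {
    "deepspeed_multinode_launcher": "standard",
    "offload_optimizer_device": "none",
    "offload_param_device": "none",
    "zero3_init_flag": False,
    "zero_stage": 2,
}

ZERO3 = {
    "deepspeed_multinode_launcher": "standard",
    "offload_optimizer_device": "none",
    "offload_param_device": "none",
    "zero3_init_flag": True,
    "zero3_save_16bit_model": True,
    "zero_stage": 3,
}


def config_diff(a, b):
    """Hypothetical helper: map each differing key to (value_in_a, value_in_b).

    A key that is absent from one side is reported with None on that side.
    """
    diff = {}
    for key in sorted(set(a) | set(b)):
        left, right = a.get(key), b.get(key)
        if left != right:
            diff[key] = (left, right)
    return diff


diff = config_diff(ZERO2, ZERO3)
for key, (zero2_value, zero3_value) in diff.items():
    print(f"{key}: zero2={zero2_value!r} zero3={zero3_value!r}")
```

Under these assumptions, only three keys differ: `zero_stage`, `zero3_init_flag`, and `zero3_save_16bit_model` (present only in the ZeRO-3 file). All top-level settings, such as `num_processes: 8` and bf16 mixed precision, are the same in both files.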