I was wondering how Accelerator calculates the batch size when `batch_size`, `mini_batch_size`, and `gradient_accumulation_steps` are all set in `PPOConfig`. But when I looked at the DeepSpeed `print_user_config` JSON, it said the total batch size was equal to `batch_size * num_processes * gradient_accumulation_steps`, instead of `mini_batch_size * num_processes * gradient_accumulation_steps`.
This is a bit different from what I expected. Also, when I set `gradient_accumulation_steps` in the Accelerate `config.yaml` at the same time, the accumulation followed the value from `config.yaml`. So what does the `gradient_accumulation_steps` in `PPOConfig` actually do during the training step? It's so weird.
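To make the mismatch concrete, here is a minimal sketch with illustrative numbers (assuming the `PPOConfig` signature from the trl version I'm on; `num_processes` here just stands for the number of GPUs launched with `accelerate launch`):

```python
from trl import PPOConfig

# Illustrative values only; in trl these three fields are usually expected
# to satisfy mini_batch_size * gradient_accumulation_steps == batch_size.
config = PPOConfig(
    batch_size=64,                  # samples collected per PPO step (per process)
    mini_batch_size=16,             # micro-batch actually fed through the model
    gradient_accumulation_steps=4,  # optimizer step every 4 micro-batches
)

num_processes = 2  # e.g. two GPUs under `accelerate launch`

# What I expected DeepSpeed's print_user_config to report as train_batch_size:
expected = config.mini_batch_size * num_processes * config.gradient_accumulation_steps
# What the JSON actually reports:
reported = config.batch_size * num_processes * config.gradient_accumulation_steps

print(expected, reported)  # 128 vs 512
```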