Gradient Accumulation #345
-
Hi @rwightman. Thank you for the great work. While skimming the training script I noticed that gradient accumulation is not used. Is there a reason it isn't supported? Thanks in advance! Cheers,
-
@ademyanchuk I don't use it because these models all use BatchNorm by default. Gradient accumulation isn't a clear win with BatchNorm, as it does not improve the effective batch size for the BN running-stats calculation, and that can cause instability or poor results with small batches. One could use GroupNorm instead, and several models do support switching the norm layer quite easily, but that's not something I've experimented with.
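For readers who want to see the mechanics and the caveat from the reply in code, here is a minimal gradient-accumulation sketch in plain PyTorch. It is not from the timm train script; the toy model, synthetic data, and `accum_steps` value are illustrative assumptions.

```python
# Minimal sketch: gradient accumulation in PyTorch (illustrative, not timm's train script).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.BatchNorm1d(16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

accum_steps = 4  # take an optimizer step every 4 micro-batches

optimizer.zero_grad()
for step in range(100):
    x = torch.randn(8, 8)               # micro-batch of 8 synthetic samples
    y = torch.randint(0, 2, (8,))
    loss = criterion(model(x), y) / accum_steps  # scale so grads average over the window
    loss.backward()                     # gradients accumulate in each param's .grad
    if (step + 1) % accum_steps == 0:
        optimizer.step()                # update with the accumulated gradient
        optimizer.zero_grad()

# The caveat from the reply: each forward pass still normalizes with the
# micro-batch statistics, so BatchNorm's effective batch size stays 8 here,
# not 8 * accum_steps = 32. Only the gradient averaging sees the larger batch.
```

For the norm-layer swap mentioned above, many timm architectures accept a `norm_layer` constructor argument, but the callable signature they expect varies by model, so check the specific model's constructor before substituting GroupNorm.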