Describe the bug
In the latest release, gradient-accumulator==0.5.2, a method was added to bring gradient-accumulation support to existing BN layers.
However, when attempting to use it in production, models seem to struggle to converge. We should benchmark this layer to verify that it is actually working as expected, and perhaps add unit tests that catch whether the approximation is too poor for production use, before merging the PR into the main branch.
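One candidate unit test: accumulating BN statistics over sub-batches is only an approximation of full-batch statistics. Averaging sub-batch means recovers the full-batch mean exactly, but averaging sub-batch variances systematically underestimates the full-batch variance (by the variance of the sub-batch means, via the law of total variance). A minimal NumPy sketch of this check, with hypothetical batch sizes (`accum_steps=4`, `sub_batch=8`):

```python
import numpy as np

rng = np.random.default_rng(0)
accum_steps, sub_batch = 4, 8  # hypothetical test configuration
x = rng.normal(loc=2.0, scale=3.0, size=(accum_steps * sub_batch,))

# Full-batch statistics (what regular BN sees with batch size 32).
full_mean = x.mean()
full_var = x.var()

# Per-sub-batch statistics (what an accumulating BN sees per step).
subs = x.reshape(accum_steps, sub_batch)
sub_means = subs.mean(axis=1)
sub_vars = subs.var(axis=1)

# Averaging sub-batch means recovers the full-batch mean exactly...
assert np.isclose(sub_means.mean(), full_mean)

# ...but averaging sub-batch variances underestimates the full-batch
# variance by exactly the variance of the sub-batch means
# (law of total variance for equal-sized groups).
approx_var = sub_vars.mean()
assert approx_var <= full_var
assert np.isclose(full_var, approx_var + sub_means.var())
```

A tolerance-based test along these lines could flag when the variance gap becomes large enough to plausibly hurt convergence.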
Expected behavior
Swapping the BN layer with AccumBN should be seamless, transfer the old weights to the new layer, and yield better convergence than regular BN for accum_steps > 1 (in general).
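For reference, the weight-transfer part of the swap can be sketched with plain Keras `get_weights()`/`set_weights()`. The `AccumBatchNormalization` stand-in below is a placeholder subclass (the real class and its import path live in gradient-accumulator and may differ):

```python
import numpy as np
import tensorflow as tf

# Placeholder for the real accumulating BN layer; the actual
# implementation in gradient-accumulator overrides the update logic.
class AccumBatchNormalization(tf.keras.layers.BatchNormalization):
    pass

inp = tf.keras.Input(shape=(4,))
bn = tf.keras.layers.BatchNormalization()
model = tf.keras.Model(inp, bn(inp))
model(np.zeros((2, 4), dtype="float32"))  # build the BN weights

# Swap: build the new layer with the same input shape, then copy
# gamma, beta, moving_mean, and moving_variance verbatim.
accum_bn = AccumBatchNormalization()
accum_bn.build(inp.shape)
accum_bn.set_weights(bn.get_weights())
```

Verifying after the swap that all four weight arrays match bit-for-bit would make a cheap regression test for the "transfer old weights" part of the expected behavior.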
Desktop (please complete the following information):
OS: Ubuntu
Version: 20.04
Python: 3.8.10
TensorFlow: 2.11.0