
Batch normalization #46

Open
jmm34 opened this issue Mar 21, 2024 · 2 comments

jmm34 commented Mar 21, 2024

I've found the Zuko library to be extremely beneficial for my work, and I sincerely appreciate the effort that has gone into its development. In the Masked Autoregressive Flow paper (Papamakarios et al., NeurIPS 2017), the authors incorporate batch normalization after each autoregressive layer. Could this modification be integrated into MaskedAutoregressiveTransform?

jmm34 added the enhancement label Mar 21, 2024
francois-rozet (Collaborator) commented Mar 21, 2024

Hello @jmm34, thanks for the kind words.

I am not a fan of batch normalization, as it often leads to train/test gaps that are hard to diagnose, but I see why one would want to use it (mainly faster training).

IMO the best way to add batch normalization to Zuko would be to implement a standalone (lazy) BatchNormTransform. The user can then insert batch norm transformations anywhere in the flow.

We would accept a PR that implements this.

Edit: I think that normalizing with the current batch statistics is invalid, as it would not be an invertible transformation $y = f(x)$ (it is impossible to recover $x$ given only $y$). So we should use running statistics both during training and evaluation, and update these statistics during training.
Also, I am not sure that the learnable scale and shift parameters are relevant (zero mean and unit variance is the target).
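
As a rough sketch of what I have in mind (the class name, the `momentum`/`eps` defaults, and the choice of returning a `torch.distributions.AffineTransform` are placeholders; the wiring into Zuko's lazy transforms would still need to be worked out):

```python
import torch
import torch.nn as nn
from torch.distributions import AffineTransform


class BatchNormTransform(nn.Module):
    """Sketch of a lazy batch-norm transformation based on running statistics.

    Calling the module returns the affine map y = (x - mean) / sqrt(var + eps),
    where mean and var are running statistics. Because the same statistics are
    used during training and evaluation, the transformation stays invertible.
    """

    def __init__(self, features: int, momentum: float = 0.1, eps: float = 1e-5):
        super().__init__()

        self.momentum = momentum
        self.eps = eps

        self.register_buffer('running_mean', torch.zeros(features))
        self.register_buffer('running_var', torch.ones(features))

    @torch.no_grad()
    def update(self, x: torch.Tensor) -> None:
        # Update the running statistics from a batch (training only).
        self.running_mean.lerp_(x.mean(dim=0), self.momentum)
        self.running_var.lerp_(x.var(dim=0, unbiased=False), self.momentum)

    def forward(self) -> AffineTransform:
        # Normalize with the running statistics, never the current batch,
        # so that y = f(x) remains invertible.
        std = torch.sqrt(self.running_var + self.eps)

        return AffineTransform(loc=-self.running_mean / std, scale=1 / std)
```

There are no learnable scale and shift parameters, in line with the remark above, and the returned AffineTransform provides the inverse and the log-determinant of the Jacobian for free. When and where `update` is called during training would depend on how this is plugged into the flow.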

jmm34 (Author) commented Mar 21, 2024

Dear @francois-rozet, thank you very much for your quick reply. I will try to make a PR using the strategy you suggest.
