Remove type restrictions on normalization layers' scale and bias #2099

Open
wants to merge 2 commits into master

Conversation

@darsnack (Member) commented Nov 3, 2022

Right now, the normalization scale and bias are required to be the same type. This is an unnecessary restriction that isn't there for the weights and biases of other layers.
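To illustrate the difference (the struct and field names below are hypothetical stand-ins, not Flux's actual layer definitions): a single type parameter forces both fields onto one array type, while independent parameters let them differ.

```julia
# Hypothetical minimal illustration, not Flux's actual normalization structs.
struct AffineStrict{V}
    β::V  # bias
    γ::V  # scale -- forced to the same array type as β
end

struct AffineLoose{U,V}
    β::V  # bias
    γ::U  # scale -- may be a different array type, as in this PR
end

AffineLoose(zeros(Float64, 3), ones(Float32, 3))    # constructs fine
# AffineStrict(zeros(Float64, 3), ones(Float32, 3)) # MethodError
```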

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable

Comment on lines 327 to 328:

      β::V # bias
    - γ::V # scale
    + γ::U # scale

Member

I don't know where the original type param names came from, but would it make sense to harmonize them with the actual parameter names? e.g. B,S instead of U,V.
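For concreteness, the suggested renaming might look like this; only the two affine fields are shown, and the struct name is a placeholder:

```julia
struct NormAffine{B,S}   # placeholder; the real layers carry many more fields
    β::B  # bias
    γ::S  # scale
end
```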

@mcabbott (Member) commented Nov 6, 2022

Should we consider wrapping these two in a Scale sub-layer, replaced with identity when absent? Perhaps that would also allow removing the affine field.
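Roughly like the sketch below, perhaps; this is an illustration under the assumption that Flux.Scale can stand in for the affine transform, with a placeholder struct and most fields omitted, not a concrete proposal:

```julia
using Flux

# Placeholder sketch: keep the affine transform as a sub-layer, with
# `identity` standing in when affine=false, so the separate γ/β fields
# and the `affine` flag could go away.
struct SubLayerNorm{A}
    affine::A   # Flux.Scale(chs) or identity
    chs::Int
    # running statistics, momentum, ε, etc. omitted
end

SubLayerNorm(chs::Integer; affine::Bool = true) =
    SubLayerNorm(affine ? Flux.Scale(chs) : identity, chs)
```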

@ToucheSir (Member) commented Nov 6, 2022

It's complicated because backends like cuDNN fuse in the affine transform as well. One idea would be to extract the scale and bias params from the sub-layer in the BN/IN/GN forward pass instead of calling the sub-layer itself, but that feels a tad hacky.
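For what that workaround could look like, here is a rough sketch with hypothetical helper names and a naive CPU fallback standing in for the fused cuDNN path:

```julia
using Flux
using Statistics: mean, var

# Hypothetical helpers: pull γ/β out of the affine sub-layer so a fused
# backend could receive them directly, instead of calling the sub-layer
# on the normalized output.
_affine_params(s::Flux.Scale) = (s.scale, s.bias)
_affine_params(::typeof(identity)) = (nothing, nothing)

# Naive fallback for a (features × batch) matrix; a GPU path would hand
# γ and β to the fused kernel here instead of broadcasting them.
function _norm_forward(affine, x::AbstractMatrix; ε = 1f-5)
    γ, β = _affine_params(affine)
    μ  = mean(x; dims = 2)
    σ² = var(x; dims = 2, corrected = false)
    x̂  = (x .- μ) ./ sqrt.(σ² .+ ε)
    γ === nothing ? x̂ : γ .* x̂ .+ β
end
```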

@ToucheSir (Member) left a comment

LGTM!

3 participants