
What is the role of scaling_per_output_channel in QuantReLU? #791

Closed
phixerino opened this issue Jan 5, 2024 · 1 comment

Comments


phixerino commented Jan 5, 2024

I'm looking at the MobileNetV1 example, and I see that scaling_per_output_channel is True in the QuantReLU after the first layer (init_block), and then after each pointwise convolutional layer except in the last stage.
On the other hand, in ProxylessNAS Mobile14, scaling_per_output_channel is False after the first layer and True after each first 1x1 convolutional layer in ProxylessBlock.
So what is the purpose of scaling_per_output_channel? Thank you.

@Giuseppe5
Collaborator

Similar to what happens for weight scaling, you can have one scale factor for the entire tensor to quantize, or one per channel of that tensor. Other slicings of the tensor for computing scale factors are also possible, although arguably less common (e.g., per-row, per-group, etc.).
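
To make the difference concrete, here is a minimal PyTorch sketch (illustrative only, not Brevitas internals; all names are hypothetical) contrasting a single per-tensor scale with per-channel scales for an activation tensor of shape (N, C, H, W):

```python
import torch

# Hypothetical activation tensor of shape (N, C, H, W).
x = torch.randn(8, 64, 28, 28).relu()

# Per-tensor: a single scale covering the whole tensor (8-bit unsigned range).
per_tensor_scale = x.abs().amax() / 255

# Per-channel: one scale per channel, reduced over batch and spatial dims.
# Shape (1, 64, 1, 1) so it broadcasts along the channel dimension.
per_channel_scale = x.abs().amax(dim=(0, 2, 3), keepdim=True) / 255

def fake_quant(x, scale):
    # Round-to-nearest fake quantization; the same code handles either
    # granularity thanks to broadcasting.
    return torch.clamp(torch.round(x / scale), 0, 255) * scale

x_per_tensor = fake_quant(x, per_tensor_scale)
x_per_channel = fake_quant(x, per_channel_scale)
```

The per-channel variant can track channels with very different dynamic ranges, which a single per-tensor scale cannot do.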

The choice of per-tensor vs. per-channel scaling depends on the network topology, the hardware constraints of the device where you plan to execute the network, and other factors.

As a rule of thumb, the finer the granularity of your scale factors, the better the final accuracy of the quantized network. At the same time, the computational cost and memory usage of the network increase, since more scale factors have to be stored in high precision.
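
For reference, a sketch of how the flag might be toggled on QuantReLU, loosely following the MobileNetV1 example the question refers to. The exact set of required keywords (e.g. per_channel_broadcastable_shape) can differ between Brevitas versions, and the 64-channel shape below is an assumption:

```python
from brevitas.nn import QuantReLU

# Per-tensor scaling: one scale for the whole activation tensor.
relu_per_tensor = QuantReLU(
    bit_width=8,
    scaling_per_output_channel=False)

# Per-channel scaling: one scale per channel. In the examples, the
# broadcastable shape of the scale is given explicitly; 64 channels
# is assumed here.
relu_per_channel = QuantReLU(
    bit_width=8,
    scaling_per_output_channel=True,
    per_channel_broadcastable_shape=(1, 64, 1, 1),
    return_quant_tensor=True)
```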
