I'm looking at the MobileNetV1 example and I see that `scaling_per_output_channel` is `True` in `QuantReLU` after the first layer (`init_block`) and then after each pointwise convolutional layer except for the last stage.
On the other hand, in ProxylessNAS Mobile14, `scaling_per_output_channel` is `False` after the first layer and then `True` after the first 1x1 convolutional layer in each `ProxylessBlock`.
So what's the purpose of `scaling_per_output_channel`? Thank you.
Similar to what happens for weight scaling, you can have a single scale factor for the entire tensor being quantized, or one per channel of that tensor. Other slicings of the tensor for computing scale factors are also possible, although arguably less common (e.g., per-row, per-group, etc.).
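To illustrate (a minimal NumPy sketch of max-based scale computation, not Brevitas's actual internals), the two granularities simply reduce over different axes of the tensor when computing the scale:

```python
import numpy as np

# Hypothetical activation tensor: (channels, height, width)
x = np.array([[[0.1, 0.2], [0.3, 0.4]],
              [[1.0, 2.0], [3.0, 4.0]]])

bits = 8
qmax = 2 ** bits - 1  # unsigned range, e.g. after a ReLU

# Per-tensor: one scale for the whole tensor
scale_per_tensor = x.max() / qmax

# Per-channel: one scale per output channel (reduce over H, W only)
scale_per_channel = x.max(axis=(1, 2)) / qmax

print(scale_per_tensor)        # a single scalar
print(scale_per_channel.shape) # one scale per channel
```

With a per-tensor scale, the channel with the small range (max 0.4) is forced to share the scale dictated by the large-range channel; per-channel scaling gives each channel its own step size.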
The choice of per-tensor vs. per-channel scaling depends on the network topology, the hardware constraints of the device where you plan to execute your network, and other factors.
As a rule of thumb, the finer the granularity of your scale factors, the better the final accuracy of the quantized network. At the same time, the computational cost and memory usage of your network increase, since scale factors are stored in high precision.
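A quick sketch of that accuracy trade-off (assuming simple max-based unsigned quantization for illustration, not Brevitas's actual scaling implementation): when one channel has a much larger range than the others, a single per-tensor scale wastes precision on the small-range channels, and the per-channel variant reconstructs the tensor more accurately:

```python
import numpy as np

rng = np.random.default_rng(0)
# Channels with very different ranges favour per-channel scaling
x = np.abs(rng.normal(size=(4, 64)))
x[0] *= 100.0  # one outlier channel dominates the per-tensor scale

qmax = 255  # 8-bit unsigned range

def fake_quantize(x, scale):
    # Quantize to integers, then dequantize to measure the error
    q = np.clip(np.round(x / scale), 0, qmax)
    return q * scale

err_per_tensor = np.abs(fake_quantize(x, x.max() / qmax) - x).mean()
err_per_channel = np.abs(
    fake_quantize(x, x.max(axis=1, keepdims=True) / qmax) - x
).mean()

# Finer granularity -> lower reconstruction error,
# at the cost of storing one high-precision scale per channel.
assert err_per_channel < err_per_tensor
```

The outlier channel is quantized identically in both cases (its per-channel scale equals the per-tensor one), but the remaining channels get much smaller step sizes under per-channel scaling, which is where the accuracy gain comes from.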