Control on the weight quantization #1123
I am asking this question because I am working with a custom implementation of a QuantConv2d layer. During training, the weights of the layer have to be processed with a series of operations that can be done in full precision; the quantized version of the processed weights is then used to apply the convolution.

So far, I have applied the preprocessing to the .value of the quant_weight, breaking the quantization, and then re-applied the quantization by passing the weights through a QuantIdentity. However, this approach is sub-optimal because it increases the quantization error, and it is tricky to emulate the weight quantization scheme with an activation quantizer. Is there a way to control when the layer applies the weight quantization?

PS: During inference there are no problems, because the layer will behave like a traditional QuantConv2d.
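For illustration, a minimal sketch of this workaround; preprocess and the layer name are hypothetical placeholders, not code from the issue:

```python
import torch
from brevitas.nn import QuantConv2d, QuantIdentity


def preprocess(w: torch.Tensor) -> torch.Tensor:
    # Placeholder for the series of full-precision operations
    return w


class WorkaroundQuantConv2d(QuantConv2d):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Extra activation-style quantizer used to re-quantize the processed weights
        self.weight_requant = QuantIdentity(return_quant_tensor=True)

    def forward(self, x):
        w = self.quant_weight().value     # dequantized weight: quantization is "broken" here
        w = preprocess(w)                 # full-precision processing
        w = self.weight_requant(w).value  # second quantization pass through QuantIdentity
        return self._conv_forward(x, w, self.bias)
```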
I would recommend using parametrize. You register your parametrizations, and they are applied automatically before quantization.
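A minimal sketch of that suggestion, assuming the preprocessing can be expressed as an nn.Module (Preprocess is a hypothetical stand-in for the operations described in the question):

```python
import torch
import torch.nn as nn
from torch.nn.utils import parametrize
from brevitas.nn import QuantConv2d


class Preprocess(nn.Module):
    # Hypothetical stand-in for the full-precision weight processing
    def forward(self, weight):
        return weight * torch.sigmoid(weight)


conv = QuantConv2d(3, 16, kernel_size=3, weight_bit_width=4)
parametrize.register_parametrization(conv, "weight", Preprocess())

x = torch.randn(1, 3, 8, 8)
y = conv(x)  # accessing conv.weight now applies Preprocess first,
             # so the weight quantizer sees the processed tensor
```

Because Brevitas quantizes whatever self.weight returns, the parametrization slots in before quantization without touching the layer's forward.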
Sorry, one further question: what is the quantization pipeline during training? Are the weights still updated, or should we rely only on the quant_weight?
I'm not sure I follow, but gradients should be correctly propagated. If you need an example of how to use quant_weights with custom layers, this is the general implementation of a forward pass for a quantized int layer:
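In sketch form (simplified from Brevitas's QuantWBIOL-style layers; exact method names may differ between versions):

```python
# Simplified sketch of a quantized layer's forward pass; not the verbatim
# Brevitas source, but the general flow it implements.
def forward(self, inp):
    quant_input = self.input_quant(inp)   # quantize incoming activations (if enabled)
    quant_weight = self.quant_weight()    # (re)quantize self.weight on every call
    quant_bias = None
    if self.bias is not None:
        quant_bias = self.bias_quant(self.bias, quant_input, quant_weight)
    # The op runs on fake-quantized (dequantized) tensors, so autograd still
    # updates the underlying full-precision self.weight through the STE.
    output = self.inner_forward_impl(quant_input, quant_weight, quant_bias)
    return self.output_quant(output)
```

The key point for the training question above: self.weight remains a full-precision parameter that the optimizer updates, and the quantized weight is recomputed from it at every forward pass.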
If your layer is a custom QuantConv2d and you end up calling …

My layer extends QuantConv2d; before calling its …

Could you post a code snippet to get an idea of what you're trying to achieve?
Here is an example:
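The original snippet did not survive extraction; judging from the surrounding replies, it was presumably along these lines (all names here are hypothetical):

```python
import torch.nn as nn
from torch.nn.utils import parametrize
from brevitas.nn import QuantConv2d


class Preprocess(nn.Module):
    def forward(self, weight):
        return weight  # placeholder for the full-precision operations


class MyQuantConv2d(QuantConv2d):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Register the preprocessing so it runs on self.weight before quantization
        parametrize.register_parametrization(self, "weight", Preprocess())
```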
I think the problem is in … Hope it helps!
Might I ask you to pull the latest version of dev?

With the latest version it is working. Thank you so much!
Sorry, one more question: by using parametrize, what are the values stored in …? I am asking this because I want to be sure that, once I export the model, I will get the quantized version of the modified weights.
The dequantized version of self.weight, which is what you were asking for at the beginning of the question, right? The idea is that in Brevitas we rely on self.weight for quantization, so that's why I suggested it should work out of the box. Having said that, I have never used parametrize myself so far, so if you still have doubts, I'd recommend poking around a bit with a debugger to make sure everything ends up in the correct place.
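For instance, a quick check in that spirit, assuming the parametrized conv from the sketch earlier in the thread:

```python
import torch

# parametrize keeps the raw tensor under parametrizations.weight.original,
# while plain attribute access applies the parametrization.
raw = conv.parametrizations.weight.original
print(torch.equal(conv.weight, raw))  # expected False once Preprocess modifies the weight

qw = conv.quant_weight()              # quantization runs on the processed weight
print(qw.scale, qw.zero_point, qw.bit_width)
```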