How to handle short-cuts with different scales #1101
The error is not a bug but a feature. The idea is that if you have two quantized values, you can't really add them unless they have the same scale factor. Different scales imply different ranges, so adding their integer representations would produce an integer with an unclear range. There are two main solutions for this:
The first is to add the dequantized floating-point values:

```python
first_qt = IntQuantTensor(...)
second_qt = IntQuantTensor(...)
assert first_qt.scale != second_qt.scale
# .value is the dequantized floating-point tensor, so this add is well defined
output = first_qt.value + second_qt.value
```
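To make the scale issue concrete, here is a minimal NumPy sketch (not Brevitas code; `quantize` and `dequantize` are hypothetical helpers standing in for what a quantizer does internally). It shows that adding the dequantized float values is always well defined, even when the integer representations live on different grids:

```python
import numpy as np

def quantize(x, scale):
    """Map floats to their integer representation at the given scale."""
    return np.round(x / scale).astype(np.int32)

def dequantize(q, scale):
    """Map integers back to floats at the given scale."""
    return q * scale

x, y = np.array([0.5, -0.25]), np.array([1.0, 0.75])
qx, qy = quantize(x, 0.05), quantize(y, 0.25)  # two different scales

# Adding the dequantized (float) values is well defined regardless of scale...
float_sum = dequantize(qx, 0.05) + dequantize(qy, 0.25)

# ...while adding the raw integers qx + qy is not: no single scale factor
# makes that integer sum meaningful, so its range is unclear.
```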
The second is to requantize both inputs with a shared quantizer so that they end up with the same scale:

```python
first_qt = IntQuantTensor(...)
second_qt = IntQuantTensor(...)
shared_requant = QuantIdentity(...)
assert first_qt.scale != second_qt.scale
# Passing both tensors through the same quantizer gives them a common scale
first_qt = shared_requant(first_qt)
second_qt = shared_requant(second_qt)
assert first_qt.scale == second_qt.scale
output_qt = first_qt + second_qt
```

This second solution could be optimized to reduce the amount of unnecessary requantization, but it is a good starting point.
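The second option can be sketched numerically as well. This is not the Brevitas implementation; it is a hedged NumPy illustration (with hypothetical `quantize`/`dequantize` helpers) of what requantizing to a shared scale achieves: after requantization, the integer addition becomes meaningful again.

```python
import numpy as np

def quantize(x, scale, n_bits=8):
    """Round x/scale to the nearest integer, clipped to the signed n-bit range."""
    q = np.clip(np.round(x / scale), -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
    return q.astype(np.int32)

def dequantize(q, scale):
    """Map integers back to floats at the given scale."""
    return q * scale

x, y = np.array([0.5, -0.25]), np.array([1.0, 0.75])
qx, qy = quantize(x, 0.05), quantize(y, 0.25)   # mismatched scales

# Naive integer add, interpreted at either scale, gives the wrong answer:
naive = dequantize(qx + qy, 0.05)

# Requantize both operands to a shared scale first, then add the integers:
shared = 0.25
qx_s = quantize(dequantize(qx, 0.05), shared)
qy_s = quantize(dequantize(qy, 0.25), shared)
result = dequantize(qx_s + qy_s, shared)
```

Here `result` recovers the true float sum (up to quantization error), while `naive` does not; in hardware this corresponds to inserting a requantization node before the short-cut add.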
I see, thank you for your answer, and sorry for the wrong tag. However, if we consider a HW implementation of the network (on an FPGA, for example), we would like to avoid full-precision operations, so I think whoever is interested in a HW implementation should go for the second option. What do you think?
Yes, the second solution is generally preferred for that particular use case, and it is what we use when quantizing networks for FINN.
Thanks a lot!
Hi @balditommaso, sorry to butt in, but do you mind sharing how you are streamlining short-cuts during FINN compilation? The ResNet example is a little vague about the transformation steps. Thanks for your help!
I might suggest opening an issue directly on the FINN repo; @auphelia will be more than happy to help :) If it is Brevitas-related, feel free to share more details.
I am in the situation where the input (quantized) is added back to the output of a block (still quantized), but the scales are different and Brevitas raises an error.
How should I handle this situation?