Deprecate QOp Export #834
Comments
What is the reason for deprecation?
Generally, QCDQ is much easier to use because of its flexibility, whereas the ONNX and Torch QOp exports place several constraints on how the layer input, weights, and output must be quantized in order to work correctly. QCDQ is also much easier for us to support and to work around than QOp.
Thanks @Giuseppe5
Hi @Giuseppe5, I have tried both the QCDQ and QOp ONNX exports. QCDQ does indeed provide great flexibility when exporting models to ONNX, whereas the QOp export requires one to satisfy a lot of constraints. However, when performing full-integer inference by generating C code with frameworks such as TVM, QCDQ adds several Quantize and Dequantize nodes to the ONNX graph, so all of the computation essentially happens in floating point. For full-integer inference, QOp worked quite well, as the integer tensors are passed on to the next layer if you set […]. Since QOp export will be deprecated, is there any way to perform full-integer inference with the QCDQ export?
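To make the distinction in this comment concrete, here is a minimal sketch in plain Python. It does not call Brevitas, ONNX, or TVM; the helper names (`quantize`, `dequantize`, `dot_qcdq`, `dot_qop`), the scales, and the sample values are all hypothetical, and symmetric int8 quantization with a zero-point of 0 is assumed. It shows that, for a dot product, the QCDQ-style path (dequantize, then compute in float) and the QOp-style path (accumulate in integers, then rescale once) give the same result, which is why a backend could in principle rewrite a Dequantize → op → Quantize pattern into an integer kernel. Whether a particular toolchain actually performs that rewrite is not established in this thread.

```python
# Toy illustration only: plain Python, no Brevitas/ONNX/TVM calls.
# Assumes symmetric int8 quantization (zero-point = 0).

def quantize(x, scale, qmin=-128, qmax=127):
    """Float -> int8 value: divide by the scale, round, clamp."""
    q = round(x / scale)
    return max(qmin, min(qmax, q))

def dequantize(q, scale):
    """Int8 value -> float."""
    return q * scale

def dot_qcdq(x_q, w_q, s_x, s_w):
    """QCDQ style: dequantize both operands, multiply-accumulate in float."""
    return sum(dequantize(a, s_x) * dequantize(b, s_w) for a, b in zip(x_q, w_q))

def dot_qop(x_q, w_q, s_x, s_w):
    """QOp style: multiply-accumulate in integers, rescale once at the end."""
    acc = sum(a * b for a, b in zip(x_q, w_q))  # stays an int throughout
    return acc, acc * (s_x * s_w)

# Hypothetical scales and inputs, chosen only for the demonstration.
s_x, s_w = 0.05, 0.01
x_q = [quantize(v, s_x) for v in (0.5, -1.2, 3.0)]
w_q = [quantize(v, s_w) for v in (0.25, 0.1, -0.4)]

float_path = dot_qcdq(x_q, w_q, s_x, s_w)
int_acc, int_path = dot_qop(x_q, w_q, s_x, s_w)

# The two paths agree up to floating-point rounding.
assert abs(float_path - int_path) < 1e-9
```

The integer accumulator `int_acc` is the tensor a QOp-style graph would hand to the next layer; in the QCDQ-style graph that value is only implicit unless the backend fuses the pattern.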
Although we will keep the interface for layer-wise export handlers, we will be deprecating support for QOp in favour of QCDQ.