INT8/Quantization in torch-TensorRT 1.4 #2086
chichun-charlie-liu started this conversation in General
Replies: 1 comment
-
We are planning to improve INT8 support in the dynamo path and maintain it at the same level as the TorchScript path. This is scheduled for the next release.
-
Hello,
It appears that INT8 is not ready in the newly released torch-TRT 1.4: the new dynamo.compile() checks the requested precision and rejects anything other than FP32 and FP16. Digging deeper, though, there seem to be some INT8/quantization components similar to those in version 1.3.
I'm just curious whether you could elaborate a little on the INT8 implementation plan or status, and, if possible, share a schedule for a release that enables INT8.
Thanks a lot!
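For illustration, here is a minimal, hypothetical sketch of the kind of precision gate described above. The names and structure are invented for this example; the real check lives inside torch-TensorRT 1.4's dynamo.compile() argument validation, which rejects anything other than FP32/FP16:

```python
# Hypothetical sketch (not torch-TensorRT source): a precision gate that
# accepts only FP32 and FP16, mirroring the behavior observed in 1.4
# where requesting INT8 via dynamo.compile() fails up front.
SUPPORTED_PRECISIONS = {"fp32", "fp16"}  # INT8 intentionally absent

def validate_precisions(enabled_precisions):
    """Raise if any requested precision is outside the supported set."""
    unsupported = set(enabled_precisions) - SUPPORTED_PRECISIONS
    if unsupported:
        raise ValueError(
            f"Unsupported precision(s): {sorted(unsupported)}; "
            f"only {sorted(SUPPORTED_PRECISIONS)} are accepted."
        )

validate_precisions({"fp32", "fp16"})  # passes
try:
    validate_precisions({"int8"})      # rejected, as observed in 1.4
except ValueError as e:
    print(e)
```

Under this sketch, enabling INT8 would simply mean adding it to the supported set once the dynamo path gains the calibration/quantization machinery the TorchScript path already has.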