With the latest version of onnx-simplifier, I have run into the following errors on fp8 QDQ models.
Error 1: shape inference
$> onnxsim fp8.onnx fp8-sim.onnx
Simplifying...
Traceback (most recent call last):
File "/opt/conda3/bin/onnxsim", line 8, in<module>sys.exit(main())
^^^^^^
File "/opt/conda3/lib/python3.11/site-packages/onnxsim/onnx_simplifier.py", line 453, in main
model_opt, check_ok = simplify(
^^^^^^^^^
File "/opt/conda3/lib/python3.11/site-packages/onnxsim/onnx_simplifier.py", line 187, in simplify
model_opt_bytes = C.simplify(
^^^^^^^^^^^
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:QuantizeLinear, node name: pts_bbox_head.transformer.decoder.layers.0.attentions.0.attn.query_quantizer/QuantizeLinear1): [TypeInferenceError] Inferred elem type differs from existing elem type: (17) vs (INT8)
Error 2: CSETensorHash
$> onnxsim fp8.onnx fp8-sim.onnx --skip-shape-inference
Simplifying...
Traceback (most recent call last):
File "/opt/conda3/bin/onnxsim", line 8, in<module>sys.exit(main())
^^^^^^
File "/opt/conda3/lib/python3.11/site-packages/onnxsim/onnx_simplifier.py", line 453, in main
model_opt, check_ok = simplify(
^^^^^^^^^
File "/opt/conda3/lib/python3.11/site-packages/onnxsim/onnx_simplifier.py", line 187, in simplify
model_opt_bytes = C.simplify(
^^^^^^^^^^^
RuntimeError: no supported data type: 17
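For reference, the data type 17 that shows up in both tracebacks is onnx's float8_e4m3fn enum value, which recent onnx releases (1.14+, where the float8 types were introduced) expose on TensorProto:

```python
import onnx

# Data type 17 reported in both errors is float8_e4m3fn.
print(onnx.TensorProto.DataType.Name(17))  # -> "FLOAT8E4M3FN"
print(int(onnx.TensorProto.FLOAT8E4M3FN))  # -> 17
```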
Could you please add support for fp8 QDQ models?
The fp8 QDQ models follow ModelOpt's convention and have the following structure: compared to int8 QDQ models, the only difference is the data_type of the zero points (int8 vs. float8_e4m3fn).
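If it helps, here is a minimal sketch of that pattern (file and tensor names are made up for illustration, not taken from the original model): a single QuantizeLinear/DequantizeLinear pair whose zero-point initializer uses data_type FLOAT8E4M3FN (17) instead of INT8.

```python
import onnx
from onnx import TensorProto, helper

# Float inputs/outputs around a single Q/DQ pair.
x = helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 16])
y = helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 16])

scale = helper.make_tensor("scale", TensorProto.FLOAT, [], [0.1])

# Build the fp8 zero point by hand: float8 values are stored in int32_data,
# and the bit pattern 0 is 0.0 in float8_e4m3fn. data_type 17 == FLOAT8E4M3FN.
zero_point = TensorProto()
zero_point.name = "zero_point"
zero_point.data_type = TensorProto.FLOAT8E4M3FN
zero_point.int32_data.append(0)

q = helper.make_node("QuantizeLinear", ["x", "scale", "zero_point"], ["x_q"])
dq = helper.make_node("DequantizeLinear", ["x_q", "scale", "zero_point"], ["y"])

graph = helper.make_graph([q, dq], "fp8_qdq", [x], [y],
                          initializer=[scale, zero_point])
# Opset 19 is the first opset where QuantizeLinear accepts float8 zero points.
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 19)])
onnx.save(model, "fp8_min.onnx")
```

Running `onnxsim fp8_min.onnx fp8_min-sim.onnx` on the saved model should hit the same kind of failures as above.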