
Quark quantization inference errors #137

Open
heman-CL opened this issue Dec 5, 2024 · 5 comments

heman-CL commented Dec 5, 2024

Dear authors,
I've tried to use Quark quantization to generate an int8 model for the NPU.
However, when running inference on the quantized model, I get the following error:

A snippet of the model is shown below. (I used onnx.load to check the node info, and it seems an attribute is generated after Quark quantization.)

Error: Unrecognized attribute: axes for operator Squeeze. (The original model actually uses Split, but it is converted into Slice.)
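For what it's worth, in opset 13 Squeeze moved `axes` from a node attribute to an input, so a model that declares opset >= 13 but whose Squeeze nodes still carry the attribute triggers exactly this error. A minimal sketch of how I'm inspecting the quantized model (the filename is a placeholder):

```python
import onnx

# Placeholder path for the Quark-quantized model.
model = onnx.load("model_quantized.onnx")

# Model-level opset imports (empty domain means the default ai.onnx domain).
for imp in model.opset_import:
    print(imp.domain or "ai.onnx", imp.version)

# From opset 13 on, Squeeze takes `axes` as an input rather than an
# attribute, so an `axes` attribute here would explain the error above.
for node in model.graph.node:
    if node.op_type == "Squeeze":
        print(node.name,
              "attributes:", [a.name for a in node.attribute],
              "inputs:", list(node.input))
```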

Could you please take a look and comment?
Thanks

cyndwith (Collaborator) commented Dec 5, 2024

@heman-CL

  • Could you please share a few details about the specific model you are using?
  • Which ONNX opset version is the model using?

Having this information will help me reproduce the error and better understand the issue. Thank you!

cyndwith self-assigned this Dec 5, 2024
heman-CL (Author) commented Dec 6, 2024

Hi cyndwith,

I've tried converting the ONNX model again, this time with opset=11, and that issue is gone. (Squeeze-11 and Squeeze-13 have different structures.)
However, I've hit another error while running inference. It shows:

KernelParamGenPass.cpp:2130] xir::Op{name = (Squeeze_output_0_DequantizeLinear_Output), type = transpose}. This order: (0,3,2,

And then it just exits without producing anything further.
The original graph: Transpose (perm = (0,2,1)) -> Squeeze (axes = 0)
Quantized graph: Transpose (perm = (0,2,1)) -> QuantizeLinear -> DequantizeLinear -> Squeeze (axes = 0) -> QuantizeLinear -> DequantizeLinear -> ...
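In case it helps reproduce, this is roughly how I confirmed that pattern (a minimal sketch; the filename is a placeholder):

```python
import onnx

model = onnx.load("model_quantized.onnx")  # placeholder path

# Map every tensor name to the node that produces it.
producers = {out: n for n in model.graph.node for out in n.output}

# Walk backwards from each Squeeze node and print the producer chain,
# which shows where QuantizeLinear/DequantizeLinear pairs were inserted.
for node in model.graph.node:
    if node.op_type == "Squeeze":
        chain, cur = [node.op_type], node
        while cur.input and cur.input[0] in producers:
            cur = producers[cur.input[0]]
            chain.append(cur.op_type)
        print(" -> ".join(reversed(chain)))
```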

Thanks

cyndwith (Collaborator) commented Dec 9, 2024

Please try exporting or upgrading the ONNX model to opset=17, which is the recommended version for Quark quantization.
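If re-exporting from the framework isn't convenient, the ONNX version converter can upgrade an existing float model before quantization (a minimal sketch; paths are placeholders, and the converter may not cover every op):

```python
import onnx
from onnx import version_converter

# Upgrade the float model to opset 17 before running Quark quantization.
model = onnx.load("model.onnx")                       # placeholder path
converted = version_converter.convert_version(model, 17)
onnx.checker.check_model(converted)                   # sanity check
onnx.save(converted, "model_opset17.onnx")
```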

heman-CL (Author) commented
Hi cyndwith,
I've tried exporting the ONNX model with torch.onnx.export(..., opset_version=17).
The issue still happens.
However, I've found that the node properties still show ai.onnx v13 (Squeeze/Split -> node properties -> module).
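For reference, this is how I'm checking the model-level opset, as opposed to the per-node version shown in the node properties (a minimal sketch; the filename is a placeholder):

```python
import onnx

model = onnx.load("model_opset17.onnx")  # placeholder path

# The opset_import entries are what the runtime and quantizer actually
# consume; the per-node "module" shown in the viewer just reflects the
# op's own definition version.
for imp in model.opset_import:
    print(imp.domain or "ai.onnx", imp.version)
```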
Do I need to do further conversions?
Thanks

cyndwith (Collaborator) commented
It should work with opset_version=17. Note that the node properties showing ai.onnx v13 for Squeeze/Split can be expected even in an opset-17 model, since v13 is the latest definition of those ops at or below opset 17. Could you provide more information about the network so I can try to replicate this error?
