[OVEP] Add a check for type mismatches in QDQ stripping #834
base: ovep-develop
Conversation
The casting node is for the QuantizeLinear node, correct? Could you add the before and after graphs for this change? Will it introduce any performance penalty?
Please use the output type of the DQ node directly and fix the C++ Lint warnings (C-style casts).
onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.cpp
The issue that causes the use of a C-style cast for the input arg is that it is provided as const by
Let's roll back this unordered_map, which is mostly unused. As for the C-style cast, the warning can be suppressed by using const_cast<T&>().
type_str_to_tensor_data_type_["tensor(uint64)"] = ONNX_NAMESPACE::TensorProto_DataType_UINT64;
type_str_to_tensor_data_type_["tensor(complex64)"] = ONNX_NAMESPACE::TensorProto_DataType_COMPLEX64;
type_str_to_tensor_data_type_["tensor(complex128)"] = ONNX_NAMESPACE::TensorProto_DataType_COMPLEX128;
type_str_to_tensor_data_type_["tensor(string)"] = ONNX_NAMESPACE::TensorProto_DataType_STRING;
Does it really make sense to add the possibility of converting fp32->string or to complex types, especially in the context of QDQ stripping?
In any case, since QuantizeLinear and DequantizeLinear aren't operations per se, but rather meta-operations, the fact that their input and output types differ seems more like an export bug to me. Let's cover that case only, but with an additional assertion to be on the safe side.
Done.
Looks good to me, thanks!
Could you open the PR to get a review from reviewers with write access? Please also add the Jira ticket.
Description
When rewiring the graph after eliminating QDQ pairs, the runtime now checks whether the type matches before and after the eliminated nodes and inserts a Cast node if there is a mismatch.
Motivation and Context
At present, QDQ elimination assumes the floating-point type is the same before the QuantizeLinear node and after the matching DequantizeLinear node, producing errors when the types mismatch.
Jira Ticket: CVS-175447