Fix window_reverse to support batch inference when using onnx or tensorrt #166
base: main
Conversation
Thanks! I've tested the inference and did some training, which shows no inconsistency.
I am not an expert in ONNX either; I just did some experiments and found the issue. To my understanding, ONNX generates a graph of operator nodes to do the inference. While generating the graph, data is split into tensors and constants; constants are precomputed and stored in the graph. In this case, `B` is regarded as a constant when generating the graph, and its value comes from the given input. That is fine with a fixed-size input, since the input shape used to generate the ONNX file and the one used at inference are the same. ONNX also supports dynamic-shape inputs (which I am testing): you can mark certain dimensions as dynamic (usually the batch size), and I think the shape is then treated as a tensor when generating the graph. However, in the previous code, the batch size will be fixed to the input shape (…).
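To make the constant-folding behaviour described above concrete, here is a minimal, self-contained sketch. It is not BiRefNet code; the toy module, file name, input size, and axis names are all assumptions for illustration:

```python
import torch
import torch.nn as nn

class ToyReshape(nn.Module):
    """Toy stand-in that mimics the problematic pattern: casting the batch
    size to a Python int bakes the export-time value into the traced graph."""
    def forward(self, x):
        B = int(x.shape[0])   # becomes a plain Python number during export
        return x.view(B, -1)  # the exported Reshape then carries B as a constant

torch.onnx.export(
    ToyReshape().eval(),
    torch.randn(1, 3, 8, 8),
    "toy.onnx",
    input_names=["input"],
    output_names=["output"],
    # Declaring the batch axis dynamic here is not enough on its own: the
    # int(...) above has already frozen B to the export-time value (1).
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
```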
I have successfully done inference with dynamic batch size with ONNX and TensorRT and will PR some example ipynb scripts later (it takes some time). I will give some brief ideas here for those interested.
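To sketch the TensorRT side mentioned here (this is not the author's upcoming ipynb scripts, just a common pattern): an engine can be built from a dynamic-batch ONNX file with an optimization profile. The file names, input name, and shape ranges below are assumptions:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("birefnet_dynamic.onnx", "rb") as f:  # assumed file name
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX file")

config = builder.create_builder_config()
profile = builder.create_optimization_profile()
# "input" and the 1024x1024 resolution are assumptions; the batch dimension
# is given a min/opt/max range so the engine accepts dynamic batch sizes.
profile.set_shape("input", (1, 3, 1024, 1024), (4, 3, 1024, 1024), (8, 3, 1024, 1024))
config.add_optimization_profile(profile)

engine_bytes = builder.build_serialized_network(network, config)
with open("birefnet_dynamic.engine", "wb") as f:
    f.write(engine_bytes)
```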
Hi @jhwei, thanks a lot for the explanation. However, as I tested, updating only those three lines is not enough to enable dynamic batch size in ONNX model inference. Someone else also made another PR on dynamic input: #167; I'll test the modification there as well. Your PR makes sense for that part, though, and I'll accept it after figuring out all these relevant things. Thanks anyway :)
Thanks, @ZhengPeng7. PR #167 partially does exactly what I did in idea 1, and it's the fastest way. However, that PR does not really implement dynamic batch inference, as the generated ONNX file still only accepts a batch size of 1. I also just found out today that the timm Swin Transformer uses this implementation. Also, I think using … Maybe I can work out a scratch demo tomorrow for both ideas 1 & 2 for a better explanation.
@ZhengPeng7 Please see https://github.com/jhwei/BiRefNet/blob/45ec226ff1378fe0030d72c7bbdd7fbd7a7d3763/tutorials/BiRefNet_pth2onnx.ipynb for the demo supporting dynamic batch size (the code may not be clean enough). For idea 2, I tested that changing …
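Separately from the linked notebook, a quick sanity check of a dynamic-batch ONNX export can be done with onnxruntime; the file name and input resolution below are assumptions:

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("birefnet_dynamic.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

for batch in (1, 2, 4):
    x = np.random.rand(batch, 3, 1024, 1024).astype(np.float32)
    outputs = session.run(None, {input_name: x})
    # With the window_reverse fix, the output batch dimension follows the input.
    print(batch, outputs[0].shape)
```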
Thanks a lot, @jhwei. I have to say that I've been too exhausted recently, but I'll definitely test all the things you made and reply to you about whether they work well.
Thank you, take your time. It's a great chance for me to explore ONNX and TRT conversion. Feel free to post any questions when you have time. And happy Chinese New Year in advance.
This fix in `window_reverse` is useful when converting to ONNX, TensorRT, or other formats; the change does not affect PyTorch inference or training. The previous code may lead the converter to regard `B` as a constant value (which is `1` in most cases), which leads to wrong results in batch inference. The new code regards `B` as the dynamic value `-1`.

This PR applies the same change as microsoft/Swin-Transformer#257.
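For reference, a sketch of the kind of change being described, written against the standard Swin-style `window_reverse` (it mirrors microsoft/Swin-Transformer#257; the exact code in BiRefNet may differ slightly):

```python
import torch

# Before: B is computed with Python arithmetic and int(), so ONNX/TensorRT
# exporters fold it into a constant taken from the export-time input.
def window_reverse_old(windows, window_size, H, W):
    B = int(windows.shape[0] / (H * W / window_size / window_size))
    x = windows.view(B, H // window_size, W // window_size, window_size, window_size, -1)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(B, H, W, -1)
    return x

# After: the batch dimension is left as -1 and the channel dimension is taken
# from the tensor itself, so the exported graph stays batch-size agnostic.
def window_reverse(windows, window_size, H, W):
    C = windows.shape[-1]
    x = windows.view(-1, H // window_size, W // window_size, window_size, window_size, C)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, H, W, C)
    return x

# e.g. B = 4 images, H = W = 28, window_size = 7, C = 96 -> (4, 28, 28, 96)
print(window_reverse(torch.randn(64, 7, 7, 96), 7, 28, 28).shape)
```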