ONNX Model Fails to run #95
Comments
Hi @rajeevbaalwan, I would like to confirm some points:
Thanks @Masao-Someki for your reply. I have used a simple transformer encoder.
@Masao-Someki I have also tried with a conformer-encoder-based ASR model, but I am getting the same error.
@rajeevbaalwan

from espnet_onnx.export import ASRModelExport

tag_name = 'your model'
m = ASRModelExport()
# Add the following export config to raise the maximum input length
m.set_export_config(
    max_seq_len=5000,
)
m.export_from_pretrained(tag_name, quantize=False, optimize=False)
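After re-exporting, the model can be loaded for single-clip inference. A minimal sketch assuming the README-style espnet_onnx API, with 'sample.wav' as a placeholder file:

```python
import librosa
from espnet_onnx import Speech2Text

# Load the exported model by the same tag used during export (placeholder here).
speech2text = Speech2Text(tag_name='your model')

# Run inference on one clip; 'sample.wav' is a placeholder path.
y, sr = librosa.load('sample.wav', sr=16000)
nbest = speech2text(y)
print(nbest[0][0])  # first element of the best hypothesis should be the text (README-style usage)
```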
In the masking process, your input audio seems to have a frame length of 171, while the mask has a frame length of 127. This difference causes the issue. The frame length is estimated during ONNX inference, but the maximum frame length is limited by the max_seq_len value set at export time.
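A purely conceptual sketch of that mismatch (shapes only, not espnet_onnx internals): a mask sized for fewer frames than the input cannot be broadcast against the features.

```python
import numpy as np

max_seq_len_frames = 127   # mask length fixed at export time (assumed for illustration)
input_frames = 171         # frames of the failing utterance

feats = np.zeros((1, input_frames, 80), dtype=np.float32)  # (batch, time, mel)
mask = np.ones((1, max_seq_len_frames), dtype=bool)        # (batch, time)

try:
    feats * mask[:, :, None]   # (1, 171, 80) vs (1, 127, 1) -> broadcast error
except ValueError as e:
    print(e)
```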
@Masao-Someki Thanks, it worked for me. But the exported ONNX models do not work with batch input, right? They only work for a single audio clip.
@rajeevbaalwan If you want to run batched inference, then you need to modify the dynamic axes defined here:
espnet_onnx/espnet_onnx/export/asr/models/encoders/transformer.py, lines 105 to 106 (commit 7cd0f78)
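For reference, this is roughly what adding a batch dimension to the export's dynamic axes looks like with torch.onnx.export. The toy module and the input/output names ('feats', 'encoder_out') are illustrative assumptions, not the actual espnet_onnx code:

```python
import torch

# Toy stand-in for the exported encoder wrapper (illustration only).
class ToyEncoder(torch.nn.Module):
    def forward(self, feats):
        return feats.mean(dim=-1, keepdim=True)

dummy = torch.randn(1, 100, 80)  # (batch, frames, mel bins)
torch.onnx.export(
    ToyEncoder(), dummy, "toy_encoder.onnx",
    input_names=["feats"], output_names=["encoder_out"],
    dynamic_axes={
        # axis 0 marked dynamic so the graph accepts batch sizes > 1
        "feats": {0: "batch", 1: "feats_length"},
        "encoder_out": {0: "batch", 1: "enc_out_length"},
    },
)
```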
@Masao-Someki Thanks for the reply. I have already made the changes to the dynamic axes, but that alone won't solve the problem, because the forward function only takes feats and not the actual lengths of the inputs in the batch. That is why enc_out_length is always wrong for batched input, since the feature length is calculated as below.
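A hypothetical illustration of that pattern (not the actual espnet_onnx code): when the forward only receives feats, the only length it can derive is the padded time dimension, so every utterance in the batch gets the same length.

```python
import torch

# Hypothetical sketch: lengths derived from the padded tensor shape,
# so shorter utterances in the batch are assigned the padded length too.
feats = torch.zeros(4, 171, 80)  # batch of 4, padded to 171 frames
feats_length = feats.new_full((feats.size(0),), feats.size(1)).long()
print(feats_length)              # tensor([171, 171, 171, 171])
```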
Is there any plan to handle batch inference during ONNX export in espnet_onnx? The complete inference function needs to be changed. If espnet_onnx is meant to prepare models for production, then batch inference support is a must in the exported models; single-clip inference won't be enough in production.
@rajeevbaalwan
@Masao-Someki You are absolutely right that ONNX export does not give a huge speed-up for large batch sizes, but for small batch sizes like 4 or 8 it is still better than single-clip inference. So it would be better to have a GPU-based implementation, since it would be a generic implementation that works for both single clips and batches and gives the user flexibility. Even a batch implementation does not degrade performance for single-clip inference. Can you take this feature into consideration?
@Masao-Someki Is ESPnetLanguageModel supported in ONNX?
@rajeevbaalwan Yes, you can include an external language model.
@Masao-Someki I can't find the code to export the language model to ONNX in the repo.
@rajeevbaalwan See espnet_onnx/espnet_onnx/export/asr/export_asr.py, lines 113 to 126 (commit d617487).
Original issue (@rajeevbaalwan): Hi, I have exported an ESPnet model trained on my custom dataset using espnet_onnx. The model fails to work properly on some audios. Below is the error I am getting:
Any idea what the issue could be here? I have run inference on 1500 audio clips and I am getting exactly the same error on around 400 of them.