
ONNX Model Fail to run #95

Open · rajeevbaalwan opened this issue Oct 1, 2023 · 14 comments

Comments
@rajeevbaalwan commented Oct 1, 2023

Hi, I have exported an ESPnet model trained on my custom dataset using espnet_onnx. The model fails to run on some audio clips. Below is the error I am getting:

Non-zero status code returned while running Add node. Name:'/encoders/encoders.0/self_attn/Add' Status Message: /encoders/encoders.0/self_attn/Add: right operand cannot broadcast on dim 3 LeftShape: {1,8,171,171}, RightShape: {1,1,1,127}

Any idea what the issue could be here? I have run inference on 1500 audio clips and I am getting exactly the same error on around 400 of them.

@Masao-Someki (Collaborator)

Hi @rajeevbaalwan, I would like to confirm some points:

  • Would you tell me which encoder you use in your model?
  • Did you observe any similarities between them?

@rajeevbaalwan (Author)

> Hi @rajeevbaalwan, I would like to confirm some points:
>
>   • Would you tell me which encoder you use in your model?
>   • Did you observe any similarities between them?

Thanks @Masao-Someki for your reply. I have used a simple transformer encoder.
I didn't get your question regarding similarity. Do you want to know the similarity between the error outputs, or something else?

@rajeevbaalwan (Author)

@Masao-Someki I have also tried with a conformer-encoder-based ASR model, but I am getting the same error.

2023-10-08 23:12:29.048358681 [E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running Add node. Name:'/encoders/encoders.0/self_attn/Add_5' Status Message: /encoders/encoders.0/self_attn/Add_5: right operand cannot broadcast on dim 3 LeftShape: {1,8,187,187}, RightShape: {1,1,1,127}

@Masao-Someki (Collaborator)

@rajeevbaalwan
The node /encoders/encoders.0/self_attn/Add performs the masking step. I think increasing max_seq_len will fix this issue!

from espnet_onnx.export import ASRModelExport

tag_name = 'your model'
m = ASRModelExport()

# Add the following export config
m.set_export_config(
    max_seq_len=5000,
)

m.export_from_pretrained(tag_name, quantize=False, optimize=False)
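
For completeness, the re-exported model can then be loaded and run as usual. A minimal sketch, assuming the standard espnet_onnx inference API and a 16 kHz mono clip at sample.wav:

import librosa
from espnet_onnx import Speech2Text

# load the re-exported model by its tag name
speech2text = Speech2Text(tag_name='your model')

# a previously failing clip, resampled to 16 kHz mono
y, sr = librosa.load('sample.wav', sr=16000)

nbest = speech2text(y)
print(nbest[0][0])  # best hypothesis text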

@Masao-Someki (Collaborator)

In the masking process, your input audio seems to have a frame length of 171, while the mask has a frame length of 127; this mismatch causes the issue. The frame length is estimated during ONNX inference, but the maximum frame length is capped at the max_seq_len value, so increasing this value should fix the problem.
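
To see why these shapes clash, here is a small standalone sketch (plain NumPy, not espnet_onnx code) that reproduces the failing broadcast from the error message:

import numpy as np

# Shapes taken from the error above: the additive attention mask was
# exported with max_seq_len frames (127), but the clip yields 171 frames.
scores = np.zeros((1, 8, 171, 171), dtype=np.float32)  # (batch, heads, time, time)
mask = np.zeros((1, 1, 1, 127), dtype=np.float32)      # (1, 1, 1, max_seq_len)

try:
    scores + mask  # the same broadcast the failing Add node attempts
except ValueError as e:
    print(e)  # operands could not be broadcast together ...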

@rajeevbaalwan (Author)

@Masao-Someki Thanks, that worked for me. But the exported ONNX models do not work with batched input, right? They only work for a single audio clip.

@Masao-Someki (Collaborator)

@rajeevbaalwan
Yes, it does not work with batched input.

If you want to run batched inference, then you need to:

  1. Add a dynamic axis for the batch dimension in the script below (a hypothetical batched variant is sketched after the code).
  2. Fix the inference function.

def get_dynamic_axes(self):
    return {"feats": {1: "feats_length"}, "encoder_out": {1: "enc_out_length"}}
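
For reference, a hypothetical variant that also marks the batch dimension (assumed to be dim 0) as dynamic might look like the following; this is a sketch, not the current espnet_onnx code:

def get_dynamic_axes(self):
    # Hypothetical: additionally expose dim 0 (batch) as a dynamic axis
    # so the exported graph accepts more than one clip at a time.
    return {
        "feats": {0: "batch", 1: "feats_length"},
        "encoder_out": {0: "batch", 1: "enc_out_length"},
    }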

@rajeevbaalwan (Author) commented Oct 10, 2023

@Masao-Someki Thanks for the reply. I have already made the changes to the dynamic axes, but that alone won't solve the problem: the forward function only takes feats, not the actual lengths of the inputs in the batch, so enc_out_length is always wrong for batched input because the feature lengths are calculated as below:

feats_length = torch.ones(feats[:, :, 0].shape).sum(dim=-1).type(torch.long)

Is there any plan to handle batch inference during ONNX export in espnet_onnx? The complete inference function would need to be changed. If espnet_onnx is meant to prepare models for production, then batch inference support is a must in the exported models; single-clip inference won't be enough in production.
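
For illustration, a batch-capable export would need the wrapper's forward to take the true lengths as a second ONNX input instead of deriving them from the padded tensor. A hypothetical sketch (the signature and encoder call are assumptions, not the current espnet_onnx API):

import torch

def forward(self, feats: torch.Tensor, feats_length: torch.Tensor):
    # Hypothetical: feats is (batch, time, feat), zero-padded per batch;
    # feats_length is (batch,) holding each clip's true frame count, so
    # masks and enc_out_length are computed per clip, not from padding.
    encoder_out, encoder_out_lens = self.encoder(feats, feats_length)
    return encoder_out, encoder_out_lens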

@Masao-Someki (Collaborator)

@rajeevbaalwan
Sorry for the inconvenience, but currently we have no plan to support batch inference.
We investigated the speedup from batched inference in our paper by trying to apply ONNX HuBERT for training, but ONNX seems to be less effective with large batch sizes.

@rajeevbaalwan (Author)

@Masao-Someki You are absolutely right that ONNX-exported models do not give a huge speedup for large batch sizes, but for small batch sizes like 4 or 8, batching is better than single-clip inference. So it would be better to have a GPU-based implementation, since a generic implementation would work for both single clips and multiple clips and give the user that flexibility. Even a batch implementation doesn't degrade performance for single-clip inference. So can you take this feature into consideration?

@rajeevbaalwan (Author)

@Masao-Someki Is ESPnetLanguageModel supported in ONNX?

@Masao-Someki (Collaborator)

@rajeevbaalwan
I assume that the user of this library is more like an individual who wants to run the ESPnet model on a low-resource device, such as a Raspberry Pi. If inference in the ONNX format does not provide enough speedup, then we don't need espnet_onnx; we can just use a GPU.
Of course, I know that having a batched inference option may be better, but I don't think it is worth implementing here.

> Is ESPnetLanguageModel supported in ONNX?

Yes, you can include an external language model.

@rajeevbaalwan (Author)

> @rajeevbaalwan I assume that the user of this library is more like an individual who wants to run the ESPnet model on a low-resource device, such as a Raspberry Pi. If inference in the ONNX format does not provide enough speedup, then we don't need espnet_onnx; we can just use a GPU. Of course, I know that having a batched inference option may be better, but I don't think it is worth implementing here.
>
> > Is ESPnetLanguageModel supported in ONNX?
>
> Yes, you can include an external language model.

@Masao-Someki I can't find the code that exports the language model to ONNX in the repo.

@Masao-Someki (Collaborator)

@rajeevbaalwan
In the following lines, espnet_onnx has the export logic for language models!

# export lm
lm_model = None
if not model.asr_model.use_transducer_decoder:
    if "lm" in model.beam_search.scorers.keys():
        lm_model = get_lm(model.beam_search.scorers["lm"], self.export_config)
else:
    if model.beam_search_transducer.use_lm:
        lm_model = get_lm(model.beam_search_transducer.lm, self.export_config)
if lm_model is not None:
    self._export_lm(lm_model, export_dir, verbose)
    model_config.update(lm=lm_model.get_model_config(export_dir))
else:
    model_config.update(lm=dict(use_lm=False))
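
In other words, if the pretrained model's beam search contains an "lm" scorer, the language model is exported automatically alongside the ASR model; there is no separate export call. A hedged sketch of the user-facing flow:

from espnet_onnx.export import ASRModelExport

m = ASRModelExport()
# If the source model's beam search includes an "lm" scorer, the branch
# above exports it and records it in the generated model config.
m.export_from_pretrained('your model', quantize=False)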
