Can't reproduce the result by following your instructions #1

mangostation · 2023-12-04T08:39:03Z

Hi,
I am trying to reproduce the UMA model proposed in your paper, but I ran into a problem. Apart from using a different espnet version (I have other experiments running on it), I followed the steps in your instructions.
The only settings I changed were batch_bins and accum_grad, which I set to 6250000 and 4 so that I could train on my RTX 2080 Ti.
Here are my loss and CER curves. I have trained twice and hit the same problem both times.
If you have seen this problem before, I hope you can share your experience. Thanks a lot.

Replies in Chinese or English are both fine.

FnoY0723 · 2023-12-10T14:22:18Z

I'm sorry for not getting back to you sooner. Which dataset are you referring to? Based on the training curve, the issue may be related to setting the learning rate too high.

I have added the training process for the models mentioned in the article to each dataset folder in egs2. I hope this will be helpful to you.
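For anyone comparing configs while chasing a learning-rate problem: a rough way to see how batch_bins and accum_grad interact is to compute how many bins feed one optimizer step, since gradients are accumulated over accum_grad mini-batches before each update. The sketch below is only an illustration. The reference values and the linear-scaling rule are assumptions, not settings taken from this repository, and what batch_bins counts depends on the batch_type in use.

```python
def bins_per_update(batch_bins: int, accum_grad: int) -> int:
    """Bins that contribute to a single optimizer step."""
    return batch_bins * accum_grad


def scaled_lr(reference_lr: float, reference_bins: int, new_bins: int) -> float:
    """Linear-scaling heuristic (an assumption, not the authors' rule):
    change lr in proportion to the bins per optimizer step."""
    return reference_lr * new_bins / reference_bins


# Values reported in the first post of this thread.
mine = bins_per_update(batch_bins=6_250_000, accum_grad=4)

# HYPOTHETICAL reference setup, for illustration only.
# Replace with the values from the config you are comparing against.
reference = bins_per_update(batch_bins=50_000_000, accum_grad=1)
reference_lr = 1.0e-3

print("bins per update (mine):     ", mine)
print("bins per update (reference):", reference)
print("lr suggested by heuristic:  ", scaled_lr(reference_lr, reference, mine))
```

If the bins per update end up much smaller than in the reference config while lr is left unchanged, lowering lr is a cheap first experiment.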

mangostation · 2023-12-12T17:59:39Z

Hi,
I found that some training settings in umaconf differ from the config.yaml in your aishell experiment, for no stated reason.
The max_epoch, accum_grad, batch_size, batch_bins, and lr values are different.
I have changed the config to match your experiment settings and am training again.
I will post the result when training completes.
Thanks.

FnoY0723 · 2023-12-13T04:00:58Z

The AISHELL-1 uma_conformer result I uploaded is from an earlier version. That experiment was run before we had standardized the experimental settings, which is why the batch-size settings and the other values you mentioned differ.

Later on, we ran experiments that follow "train_asr_uma_conformer.yaml", and the final CER (character error rate) of the two experiments was consistent. Therefore, you can refer to either set of experimental settings.

mangostation · 2023-12-24T05:22:49Z

Hi,
Sorry for the late reply.
I retried on a V100 with all settings following this GitHub repository,
and then I got this.

I suspect the problem may be in the CTC loss.
If you have any ideas, please let me know.
Thanks.
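
One CTC-related condition that can be checked independently of the model: CTC can only align a target if the sequence passed to the loss has at least as many frames as the target has labels, plus one extra frame for every pair of adjacent identical labels (a blank must separate them). Whether this is what happens in the run above is unknown. The sketch below is a generic check, and the function names are made up for illustration.

```python
from typing import Sequence


def min_ctc_frames(target: Sequence[int]) -> int:
    """Smallest number of frames for which CTC has a valid alignment."""
    # Each adjacent repeated label needs a blank frame between the two copies.
    repeats = sum(1 for a, b in zip(target, target[1:]) if a == b)
    return len(target) + repeats


def ctc_feasible(num_frames: int, target: Sequence[int]) -> bool:
    """True if a sequence of num_frames frames can be aligned to target."""
    return num_frames >= min_ctc_frames(target)


# Toy example: label ids are arbitrary integers.
target = [3, 3, 5]
print(min_ctc_frames(target))   # 4
print(ctc_feasible(3, target))  # False
print(ctc_feasible(4, target))  # True
```

Here num_frames means the length of the sequence that actually reaches the CTC loss, after any subsampling or aggregation. When the condition fails, the CTC loss for that utterance is infinite; in PyTorch, torch.nn.CTCLoss has a zero_infinity option that zeroes such losses and their gradients.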

FnoY0723 · 2023-12-25T04:28:58Z

I have not run into this problem myself, but I searched the ESPnet issues and this answer could be useful:
espnet/espnet#3170 (comment)

FnoY0723 closed this as not planned Dec 10, 2023

FnoY0723 reopened this Dec 10, 2023
