update_231

update_231 update_ms231
mindspore-lab · Nov 6, 2024 · cc7db33
1 parent 4c7b609
commit cc7db33
Showing 6 changed files with 70 additions and 68 deletions.
diff --git a/README.md b/README.md
@@ -24,11 +24,11 @@ MindAudio is a toolbox of audio models and algorithms based on [MindSpore](https
 
 The following is the corresponding `mindaudio` versions and supported `mindspore` versions.
 
-| `mindspore`  | `mindaudio` |
-|--------------|-------------|
-| `master`     | `master`    |
-| `2.3.0`      | `0.4`       |
-| `2.2.10`     | `0.3`       |
+| `mindaudio` | `mindspore`         |
+|-------------|---------------------|
+| `master`    | `master`            |
+| `0.4`       | `2.3.0`/`2.3.1`     |
+| `0.3`       | `2.2.10`            |
 
 ### data processing
 

diff --git a/README_CN.md b/README_CN.md
@@ -22,11 +22,11 @@ MindAudio 是基于 [MindSpore](https://www.mindspore.cn/) 的音频模型和算
 
 The table below lists each `mindaudio` version and the `mindspore` versions it supports.
 
-| `mindspore`  | `mindaudio` |
-|--------------|-------------|
-| `master`     | `master`    |
-| `2.3.0`      | `0.4`       |
-| `2.2.10`     | `0.3`       |
+| `mindaudio` | `mindspore`         |
+|-------------|---------------------|
+| `master`    | `master`            |
+| `0.4`       | `2.3.0`/`2.3.1`     |
+| `0.3`       | `2.2.10`            |
 
 ### Data processing
 

diff --git a/examples/conformer/readme.md b/examples/conformer/readme.md
@@ -16,6 +16,10 @@ The overall structure of Conformer includes SpecAug, ConvolutionSubsampling, Lin
 
   ![image-20230310165349460](https://raw.githubusercontent.com/mindspore-lab/mindaudio/main/tests/result/conformer.png)
 
+## Requirements
+| mindspore     |   ascend driver        | firmware     |  cann toolkit/kernel    |
+|:-------------:|:----------------------:|:------------:|:-----------------------:|
+|     2.3.1     |   24.1.RC2             | 7.3.0.1.231  |  8.0.RC2.beta1          |
 
 
 ## Usage Steps
@@ -103,35 +107,21 @@ python predict.py --config_path ./conformer.yaml
 # using ctc prefix beam search decoder
 python predict.py --config_path ./conformer.yaml --decode_mode ctc_prefix_beam_search
 
-# using attention decoder
-python predict.py --config_path ./conformer.yaml --decode_mode attention
-
 # using attention rescoring decoder
 python predict.py --config_path ./conformer.yaml --decode_mode attention_rescoring
 ```
 
 
-
 ## Model Performance
-The training config can be found in the [conformer.yaml](https://github.com/mindspore-lab/mindaudio/blob/main/examples/conformer/conformer.yaml).
-
-Performance tested on ascend 910 (8p) with graph mode:
-
-| model     | decoding mode          | CER          |
-|-----------|------------------------|--------------|
-| conformer | ctc greedy search      | 5.35         |
-| conformer | ctc prefix beam search | 5.36         |
-| conformer | attention decoder      | comming soon |
-| conformer | attention rescoring    | 4.95         |
-- [weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-548ee31b.ckpt) can be downloaded here.
-
----
-Performance tested on ascend 910* (8p) with graph mode:
-
-| model     | decoding mode          | CER          |
-|-----------|------------------------|--------------|
-| conformer | ctc greedy search      | 5.62         |
-| conformer | ctc prefix beam search | 5.62         |
-| conformer | attention decoder      | comming soon |
-| conformer | attention rescoring    | 5.12         |
-- [weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-692d57b3-910v2.ckpt) can be downloaded here.
+
+Experiments were run on Ascend 910* with MindSpore 2.3.1 in graph mode:
+
+| model name| cards | batch type | jit level | s/step | recipe | weight |     decoding mode     | cer  |
+|:---------:|:----:|:----------:|:---------:|:------:|:------:|:------:|:---------------------:|:----:|
+| conformer |   8  |  bucket    |     O0    |  0.72  |[conformer.yaml](https://github.com/mindspore-lab/mindaudio/blob/main/examples/conformer/conformer.yaml)  |[weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-692d57b3-910v2.ckpt)     |ctc greedy search      | 5.62 |
+| conformer |   8  |  bucket    |     O0    |  0.72  |[conformer.yaml](https://github.com/mindspore-lab/mindaudio/blob/main/examples/conformer/conformer.yaml)  |[weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-692d57b3-910v2.ckpt)     |ctc prefix beam search | 5.62 |
+| conformer |   8  |  bucket    |     O0    |  0.72  |[conformer.yaml](https://github.com/mindspore-lab/mindaudio/blob/main/examples/conformer/conformer.yaml)  |[weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-692d57b3-910v2.ckpt)     |attention rescoring    | 5.12 |
+<<<<<<< HEAD
+
+=======
+>>>>>>> 1d72af4 (update_231)
diff --git a/examples/conformer/readme_cn.md b/examples/conformer/readme_cn.md
@@ -16,6 +16,11 @@ Conformer整体结构包括：SpecAug、ConvolutionSubsampling、Linear、Dropou
 
   ![image-20230310165349460](https://raw.githubusercontent.com/mindspore-lab/mindaudio/main/tests/result/conformer.png)
 
+## Requirements
+| mindspore     |   ascend driver        | firmware     |  cann toolkit/kernel    |
+|:-------------:|:----------------------:|:------------:|:-----------------------:|
+|     2.3.1     |   24.1.RC2             | 7.3.0.1.231  |  8.0.RC2.beta1          |
+
 
 ## Usage Steps
 
@@ -102,32 +107,20 @@ python predict.py --config_path ./conformer.yaml
 # using ctc prefix beam search decoder
 python predict.py --config_path ./conformer.yaml --decode_mode ctc_prefix_beam_search
 
-# using attention decoder
-python predict.py --config_path ./conformer.yaml --decode_mode attention
-
 # using attention rescoring decoder
 python predict.py --config_path ./conformer.yaml --decode_mode attention_rescoring
 ```
 
 ## **Model Performance**
-The training configuration is in [conformer.yaml](https://github.com/mindspore-lab/mindaudio/blob/main/examples/conformer/conformer.yaml).
-
-Performance tested on ascend 910 (8p) in graph mode:
-
-| model     | decoding mode          | CER          |
-| --------- | ---------------------- |--------------|
-| conformer | ctc greedy search      | 5.35         |
-| conformer | ctc prefix beam search | 5.36         |
-| conformer | attention decoder      | comming soon |
-| conformer | attention rescoring    | 4.95         |
-- Trained [weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-548ee31b.ckpt) can be downloaded here.
----
-Performance tested on ascend 910* (8p) in graph mode:
-
-| model     | decoding mode          | CER          |
-| --------- | ---------------------- |--------------|
-| conformer | ctc greedy search      | 5.62         |
-| conformer | ctc prefix beam search | 5.62         |
-| conformer | attention decoder      | comming soon |
-| conformer | attention rescoring    | 5.12         |
-- Trained [weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-692d57b3-910v2.ckpt) can be downloaded here.
+
+Performance tested on ascend 910* with mindspore 2.3.1 in graph mode:
+
+| model name| cards | batch type | jit level | s/step | recipe | weight |     decoding mode     | cer  |
+|:---------:|:----:|:----------:|:---------:|:------:|:------:|:------:|:---------------------:|:----:|
+| conformer |   8  |  bucket    |     O0    |  0.72  |[conformer.yaml](https://github.com/mindspore-lab/mindaudio/blob/main/examples/conformer/conformer.yaml)  |[weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-692d57b3-910v2.ckpt)     |ctc greedy search      | 5.62 |
+| conformer |   8  |  bucket    |     O0    |  0.72  |[conformer.yaml](https://github.com/mindspore-lab/mindaudio/blob/main/examples/conformer/conformer.yaml)  |[weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-692d57b3-910v2.ckpt)     |ctc prefix beam search | 5.62 |
+| conformer |   8  |  bucket    |     O0    |  0.72  |[conformer.yaml](https://github.com/mindspore-lab/mindaudio/blob/main/examples/conformer/conformer.yaml)  |[weights](https://download-mindspore.osinfra.cn/toolkits/mindaudio/conformer/conformer_avg_30-692d57b3-910v2.ckpt)     |attention rescoring    | 5.12 |
+<<<<<<< HEAD
+
+=======
+>>>>>>> 1d72af4 (update_231)
diff --git a/examples/deepspeech2/readme.md b/examples/deepspeech2/readme.md
@@ -3,7 +3,13 @@
 
 ## Introduction
 
-DeepSpeech2 is a speech recognition model trained using CTC loss. It replaces the entire manually designed component pipeline with neural networks and can handle a variety of speech, including noisy environments, accents, and different languages. The currently provided version supports using the [DeepSpeech2](http://arxiv.org/pdf/1512.02595v1.pdf) model for training/testing and inference on the librispeech dataset on NPU and GPU.
+DeepSpeech2 is a speech recognition model trained using CTC loss. It replaces the entire manually designed component pipeline with neural networks and can handle a variety of speech, including noisy environments, accents, and different languages. The currently provided version supports using the [DeepSpeech2](http://arxiv.org/pdf/1512.02595v1.pdf) model for training/testing and inference on the librispeech dataset on NPU.
+
+
+### Requirements
+| mindspore     |   ascend driver        | firmware     |  cann toolkit/kernel    |
+|:-------------:|:----------------------:|:------------:|:-----------------------:|
+|     2.3.1     |   24.1.RC2             | 7.3.0.1.231  |  8.0.RC2.beta1          |
 
 ### Model Architecture
 
@@ -96,6 +102,12 @@ python eval.py -c "./deepspeech2.yaml"
 
 ## **Model Performance**
 
-| Model        | Machine   | LM   | Test Clean CER | Test Clean WER | Parameters                                                                                               | Weights                                                         |
-|--------------|-----------|------|----------------|----------------|----------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------|
-| DeepSpeech2  | D910x8-G  | No   | 3.461          | 10.24          | [yaml](https://github.com/mindsporelab/mindaudio/blob/main/example/deepspeech2/deepspeech2.yaml)          | [weights](https://download.mindspore.cn/toolkits/mindaudio/deepspeech2/deepspeech2.ckpt)               |
+Experiments were run on Ascend 910* with MindSpore 2.3.1 in graph mode:
+
+<<<<<<< HEAD
+| model name | cards | batch size | jit level | s/step | recipe | weight | test clean cer | test clean wer | 
+=======
+| model name | cards | batch size | jit level | s/step | recipe | weight | test clean cer | test clean wer |
+>>>>>>> 1d72af4 (update_231)
+|:----------:|:-----:|:----------:|:---------:|:------:|:------:|:------:|:--------------:|:--------------:|
+| deepspeech2|   8   |   64       |    O0     |  2.82  | [yaml](https://github.com/mindsporelab/mindaudio/blob/main/example/deepspeech2/deepspeech2.yaml) | [weights](https://download.mindspore.cn/toolkits/mindaudio/deepspeech2/deepspeech2.ckpt)| 3.461 | 10.24 |
diff --git a/examples/deepspeech2/readme_cn.md b/examples/deepspeech2/readme_cn.md
@@ -4,7 +4,13 @@
 
 ## Introduction
 
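-DeepSpeech2 is a speech recognition model trained with CTC loss. It replaces the entire pipeline of hand-designed components with neural networks and can handle a wide variety of speech, including noisy environments, accents, and different languages. The current version supports training/testing and inference with the [DeepSpeech2](http://arxiv.org/pdf/1512.02595v1.pdf) model on the librispeech dataset on NPU and GPU.
+DeepSpeech2 is a speech recognition model trained with CTC loss. It replaces the entire pipeline of hand-designed components with neural networks and can handle a wide variety of speech, including noisy environments, accents, and different languages. The current version supports training/testing and inference with the [DeepSpeech2](http://arxiv.org/pdf/1512.02595v1.pdf) model on the librispeech dataset on NPU.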
+
+
+### Requirements
+| mindspore     |   ascend driver        | firmware     |  cann toolkit/kernel    |
+|:-------------:|:----------------------:|:------------:|:-----------------------:|
+|     2.3.1     |   24.1.RC2             | 7.3.0.1.231  |  8.0.RC2.beta1          |
 
 ### Model Architecture
 
@@ -16,6 +22,7 @@ DeepSpeech2是一种采用CTC损失训练的语音识别模型。它用神经网
 - Five bidirectional LSTM layers (size 1024)
 - One projection layer (size 28: the number of characters plus 1 for the CTC blank symbol)
 
+
 ### Data processing
 
 - Audio:
@@ -104,6 +111,6 @@ python eval.py -c "./deepspeech2.yaml"
 
 ## **Model Performance**
 
-| model        | LM   | test clean cer| test clean wer | config                                     | weights|
-| ----------- | ---- | -------------- | -------------- |--------------------------------------------------------------------------------------------------| ------------------------------------------------------------ |
-| deepspeech2 | No   | 3.461          | 10.24          | [yaml](https://github.com/mindsporelab/mindaudio/blob/main/example/deepspeech2/deepspeech2.yaml) | [weights](https://download.mindspore.cn/toolkits/mindaudio/deepspeech2/deepspeech2.ckpt) |
+| model name | cards | batch size | jit level | s/step | recipe | weight | test clean cer | test clean wer | 
+|:----------:|:-----:|:----------:|:---------:|:------:|:------:|:------:|:--------------:|:--------------:|
+| deepspeech2|   8   |   64       |    O0     |  2.82  | [yaml](https://github.com/mindsporelab/mindaudio/blob/main/example/deepspeech2/deepspeech2.yaml) | [weights](https://download.mindspore.cn/toolkits/mindaudio/deepspeech2/deepspeech2.ckpt)| 3.461 | 10.24 |
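
Review note: the `mindaudio`/`mindspore` pairing that this commit writes into the README table could also be checked in code before running the examples. The following is only a minimal sketch, assuming the table is the sole source of truth; the `SUPPORTED` mapping and the `is_supported` helper are hypothetical names and are not part of MindAudio.

```python
# Hypothetical mapping that mirrors the README compatibility table in this commit.
# Keys are mindaudio versions, values are the mindspore versions listed for them.
SUPPORTED = {
    "master": ("master",),
    "0.4": ("2.3.0", "2.3.1"),
    "0.3": ("2.2.10",),
}


def is_supported(mindaudio_version: str, mindspore_version: str) -> bool:
    """Return True if the pair appears in the README compatibility table."""
    return mindspore_version in SUPPORTED.get(mindaudio_version, ())


if __name__ == "__main__":
    # Check the pairing used by the conformer and deepspeech2 requirement tables.
    print(is_supported("0.4", "2.3.1"))
```

An unknown `mindaudio` version falls back to an empty tuple, so the helper reports it as unsupported rather than raising.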