Model quantization #24

Yosshi999 · 2023-09-23T08:39:54Z

I wrote a script that applies static quantization to the decode onnx model.
I have not tried the full calibration dataset (1000 samples) yet, but quantizing with 30 samples (python quantize.py --iteration_max 30) gave the following results.
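For readers unfamiliar with static quantization: value ranges are fixed ahead of time from calibration samples, and tensors are then mapped to 8-bit integers using a scale and a zero point. The sketch below only illustrates that idea in plain Python. It is not the code in quantize.py, it assumes simple min/max calibration with asymmetric uint8 quantization, and the helper names (`calibrate_range`, `qparams`, `quantize`, `dequantize`) are hypothetical.

```python
def calibrate_range(batches):
    """Track the global min/max observed over all calibration batches."""
    lo = min(min(batch) for batch in batches)
    hi = max(max(batch) for batch in batches)
    # The range must contain 0.0 so that zero is represented exactly.
    return min(lo, 0.0), max(hi, 0.0)


def qparams(lo, hi, qmin=0, qmax=255):
    """Derive scale and zero point for asymmetric uint8 quantization."""
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 if the range is empty
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to an integer code, clipping values outside the calibrated range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate float."""
    return (q - zero_point) * scale
```

Values inside the calibrated range come back with an error of at most about half a quantization step, while values outside it are clipped. That is one reason the number and variety of calibration samples can matter.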

torch vs onnx

=== time for processing 20 texts ===
baseline: 245.578 sec
model: 171.609 sec
x1.431 faster
=== model's mean PSNR (higher is better) ===
85.95259521562996 dB

torch vs quantized onnx

=== time for processing 20 texts ===
baseline: 237.828 sec
model: 76.859 sec
x3.094 faster
=== model's PSNR (higher is better) ===
28.151807769325913 dB

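In other words, the audio is noticeably degraded, but inference is about 3x faster than the torch version and roughly 2x faster than the original onnx (171.609 sec vs 76.859 sec).
The model used for this test is 0.0.1, so we still need to check how much the audio degrades with the actual model and what happens when the calibration dataset is larger.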
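As a reference for reading the PSNR numbers above, here is a minimal sketch of how PSNR between two waveforms is commonly computed. The exact definition used by test.py (peak value, alignment, averaging over texts) is not shown in this thread, so this version assumes equal-length waveforms normalized to [-1, 1] with a peak of 1.0. The function name `psnr` is illustrative.

```python
import math


def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length waveforms."""
    if len(reference) != len(test):
        raise ValueError("waveforms must have the same length")
    # Mean squared error between the baseline and the model output.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR equals 20 * log10(peak / RMS error), a drop of 20 dB corresponds to a tenfold increase in RMS error.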

Hiroshiba

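Oh!!! Thank you so much!!!

I did not know about this at all. Does calibration work by using the results of actually running inference with the original model?
I would love to make time to try it out!!!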

Yosshi999 · 2023-09-23T13:33:27Z

"using the results of actually running inference with the original model"

That's right. I am using the onnx generated by convert.

Patchethium · 2023-09-23T14:15:18Z

Quantization usually doesn't work well with models that produce continuous values. Have you tried pruning?

Yosshi999 · 2023-09-23T15:07:05Z

@Patchethium
I'm not familiar with pruning, and I don't know how to prune a convolution kernel. By decreasing the number of channels, or the kernel width?
I would be grateful if you could point me to some tutorial pages.

Patchethium · 2023-09-23T16:41:40Z

This may give you a brief introduction: https://github.com/VainF/Torch-Pruning#0-how-it-works
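To make the "channels or kernel width?" question concrete, here is a minimal sketch of one common form of structured pruning: dropping whole output channels of a convolution, ranked by L1 norm. This is plain Python for illustration only. It is not the Torch-Pruning API, the weight layout `[out_channels][in_channels][kernel_width]` is an assumption, and all function names are hypothetical.

```python
def channel_l1_norms(weight):
    """L1 norm of each output channel; weight is [out_channels][in_channels][kernel_width]."""
    return [sum(abs(w) for kernel in channel for w in kernel) for channel in weight]


def prune_output_channels(weight, keep_ratio):
    """Keep the output channels with the largest L1 norms, preserving their order."""
    norms = channel_l1_norms(weight)
    n_keep = max(1, round(len(weight) * keep_ratio))
    ranked = sorted(range(len(weight)), key=lambda i: norms[i], reverse=True)
    keep = sorted(ranked[:n_keep])
    return [weight[i] for i in keep], keep


def prune_input_channels(next_weight, keep):
    """Slice the following layer's input channels to match the pruned outputs."""
    return [[channel[i] for i in keep] for channel in next_weight]
```

Removing an output channel changes the input shape of every layer that consumes it, so the following layer has to be sliced with the same indices. Tracking those dependencies across a whole network is the hard part in practice.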

Hiroshiba · 2023-11-10T18:21:14Z

@Yosshi999 Sorry for the long wait, I tried it out!!!!
To give the result first: running quantize_static with 1000 samples gave a PSNR of about 26.00 dB.

Here is the log!

$ python quantize.py --onnx_dir onnxmodel/ --output_dir quantmodel/ --speaker_size 3 --use_gpu
2023-11-11 01:23:47,608 - quantize - INFO - loading calibration texts
2023-11-11 01:23:47,657 - quantize - INFO - loaded 1000 texts
2023-11-11 01:23:47,658 - quantize - INFO - loading forwarder
2023-11-11 01:23:50,020 - quantize - INFO - loaded forwarder in cuda mode
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [12:57<00:00,  1.29it/s] 

$ python test.py --baseline hiho_model/ --model quantmodel/ --speaker_id 2
loaded 20 texts
yukarin_s loaded!
yukarin_sa loaded!
yukarin_sosoa loaded!
Removing weight norm...
hifi-gan loaded!
torch: 20it [00:28,  1.41s/it]
onnx: 20it [00:11,  1.70it/s]
=== time for processing 20 texts ===
baseline: 149.125 sec
model: 84.969 sec
x1.755 faster
=== model's PSNR (higher is better) ===
26.008260494727715 dB

$ python run.py --yukarin_s_model_dir onnxmodel/ --yukarin_sa_model_dir onnxmodel/ --yukarin_sosoa_model_dir onnxmodel/ --hifigan_model_dir onnxmodel/ --method onnx --speaker_ids 1 2

$ python run.py --yukarin_s_model_dir quantmodel/ --yukarin_sa_model_dir quantmodel/ --yukarin_sosoa_model_dir quantmodel/ --hifigan_model_dir quantmodel/ --method onnx --speaker_ids 1 2

By the way, with 100 samples the PSNR was 25.62 dB.
Here are the sample audio clips from the 100-sample run!

onnx-.-2.mp4

onnx-.-1.mp4

For comparison, here is the output without quantization.

onnx-.-2.mp4

onnx-.-1.mp4

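Comparing them, it sounds to me like white noise has been added across the whole signal!!!
I think this is quite an interesting result!

I do not know the current algorithm, so this is only a rough guess, but quantizing only the hifigan part might reduce the quality loss per unit of speed-up.
From a quick look, it also seems possible to analyze which parts change the most.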
https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#quantization-debugging
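One way to approach that kind of analysis is to compare intermediate tensors from the float model with the matching tensors from the quantized model, then rank layers by how much they differ. The sketch below is plain Python and does not call the onnxruntime debugging helpers. It assumes the activations have already been dumped into dictionaries keyed by layer name, and the function names and the SQNR metric are illustrative choices.

```python
import math


def sqnr_db(reference, approx):
    """Signal-to-quantization-noise ratio in dB (higher means less damage)."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    if noise == 0:
        return math.inf
    if signal == 0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)


def rank_by_damage(float_acts, quant_acts):
    """Return (layer_name, sqnr_db) pairs, most damaged layer first."""
    scores = {
        name: sqnr_db(float_acts[name], quant_acts[name])
        for name in float_acts
        if name in quant_acts
    }
    return sorted(scores.items(), key=lambda item: item[1])
```

Layers at the top of the ranking are candidates for being left in float precision, which would be one way to test the idea of quantizing only part of the pipeline.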

Hiroshiba

Oh, adding onnxmodel and quantmodel to .gitignore might make this more convenient to use!

Yosshi999 added 7 commits September 17, 2023 15:12

quantize script

b6ea00f

separate temporary files from generated artifacts

6e5d768

update description to use onnxmodel

a5fd7c6

test.py for comparing to baseline

5f24099

create new folder to quantize

6ca4ea0

add iteration_max for testing quantization

08372b6

add description for quantization

35116fd

Hiroshiba reviewed Sep 23, 2023

Hiroshiba reviewed Nov 10, 2023
