Support different tts model types. #1541

csukuangfj · 2024-03-11T04:44:49Z

The model type trades quality for speed: low (quality) runs faster, high (quality) runs slower.

Need to test it before merging.


csukuangfj · 2024-03-11T04:45:36Z

egs/ljspeech/TTS/vits/vits.py

@@ -38,6 +39,36 @@
    "hifigan_multi_scale_multi_period_discriminator": HiFiGANMultiScaleMultiPeriodDiscriminator,  # NOQA
 }

+LOW_CONFIG = {


The config values are from
https://github.com/rhasspy/piper/blob/master/src/python/piper_train/__main__.py#L68
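As a rough sketch of how per-type configs such as `LOW_CONFIG` could be applied, the snippet below merges a set of overrides into a baseline. The names `BASE_GENERATOR_PARAMS`, `MODEL_TYPE_OVERRIDES`, and `get_generator_params` are hypothetical, and all numeric values are placeholders for illustration; the actual values are the ones in this PR and in the piper file linked above.

```python
import copy

# Hypothetical baseline settings. Placeholder values, NOT the values from this PR.
BASE_GENERATOR_PARAMS = {
    "hidden_channels": 192,
    "decoder_channels": 512,
    "decoder_upsample_scales": [8, 8, 2, 2],
}

# Hypothetical per-type overrides. Smaller layers mean a faster, lower-quality model.
MODEL_TYPE_OVERRIDES = {
    "low": {"hidden_channels": 96, "decoder_channels": 256},
    "medium": {"decoder_channels": 256},
    "high": {},
}


def get_generator_params(model_type: str) -> dict:
    """Return a fresh parameter dict for the given model type."""
    if model_type not in MODEL_TYPE_OVERRIDES:
        raise ValueError(
            f"Unknown model type: {model_type!r}. "
            f"Expected one of {sorted(MODEL_TYPE_OVERRIDES)}"
        )
    # Deep-copy so callers cannot mutate the shared baseline by accident.
    params = copy.deepcopy(BASE_GENERATOR_PARAMS)
    params.update(MODEL_TYPE_OVERRIDES[model_type])
    return params
```

Keeping the overrides in one table means adding another model type only needs a new dictionary entry.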

csukuangfj · 2024-03-11T04:46:41Z

egs/ljspeech/TTS/vits/export-onnx.py

-    quantize_dynamic(
-        model_input=model_filename,
-        model_output=model_filename_int8,
-        weight_type=QuantType.QUInt8,


The model quantized with QuantType.QUInt8 weights is very slow at run time, so we removed the quantization step.

csukuangfj · 2024-03-12T08:37:23Z

The following wave is generated with a model trained with --model-type low.

low.mp4

The following wave is generated with a model trained with --model-type medium.

medium.mp4

The following wave is generated with a model trained with --model-type high.

high.mp4
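For readers who want to see how a flag like `--model-type` might be declared, here is a minimal sketch using `argparse`. The helper names, the use of `choices`, and the default of `high` are assumptions for illustration; the training script in this PR may define the option differently.

```python
import argparse
from typing import List, Optional


def get_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of a --model-type option")
    parser.add_argument(
        "--model-type",
        type=str,
        default="high",  # assumed default, not confirmed by the PR
        choices=["low", "medium", "high"],
        help="low (quality) runs faster; high (quality) runs slower.",
    )
    return parser


def parse_model_type(argv: Optional[List[str]] = None) -> str:
    """Parse argv and return the selected model type."""
    return get_parser().parse_args(argv).model_type
```

With `choices`, an unsupported value is rejected at startup instead of failing later during model construction.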

Support different tts model types.

b33d382


csukuangfj commented Mar 11, 2024


csukuangfj added 6 commits March 12, 2024 16:39

Update README

9681263

Update README to add a link to a medium model

71e77e0

typo fixes

7f9cbf1

minor fixes

a92c6df

Update doc

4abfdc7

minor fixes

4c41443

csukuangfj merged commit 81f518e into k2-fsa:master Mar 12, 2024
64 checks passed

csukuangfj deleted the tts-fix branch March 12, 2024 14:29
