Issues with Tokenizer in TTS Es Multispeaker FastPitch HiFiGAN Model - Problems with Date Tokenization #8075 #8076
-
Hello, I am currently working on integrating a Latin American Spanish text-to-speech (TTS) model into NVIDIA Riva, specifically the tts_es_multispeaker_fastpitchhifigan model from NVIDIA NGC. After converting the model to the .riva format with nemo2riva, I downloaded the tokenizer and verbalizer from Inverse Normalization ES-US. I deployed it in Riva with:

docker run --init -it --rm --gpus '"device=0"' -v $(pwd):/data -v riva-model-repo:/data-volumen -e "MODEL_DEPLOY_KEY=tlt_encode" --name riva-service-maker nvcr.io/nvidia/riva/riva-speech:2.13.0-servicemaker
riva-build speech_synthesis tts_es_hifigan_ft_fastpitch_multispeaker.rmir:tlt_encode tts_es_fastpitch_multispeaker.riva:tlt_encode tts_es_hifigan_ft_fastpitch_multispeaker.riva:tlt_encode --voice_name Latin-American-Spanish --wfst_tokenizer_model=tokenize_and_classify.far --wfst_verbalizer_model=verbalize.far --sample_rate 44100 --language_code es-US --num_speakers=174 --phone_set=ipa --subvoices 0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7,8:8,9:9,10:10,11:11,12:12,13:13,14:14,15:15,16:16,17:17,18:18,19:19,20:20,21:21,22:22,23:23,24:24,25:25,26:26,27:27,28:28,29:29,30:30,31:31,32:32,33:33,34:34,35:35,36:36,37:37,38:38,39:39,40:40,41:41,42:42,43:43,44:44,45:45,46:46,47:47,48:48,49:49,50:50,51:51,52:52,53:53,54:54,55:55,56:56,57:57,58:58,59:59,60:60,61:61,62:62,63:63,64:64,65:65,66:66,67:67,68:68,69:69,70:70,71:71,72:72,73:73,74:74,75:75,76:76,77:77,78:78,79:79,80:80,81:81,82:82,83:83,84:84,85:85,86:86,87:87,88:88,89:89,90:90,91:91,92:92,93:93,94:94,95:95,96:96,97:97,98:98,99:99,100:100,101:101,102:102,103:103,104:104,105:105,106:106,107:107,108:108,109:109,110:110,111:111,112:112,113:113,114:114,115:115,116:116,117:117,118:118,119:119,120:120,121:121,122:122,123:123,124:124,125:125,126:126,127:127,128:128,129:129,130:130,131:131,132:132,133:133,134:134,135:135,136:136,137:137,138:138,139:139,140:140,141:141,142:142,143:143,144:144,145:145,146:146,147:147,148:148,149:149,150:150,151:151,152:152,153:153,154:154,155:155,156:156,157:157,158:158,159:159,160:160,161:161,162:162,163:163,164:164,165:165,166:166,167:167,168:168,169:169,170:170,171:171,172:172,173:173
riva-deploy -f tts_es_hifigan_ft_fastpitch_multispeaker.rmir:tlt_encode /data/models
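For reference, the downloaded tokenize_and_classify.far can also be exercised offline with pynini before running riva-build. This is only a sketch; the FAR key name ("tokenize_and_classify") and the use of top_rewrite on a plain sentence are assumptions about how the grammar was exported:

```python
# Offline sanity check of the downloaded FAR grammar. The FAR key name
# "tokenize_and_classify" and the top_rewrite call are assumptions about
# how the grammar was exported.
import pynini
from pynini.lib import rewrite

far = pynini.Far("tokenize_and_classify.far", mode="r")
if not far.find("tokenize_and_classify"):
    far.reset()  # fall back to the first FST stored in the archive
fst = far.get_fst()

try:
    # A date-bearing sentence; if dates are covered, the output should be
    # a token string containing something like: tokens { date { ... } }
    print(rewrite.top_rewrite("la reunión es el 12 de mayo de 2023", fst))
except rewrite.Error:
    print("the grammar did not produce a rewrite for this input")
```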
However, I have encountered a significant issue: the tokenizer struggles with text that requires normalization, especially dates. Instead of processing and pronouncing these elements correctly, it either skips them or handles them incorrectly. This is in stark contrast to the default models in riva_quickstart_v2.13.0, where the es-ES model handles such cases correctly. According to the NVIDIA NGC model card, the text normalizers were built with NeMo, but they do not appear to work correctly for dates, unlike the default es-ES model. I would like to ask why date normalization fails with this model and how it can be fixed; a sketch of the request I use to reproduce the problem follows below.
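This is a minimal sketch using the Riva Python client; the server address (localhost:50051) and the WAV output handling are assumptions on my side, and the voice name matches the --voice_name passed to riva-build above:

```python
# Minimal sketch to reproduce the date issue, assuming the Riva server
# runs locally on the default port. Only the text below contains a date.
import wave

import riva.client

auth = riva.client.Auth(uri="localhost:50051")  # assumed server address
tts = riva.client.SpeechSynthesisService(auth)

resp = tts.synthesize(
    text="La reunión es el 12 de mayo de 2023.",
    voice_name="Latin-American-Spanish",  # --voice_name from riva-build
    language_code="es-US",
    sample_rate_hz=44100,
)

# Write the returned 16-bit LINEAR_PCM samples to a WAV file for listening.
with wave.open("date_test.wav", "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(44100)
    out.writeframes(resp.audio)
```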
Best regards.
Replies: 1 comment 1 reply
-
My guess is that this is related to a text normalization issue. Please comment in this repo: NVIDIA/NeMo-text-processing#135
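As a quick local check (a sketch, not an official workflow), the Spanish normalizer from nemo_text_processing can be run directly on date strings to see whether the underlying grammars cover them; the lang/input_case arguments follow the public Normalizer API:

```python
# Quick local check of the NeMo Spanish text normalization grammars,
# independent of Riva. Assumes nemo_text_processing is installed.
from nemo_text_processing.text_normalization.normalize import Normalizer

normalizer = Normalizer(input_case="cased", lang="es")

for text in [
    "La reunión es el 12/05/2023.",
    "Nació el 3 de marzo de 1990.",
]:
    print(normalizer.normalize(text, verbose=False))
```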