Citrinet CTC Decoder Alphabet size mismatch. #9554

huks0 · 2024-06-27T10:51:35Z

I trained a Citrinet model and intended to use a CTC decoder to obtain corrected output.

Using the CTC Beam Search Decoder of DeepSpeech, I get the following error:

[ctc_beam_search_decoder.cpp:279] FATAL: "(alphabet.GetSize()+1) == (class_dim)" check failed. Number of output classes in acoustic model does not match number of labels in the alphabet file. Alphabet file must be the same one that was used to train the acoustic model.

I have checked the alphabet and it has a size of 1023, even though I built it with 1024 characters. I built the tokenizer and alphabet with the NeMo script, setting the vocab size to 1024 and using spe with the unigram type. The output shape of the model is 1025, so I believe the mismatch is 1 character. I thought of the blank or unk token, but I am not sure whether that is the cause of the error. Any idea why this happens or how to solve it? I appreciate your help!
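To make the arithmetic concrete, here is a minimal Python sketch of the check quoted in the error message, applied to the numbers above. The helper names `expected_class_dim` and `missing_labels` are hypothetical and not part of NeMo or DeepSpeech, the extra `+1` class is presumably the CTC blank, and the toy vocabulary is made-up data that only illustrates how a dropped entry could be located. It is not a diagnosis of this model.

```python
from typing import List


def expected_class_dim(alphabet_size: int) -> int:
    """Classes the decoder expects: one per alphabet label plus one extra
    (assumed here to be the CTC blank), mirroring alphabet.GetSize() + 1."""
    return alphabet_size + 1


def missing_labels(vocab: List[str], alphabet: List[str]) -> List[str]:
    """Return vocabulary entries that are absent from the alphabet, in vocab order."""
    present = set(alphabet)
    return [tok for tok in vocab if tok not in present]


# Numbers reported in this issue.
alphabet_size = 1023
class_dim = 1025

# Difference between the model's output classes and what the decoder expects.
gap = class_dim - expected_class_dim(alphabet_size)
print(f"decoder expects {expected_class_dim(alphabet_size)} classes, "
      f"model has {class_dim} (gap: {gap})")

# Toy data only (NOT the real tokenizer): one token is lost when the
# alphabet file is written, so the alphabet ends up one entry short.
toy_vocab = ["<unk>", "▁", "a", "b"]
toy_alphabet = ["▁", "a", "b"]
print(missing_labels(toy_vocab, toy_alphabet))
```

Running the same kind of comparison between the tokenizer's real vocabulary and the lines of the alphabet file should show which single entry, if any, was dropped.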

huks0 added the bug Something isn't working label Jun 27, 2024

elliottnv assigned titu1994 Jul 3, 2024
