Moonshine wrong output #1544

Open
thewh1teagle opened this issue Nov 15, 2024 · 2 comments

@thewh1teagle
Contributor

thewh1teagle commented Nov 15, 2024

https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.10.30/sherpa-onnx-v1.10.30-osx-universal2-shared.tar.bz2
tar xf sherpa-onnx-v1.10.30-osx-universal2-shared.tar.bz2
./sherpa-onnx-v1.10.30-osx-universal2-shared/bin/sherpa-onnx-offline \
    --moonshine-preprocessor="sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx" \
    --moonshine-encoder="sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx" \
    --moonshine-uncached-decoder="sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx" \
    --moonshine-cached-decoder="sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx" \
    --tokens="sherpa-onnx-moonshine-tiny-en-int8/tokens.txt" \
    --num-threads=1 \
    audio.wav # https://www.youtube.com/watch?v=hjkIIWt4bUI English

Output:

./sherpa-onnx-v1.10.30-osx-universal2-shared/bin/sherpa-onnx-offline \
    --moonshine-preprocessor="sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx" \
    --moonshine-encoder="sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx" \
    --moonshine-uncached-decoder="sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx" \
    --moonshine-cached-decoder="sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx" \
    --tokens="sherpa-onnx-moonshine-tiny-en-int8/tokens.txt" \
    --num-threads=1 \
    audio.wav
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:375 ./sherpa-onnx-v1.10.30-osx-universal2-shared/bin/sherpa-onnx-offline --moonshine-preprocessor=sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx --moonshine-encoder=sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx --moonshine-uncached-decoder=sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx --moonshine-cached-decoder=sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx --tokens=sherpa-onnx-moonshine-tiny-en-int8/tokens.txt --num-threads=1 audio.wav 

OfflineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80, low_freq=20, high_freq=-400, dither=0), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="", decoder="", language="", task="transcribe", tail_paddings=-1), tdnn=OfflineTdnnModelConfig(model=""), zipformer_ctc=OfflineZipformerCtcModelConfig(model=""), wenet_ctc=OfflineWenetCtcModelConfig(model=""), sense_voice=OfflineSenseVoiceModelConfig(model="", language="auto", use_itn=False), moonshine=OfflineMoonshineModelConfig(preprocessor="sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx", encoder="sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx", uncached_decoder="sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx", cached_decoder="sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx"), telespeech_ctc="", tokens="sherpa-onnx-moonshine-tiny-en-int8/tokens.txt", num_threads=1, debug=False, provider="cpu", model_type="", modeling_unit="cjkchar", bpe_vocab=""), lm_config=OfflineLMConfig(model="", scale=0.5), ctc_fst_decoder_config=OfflineCtcFstDecoderConfig(graph="", max_active=3000), decoding_method="greedy_search", max_active_paths=4, hotwords_file="", hotwords_score=1.5, blank_penalty=0, rule_fsts="", rule_fars="")
Creating recognizer ...
Started
Done!

audio.wav
{"lang": "", "emotion": "", "event": "", "text": " and the other parts of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe are the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the 
universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the universe, and the elements of the universe, and the elements of the universe, and the elements of the universe, and the universe, and the elements of the elements of the elements of the universe, and the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements of the elements", "timestamps": [], "tokens":[" and", " the", " other", " parts", " of", " the", " universe", " 
are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", " are", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", 
" the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " 
universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " universe", ",", " and", " the", " universe", ",", " and", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " universe", ",", " and", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " 
elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements", " of", " the", " elements"], "words": []}
----
num threads: 1
decoding method: greedy_search
Elapsed seconds: 28.463 s
Real time factor (RTF): 28.463 / 127.516 = 0.223

Correct transcript: https://youtubetranscript.com/?v=hjkIIWt4bUI
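
The looping in the output above can be quantified with a rough check. The following is a minimal Python sketch, not part of sherpa-onnx: `repeated_ngram_ratio` is a hypothetical helper, and the n-gram size of 4 is an arbitrary assumption. It takes a token list such as the `tokens` field of the JSON result and reports what fraction of its n-grams duplicate an earlier one.

```python
def repeated_ngram_ratio(tokens, n=4):
    """Return the fraction of n-grams that duplicate an earlier n-gram.

    Hypothetical helper for illustration; n=4 is an assumed default.
    """
    # Build every window of n consecutive tokens.
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Fewer than n tokens: nothing to compare.
        return 0.0
    # Each distinct n-gram counts once; everything else is a repeat.
    return 1.0 - len(set(ngrams)) / len(ngrams)


if __name__ == "__main__":
    looping = [" and", " the", " elements", " of", " the", " universe", ","] * 20
    print(f"repeated 4-gram ratio: {repeated_ngram_ratio(looping):.3f}")
```

A ratio close to 1.0 would flag a degenerate loop like the one in this output. A normal transcript should score much lower, though the exact threshold would need tuning.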

@csukuangfj
Collaborator

Could you upload the test wav?

Also, is there a reason to send such a long wav directly to a non-streaming model?

@thewh1teagle
Contributor Author

> Could you upload the test wav?

https://github.com/thewh1teagle/sherpa-rs/releases/download/v0.5.1/moonshaine-test.wav

> Also, is there a reason to send such a long wav directly to a non-streaming model?

I'm just running simple tests, and the file is only about 2 minutes long. I remember you saying that the model doesn't have an inference size limitation like Whisper does?
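
One way to test whether input length is the cause is to cut the file into shorter pieces and decode each piece separately. The following is a minimal Python sketch using only the standard-library `wave` module. It assumes an uncompressed PCM WAV file. `split_wav` and the 30-second default are assumptions made for illustration, not a documented limit of Moonshine or sherpa-onnx, and the thread does not confirm that length is the problem.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, chunk_seconds=30.0):
    """Split a PCM WAV file into consecutive chunks of at most chunk_seconds.

    Hypothetical helper; returns the chunk paths in playback order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with wave.open(str(src), "rb") as reader:
        frames_per_chunk = int(reader.getframerate() * chunk_seconds)
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds is too small for this sample rate")
        index = 0
        while True:
            # readframes returns fewer frames (or b"") near the end of the file.
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break
            chunk_path = out_dir / f"chunk_{index:03d}.wav"
            with wave.open(str(chunk_path), "wb") as writer:
                # Copy the audio format so every chunk matches the source.
                writer.setnchannels(reader.getnchannels())
                writer.setsampwidth(reader.getsampwidth())
                writer.setframerate(reader.getframerate())
                writer.writeframes(frames)
            paths.append(chunk_path)
            index += 1
    return paths
```

Each chunk could then be passed to the same `sherpa-onnx-offline` command in place of `audio.wav`. Fixed-length cuts can split a word in half, so a voice-activity detector would likely pick better boundaries; this sketch only isolates the effect of input length.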
