NeMo ASR Demo for Transcription #5275
titu1994 started this conversation in Show and tell
- Are there any specific guidelines to follow if one wants to publish a model/checkpoint that will then be automatically picked up?
- By the way, note that the integration of the demo inside the ASR docs has one small issue: it never requests permission to access the microphone, so trying the "record from microphone" path will always error out.
As the NeMo ASR Collection grows and we support more languages, it can get a bit complicated to find checkpoints for certain languages, especially now that the community has started to contribute checkpoints in various languages.
So we present a new Hugging Face Space that allows inference on all NeMo checkpoints uploaded to HF!
Link - https://huggingface.co/spaces/smajumdar/nemo_multilingual_language_id
In it, you can either upload a file or record a piece of audio with your microphone, then select a language and a model of your choice in that language to perform transcription.
We will also be adding this demo inside the ASR docs page for ease of use.
We encourage users to submit their own checkpoints to Hugging Face so that others in the community may transcribe speech in as many languages as possible!
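For reference, the same kind of transcription the Space performs can be run locally with the NeMo ASR API. The snippet below is a minimal sketch, assuming `nemo_toolkit[asr]` is installed; the checkpoint name `stt_en_conformer_ctc_small` and the path `audio.wav` are placeholders, and the exact `transcribe()` signature and return format may vary slightly between NeMo versions.

```python
# Minimal sketch of local transcription with a pretrained NeMo ASR checkpoint.
# Assumptions: nemo_toolkit[asr] is installed; "stt_en_conformer_ctc_small" is a
# placeholder checkpoint name (swap in any checkpoint for your language), and
# "audio.wav" is a placeholder path to a 16 kHz mono WAV file.
import nemo.collections.asr as nemo_asr

# Download and restore a pretrained checkpoint by name.
model = nemo_asr.models.ASRModel.from_pretrained(model_name="stt_en_conformer_ctc_small")

# Transcribe one or more audio files; returns a list of transcriptions.
transcriptions = model.transcribe(["audio.wav"])
print(transcriptions[0])
```

Depending on your NeMo version, `from_pretrained()` may also accept a Hugging Face Hub identifier directly, which is how community-uploaded checkpoints can be pulled in the same way the Space does.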