Long-form audio transcription with Citrinets #2172
titu1994
started this conversation in
Show and tell
Hey @titu1994, I really appreciate the notebook! These notebooks are invaluable for those of us focusing on long-form audio. I'm curious what considerations apply when working with Citrinet versus the streaming Conformer script added in v1.2.0, particularly regarding compromises in accuracy, given that Citrinet operates in offline mode while the Conformer script runs in streaming mode.
Long form audio transcription
Long-form audio transcription, in the simplest sense, is when an ASR model is evaluated on audio clips that are significantly longer than the clips it was trained on. For example, most models are trained on audio clips 15-20 seconds long, but may be evaluated on clips that are around 1 minute long.
There are many variations of long-form audio transcription research: whether the model is used in streaming mode or not, the difference in duration between training and evaluation, whether the model was trained with specific losses to enable long-form audio transcription, and so on.
This notebook demonstrates how to transcribe a relatively long audio file from a podcast in a single forward pass with a Citrinet model. The model operates in offline mode (it is given the entire audio sequence at once) and must transcribe it without streaming inference.
To make things more realistic, we chose a podcast where the discussion revolves around a technical topic, a domain that the model was never trained on. We preprocess the podcast into a format that can easily be compared to the model's transcription. Finally, we also run some checks to find the longest audio segment that can be transcribed in a single forward pass.
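As a rough illustration of the comparison and duration checks described above, here is a minimal sketch in plain Python. It is not the notebook's code: the normalization rules, the function names (`normalize`, `word_error_rate`, `longest_transcribable_seconds`) and the `try_transcribe` callback are all hypothetical, and word error rate is assumed as the comparison metric.

```python
import re
from typing import Callable, Iterable, List, Optional


def normalize(text: str) -> List[str]:
    """Lowercase and keep only letters and apostrophes (assumed rules).

    Digits and punctuation are replaced by spaces, so numbers are dropped.
    """
    text = re.sub(r"[^a-z' ]+", " ", text.lower())
    return text.split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty after normalization")
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def longest_transcribable_seconds(
    try_transcribe: Callable[[int], bool],
    durations: Iterable[int],
) -> Optional[int]:
    """Try durations in increasing order and return the last that succeeded.

    `try_transcribe` is a hypothetical callback that returns False when a
    segment of the given length cannot be processed (for example, when the
    device runs out of memory).
    """
    longest = None
    for seconds in sorted(durations):
        if not try_transcribe(seconds):
            break
        longest = seconds
    return longest
```

In practice, `try_transcribe` would wrap a real model call on a segment of the requested length and catch the failure of interest; the sketch leaves that part out on purpose.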
Colab link to Notebook