Speech to text on short audio file #169

Answered by guillaumekln
rumbu13 asked this question in Q&A

It’s possible your audio triggers the "temperature fallback", which makes the transcription much slower. That fallback is how Whisper tries to recover from bad transcriptions by default.

Here are things you can try:

  • Looks like your CPU has 6 cores, so add cpu_threads=6 when loading the model
  • Use beam_size=1
  • Disable the temperature fallback with temperature=0 (might impact the transcription quality)
  • If you don’t care about timestamps, disable them with without_timestamps=True
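
Putting the four suggestions together, here is a minimal sketch of what the call could look like with faster-whisper. The model size "small", the path "audio.wav" and the helper name transcribe_fast are placeholders made up for illustration, not taken from the question; adjust cpu_threads to your own core count.

```python
# Decoding options from the list above, chosen for speed over robustness.
FAST_TRANSCRIBE_OPTIONS = {
    "beam_size": 1,              # greedy decoding instead of beam search
    "temperature": 0,            # single temperature, so no fallback retries
    "without_timestamps": True,  # skip timestamp token prediction
}


def transcribe_fast(audio_path, model_size="small", cpu_threads=6):
    """Transcribe a short file on CPU (hypothetical helper, for illustration)."""
    # Imported here so the options above can be inspected without the package.
    from faster_whisper import WhisperModel

    # cpu_threads is set when loading the model, not per transcribe() call.
    model = WhisperModel(model_size, device="cpu", cpu_threads=cpu_threads)

    # transcribe() returns a lazy generator of segments plus some audio info;
    # the actual decoding happens while iterating over the segments.
    segments, info = model.transcribe(audio_path, **FAST_TRANSCRIBE_OPTIONS)
    return " ".join(segment.text.strip() for segment in segments)


if __name__ == "__main__":
    print(transcribe_fast("audio.wav"))  # placeholder path
```

If quality drops too much with temperature=0, removing that one option brings the default fallback back while keeping the other speedups.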

Answer selected by rumbu13