Whisper on accelerated edge devices #1124
Replies: 4 comments 1 reply
-
This is a really cool addition to the Whisper project @maxbbraun. I will be able to test it out on the new Jetson Orin Nano soon. ~10x in isolation, meaning for every 10 recorded seconds it needs 1 second to process?
-
Thanks! Looking forward to your results. Yes, in isolation I was seeing inference performance of about 1 second per 10 seconds of audio. I say "in isolation" because once I added the (very unoptimized) overhead of running the recording loop, it dropped to about 10 seconds per 10 seconds.
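For anyone wanting to reproduce the two numbers above, here is a minimal sketch of how the inference-only real-time factor can be measured. This is not whisper-edge code; the `openai-whisper` package usage is standard, but the `tiny.en` model choice and the `sample_10s.wav` file name are assumptions for illustration.

```python
# Minimal sketch (not whisper-edge code): measure the inference-only
# real-time factor on a clip of known length.
import time

import whisper

AUDIO_PATH = "sample_10s.wav"  # hypothetical 10-second test clip
AUDIO_SECONDS = 10.0           # known length of the clip
MODEL_NAME = "tiny.en"         # a small model, plausible for a Jetson-class device

model = whisper.load_model(MODEL_NAME)

# Time only the transcription call, i.e. inference "in isolation".
start = time.perf_counter()
result = model.transcribe(AUDIO_PATH, fp16=False)
inference_seconds = time.perf_counter() - start

print(f"Transcribed {AUDIO_SECONDS:.0f}s of audio in {inference_seconds:.1f}s "
      f"({AUDIO_SECONDS / inference_seconds:.1f}x real time)")
print(result["text"].strip())
```

With the numbers reported above (about 1 second per 10 seconds of audio), this would print roughly 10x real time; wrapping the same call in an unoptimized recording loop is what brought it down to about 1x.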
-
Hi @essiebes, sorry for the tag, but this Jetson device looks awesome and I am really curious how your tests with the Jetson Orin Nano and Whisper went. For example, what is the largest model size you managed to fit in its memory?
-
Hi @maxbbraun, I'd like to know: how long does it take to transcribe a 30-minute audio file?
-
Hi all!
I'm sharing whisper-edge, a project to bring Whisper inference to edge devices with ML accelerator hardware. It's mainly meant for real-time transcription from a microphone. It currently works reasonably well on the NVIDIA Jetson Nano after some tweaks to the runtime and without any need to modify the models themselves. There should still be plenty of room for improvement to optimize latency and quality. Some ideas here.
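To make the microphone use case concrete, here is a rough sketch of a chunked record-then-transcribe loop. It is not the whisper-edge implementation; the `sounddevice` dependency, the 10-second chunk length, and the `tiny.en` model are assumptions, and because recording and inference run sequentially it behaves more like the very unoptimized recording loop discussed in the replies than like an optimized pipeline.

```python
# Rough illustration (not whisper-edge itself): record fixed-size chunks
# from the default microphone and transcribe each one with Whisper.
# Chunk length, model size, and the sounddevice dependency are assumptions.
import sounddevice as sd
import whisper

SAMPLE_RATE = 16_000   # Whisper expects 16 kHz mono audio
CHUNK_SECONDS = 10     # transcribe in 10-second windows

model = whisper.load_model("tiny.en")

while True:
    # Blocking capture of one chunk as mono float32.
    audio = sd.rec(int(CHUNK_SECONDS * SAMPLE_RATE),
                   samplerate=SAMPLE_RATE, channels=1, dtype="float32")
    sd.wait()
    # transcribe() accepts a 1-D float32 NumPy array sampled at 16 kHz.
    text = model.transcribe(audio.flatten(), fp16=False)["text"]
    print(text.strip())
```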
I also started exploring Google Coral, which requires additional surgery on the models. Potential next steps are outlined here.
Hope this is useful to some of you!
Max