transcriber

Conversion of audio files to text using Whisper from OpenAI with a simple GUI

This is my first project in Python. It uses Whisper, the brilliant open-source automatic speech recognition model from OpenAI. The GUI is very basic but functional at present. My inspiration was its potential use in clinical settings, as the data and transcription can be processed locally.

Audio files can be dragged and dropped or manually added for selection.

The 'transcribe' button sends the selected audio file to the model. The result is returned as text to the textbox and is automatically added to the clipboard for use in a word processor/email.
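As a rough sketch of what the 'transcribe' step might look like in code (not necessarily how this project implements it), the helper below loads a Whisper model, transcribes one file and hands the text to an optional clipboard callback. The names `transcribe_file`, `load_model` and `copy_to_clipboard` are hypothetical and only used for illustration; `whisper.load_model` and `model.transcribe` come from the `openai-whisper` package.

```python
from typing import Callable, Optional


def transcribe_file(
    audio_path: str,
    model_name: str = "base",
    load_model: Optional[Callable] = None,
    copy_to_clipboard: Optional[Callable[[str], None]] = None,
) -> str:
    """Transcribe one audio file and optionally pass the text to a clipboard callback."""
    if load_model is None:
        # Imported lazily so the helper can be tried out without Whisper installed.
        import whisper  # pip install openai-whisper

        load_model = whisper.load_model

    model = load_model(model_name)
    # Whisper's transcribe() returns a dict; the full transcript is under "text".
    result = model.transcribe(audio_path)
    text = result["text"].strip()

    # Hypothetical hook: the GUI decides how the clipboard is actually filled.
    if copy_to_clipboard is not None:
        copy_to_clipboard(text)
    return text
```

If the GUI were built with Tkinter (an assumption, the toolkit is not stated here), the callback could clear the clipboard with `root.clipboard_clear()` and then call `root.clipboard_append(text)`.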

The model can be changed between any of the available options. Smaller models run faster but are less accurate.

| Size | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |
|---|---|---|---|---|---|
| tiny | 39 M | `tiny.en` | `tiny` | ~1 GB | ~32x |
| base | 74 M | `base.en` | `base` | ~1 GB | ~16x |
| small | 244 M | `small.en` | `small` | ~2 GB | ~6x |
| medium | 769 M | `medium.en` | `medium` | ~5 GB | ~2x |
| large | 1550 M | N/A | `large` | ~10 GB | 1x |
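The table can be turned into a small lookup for choosing a model name. The sketch below is only an illustration; `whisper_model_name` is a hypothetical helper, not part of Whisper or of this project. It assumes that asking for an English-only `large` model should fall back to the multilingual `large`, since the table lists no `large.en`.

```python
# Sizes that have an English-only variant, according to the table above.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}
ALL_SIZES = ENGLISH_ONLY_SIZES | {"large"}


def whisper_model_name(size: str, english_only: bool = False) -> str:
    """Return the model name for a size, e.g. 'small.en' or 'large'."""
    if size not in ALL_SIZES:
        raise ValueError(f"Unknown model size: {size!r}")
    if english_only and size in ENGLISH_ONLY_SIZES:
        return f"{size}.en"
    # 'large' has no English-only variant, so the multilingual name is used.
    return size
```

The returned string is the kind of name that `whisper.load_model` accepts, so a model dropdown in the GUI could pass its selection through a helper like this.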

I will continue to make improvements and changes. I'm grateful for any feedback and advice.
