WhisPeriment

Experimenting with the openAI Whisper model for speach to text (STT) the model is working on CPU with FP32 or GPU if available with FP16 to reduce the model memory need and be able to run it on a graphics card with only 4G of VRAM. I've added instruction to run it with CUDA or ROCm however the ROCm instructions have not been tested as I do not have the adequate hardware

This is a simple Jupyter notebook to experiment with the whisper model from OpenAi if you follow the instruction in a detailed maner it should work weather on Windows or linux.

The notebook allow you to:

Generate a adequate cuda environement
Load the required python packadges
Set:
- The recording device for themicrophone to be used
- The recording time
- The speach language
- The model task "transcribe" or "translate"
- The model to be loaded
Use a pre-recorded file or Record an audio file
Allows to play it
Use whisper to transcribe or translate the speach content of the audio file

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
LICENSE		LICENSE
README.md		README.md
WhisPeriment.ipynb		WhisPeriment.ipynb
test_record_english.wav		test_record_english.wav
test_record_french.wav		test_record_french.wav

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

WhisPeriment

About

Releases

Packages

Languages

License

kolergy/WhisPeriment

Folders and files

Latest commit

History

Repository files navigation

WhisPeriment

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages