kolergy/WhisPeriment


WhisPeriment

Experimenting with the OpenAI Whisper model for speech to text (STT). The model runs on the CPU with FP32 or, if a GPU is available, with FP16 to reduce the model's memory footprint so that it can run on a graphics card with only 4 GB of VRAM. I have added instructions to run it with CUDA or ROCm; however, the ROCm instructions have not been tested, as I do not have the required hardware.
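
The CPU/FP32 versus GPU/FP16 choice described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper names `cuda_available` and `choose_runtime` are hypothetical, and the sketch assumes PyTorch is the backend (ROCm builds of PyTorch also report their GPUs through `torch.cuda`).

```python
import importlib.util


def cuda_available() -> bool:
    """Return True if PyTorch is installed and reports a usable GPU."""
    if importlib.util.find_spec("torch") is None:
        return False  # no PyTorch installed, so fall back to the CPU path
    import torch
    return torch.cuda.is_available()


def choose_runtime(gpu: bool) -> dict:
    """Map GPU availability to a device name and the fp16 decoding flag.

    Hypothetical helper: FP16 is only enabled on the GPU, FP32 is used on CPU.
    """
    return {"device": "cuda" if gpu else "cpu", "fp16": gpu}


runtime = choose_runtime(cuda_available())
print(runtime)
```

The resulting values would typically be passed on as `whisper.load_model(name, device=runtime["device"])` and `model.transcribe(path, fp16=runtime["fp16"])`.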

This is a simple Jupyter notebook for experimenting with the Whisper model from OpenAI. If you follow the instructions carefully, it should work on either Windows or Linux.

The notebook allows you to:

  • Generate a suitable CUDA environment

  • Load the required Python packages

  • Set:

    • The recording device for the microphone to be used
    • The recording time
    • The speech language
    • The model task: "transcribe" or "translate"
    • The model to be loaded
  • Use a pre-recorded file or record an audio file

  • Play the audio file

  • Use Whisper to transcribe or translate the speech content of the audio file
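
The settings and the record-then-transcribe flow listed above could look roughly like the sketch below. It is an illustration under stated assumptions rather than the notebook's own code: the `RecordingSettings` class, its field names and defaults, and the helpers `record_to_wav` and `run_whisper` are all hypothetical, and recording is assumed to use the `sounddevice` and `scipy` packages, which may differ from what the notebook installs.

```python
from dataclasses import dataclass
from typing import Optional

VALID_TASKS = ("transcribe", "translate")


@dataclass
class RecordingSettings:
    """Hypothetical container for the options the notebook lets you set."""
    mic_device: Optional[int] = None  # None = default input device
    seconds: float = 5.0              # recording time
    language: str = "en"              # speech language
    task: str = "transcribe"          # "translate" produces English text
    model_name: str = "base"          # Whisper model to load
    sample_rate: int = 16000          # assumed recording sample rate in Hz

    def __post_init__(self):
        if self.task not in VALID_TASKS:
            raise ValueError(f"task must be one of {VALID_TASKS}")
        if self.seconds <= 0:
            raise ValueError("seconds must be positive")

    def frame_count(self) -> int:
        """Number of audio frames to capture for the requested duration."""
        return int(self.seconds * self.sample_rate)


def record_to_wav(settings: RecordingSettings, path: str) -> str:
    """Record mono audio from the microphone and save it as a WAV file."""
    import sounddevice as sd          # assumed audio backend
    from scipy.io import wavfile
    audio = sd.rec(settings.frame_count(), samplerate=settings.sample_rate,
                   channels=1, dtype="int16", device=settings.mic_device)
    sd.wait()                         # block until the recording is finished
    wavfile.write(path, settings.sample_rate, audio)
    return path


def run_whisper(settings: RecordingSettings, path: str, device: str = "cpu") -> str:
    """Transcribe or translate an audio file and return the resulting text."""
    import whisper
    model = whisper.load_model(settings.model_name, device=device)
    result = model.transcribe(path, language=settings.language,
                              task=settings.task, fp16=(device != "cpu"))
    return result["text"]
```

A typical run would build the settings once, for example `settings = RecordingSettings(seconds=10, language="fr", task="translate")`, then call `run_whisper(settings, record_to_wav(settings, "sample.wav"))`. To use a pre-recorded file instead, skip `record_to_wav` and pass the file's path directly.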
