VoiceToJapanese v1.1.6
Installation instructions:
Download the zip file and run UI.exe inside it.
VoiceToJapanese v1.1.6 change notes:
- Added subtitle generation with various options. You can change the audio source for the subtitler through a dropdown menu.
- Added options to use cloud AI providers (DeepL and VOICEVOX). You can provide your own DeepL API key to use DeepL translation in the cloud. VOICEVOX can also run in the cloud, with or without an API key. Without an API key, cloud VOICEVOX is slower, but it can still be faster than running locally depending on your hardware. You can get a free VOICEVOX API key here: https://su-shiki.com/api/
- Whisper cannot handle concurrent requests. This is solved by separating recording and transcription into two threads. The recording thread puts recordings into a queue. The transcription thread continuously checks the queue and, whenever it is not empty, removes a recording and transcribes it. Writing the subtitler this way prevents missing any audio while transcribing and ensures that no concurrent requests are made.
- Fixed a voice selection bug where changing the speaker alone did not change the speaker ID.
- Bad Whisper output ('you', 'thank you.', 'thanks for watching.') is now filtered out. Whisper often recognizes background noise as these phrases. The filter is not configurable as of now.
- The translation model is now bundled with the program.
- Added a check for CUDA status. However, it does not work yet because the PyTorch bundled with the program is the CPU version; the GPU version is too big (4GB) for GitHub.
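
For readers curious about the recording/transcription split described in the notes above, here is a minimal sketch of the idea. It is not the program's actual code: `fake_transcribe`, `run_subtitler`, and the string "recordings" are hypothetical stand-ins, and the sketch uses a blocking `Queue.get()` with a sentinel instead of repeatedly checking whether the queue is empty.

```python
import queue
import threading


def fake_transcribe(recording):
    # Hypothetical stand-in for a Whisper call; here it just upper-cases text.
    return recording.upper()


def run_subtitler(recordings, transcribe=fake_transcribe):
    """Record and transcribe on two threads that share one queue."""
    pending = queue.Queue()
    results = []
    done = object()  # sentinel that tells the transcriber to stop

    def recorder():
        # Producer: never waits for transcription, so no audio is skipped.
        for recording in recordings:
            pending.put(recording)
        pending.put(done)

    def transcriber():
        # Single consumer: only one transcription request runs at a time.
        while True:
            recording = pending.get()  # blocks until a recording is available
            if recording is done:
                break
            results.append(transcribe(recording))

    threads = [threading.Thread(target=recorder),
               threading.Thread(target=transcriber)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because only the transcriber thread ever calls `transcribe`, requests are serialized, and the queue preserves the order in which recordings arrived.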
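
The Whisper output filter mentioned above could look roughly like the following. This is only a sketch: the phrase list comes from the change notes, but the names `BAD_PHRASES` and `is_bad_output` are hypothetical, and matching case-insensitively after trimming whitespace is an assumption about how the comparison is done.

```python
# Phrases taken from the change notes; Whisper tends to emit them for noise.
BAD_PHRASES = {"you", "thank you.", "thanks for watching."}


def is_bad_output(text):
    # Assumption: compare the whole transcript, ignoring case and
    # surrounding whitespace, rather than searching for substrings.
    return text.strip().lower() in BAD_PHRASES


def filter_transcripts(transcripts):
    # Keep only transcripts that are not known noise phrases.
    return [text for text in transcripts if not is_bad_output(text)]
```

Matching the whole string means a real sentence such as "Thank you for coming." still passes through.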
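
As for the CUDA check, a CPU-only PyTorch build always reports CUDA as unavailable, no matter what GPU is installed. The sketch below shows one way to tell the cases apart. It is an illustration, not the check shipped in this release; `cuda_status` and its return strings are hypothetical, while `torch.version.cuda` and `torch.cuda.is_available()` are standard PyTorch.

```python
import importlib.util


def cuda_status():
    # Avoid an ImportError when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    # CPU-only builds are compiled without CUDA, so this is None.
    if torch.version.cuda is None:
        return "CPU-only PyTorch build"
    if torch.cuda.is_available():
        return "CUDA is available"
    return "CUDA build, but no usable GPU was found"
```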