Speech Recognition Streaming only transcribing "You" #187

theman23290 · 2023-11-19T07:00:07Z

When speech recognition is streaming, the transcribed output is always "you". Streaming uses Whisper for the transcription. When I select Whisper directly and click the microphone, it works perfectly. But when streaming, only the word "you" shows up in the terminal, even if I don't say anything. I can confirm the microphone is activated when recording the audio. I have used SillyTavern on Windows 11, Debian, and Modded Debian with the same result. Any recommendations on what I can do to resolve this? I am on the latest ffmpeg, running the latest Extras in conda, and have enough horsepower to run the Extras program as intended.

theman23290 · 2023-11-19T14:53:27Z

This issue seems to be related to this issue with Whisper: openai/whisper#679
TLDR: Implement --condition_on_previous_text and VAD, and the issues go away. Any way to implement that fix into this project?
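
For anyone trying this locally, here is a rough sketch of what that combination could look like. It is not the Extras code: `has_speech`, `transcribe_if_speech`, and the `0.01` energy threshold are names and values I made up for illustration, and the energy check is only a crude stand-in for a real VAD. The `condition_on_previous_text` and `no_speech_threshold` keyword arguments are the ones openai-whisper's `transcribe()` accepts.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of samples normalized to [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def has_speech(samples: Sequence[float], threshold: float = 0.01) -> bool:
    """Crude energy gate (hypothetical stand-in for a real VAD)."""
    return rms(samples) >= threshold


def transcribe_if_speech(model, samples: Sequence[float], audio_path: str,
                         threshold: float = 0.01) -> str:
    """Skip Whisper entirely when the recording looks silent.

    `model` is expected to be the object returned by whisper.load_model().
    """
    if not has_speech(samples, threshold):
        return ""  # nothing to transcribe, so no hallucinated "you"
    result = model.transcribe(
        audio_path,
        condition_on_previous_text=False,  # do not feed prior output back in
        no_speech_threshold=0.6,           # passed explicitly for visibility
    )
    return result["text"].strip()
```

Usage would be something like `transcribe_if_speech(whisper.load_model("base"), samples, "stt_test.wav")`, where `samples` is the recorded audio as floats. The threshold would need tuning per microphone.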

Cohee1207 · 2023-11-19T14:54:39Z

That's for @Tony-sama to consider.

Cohee1207 · 2023-11-19T17:20:16Z

Check the recent commit. Is that what you asked for?

theman23290 · 2023-11-21T23:25:53Z

I believe so. The commit still didn't fix the original issue, though. I don't know if this is a Whisper issue or an issue with how Whisper is implemented in this code. Here is the terminal output while a client is connected through the API.

/home/senpai/miniconda/envs/extras/lib/python3.11/site-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Transcripted from audio file (whisper): you
172.18.0.2 - - [19/Nov/2023 21:31:21] "POST /api/speech-recognition/streaming/record-and-transcript HTTP/1.1" 200 -
172.18.0.2 - - [19/Nov/2023 21:31:21] "OPTIONS /api/speech-recognition/streaming/record-and-transcript HTTP/1.1" 200 -
Start recording from: default with samplerate 44100
Transcripted from microphone stream (vosk):
Recorded message saved to stt_test.wav
/home/senpai/miniconda/envs/extras/lib/python3.11/site-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Transcripted from audio file (whisper): you
172.18.0.2 - - [19/Nov/2023 21:31:27] "POST /api/speech-recognition/streaming/record-and-transcript HTTP/1.1" 200 -
172.18.0.2 - - [19/Nov/2023 21:31:27] "OPTIONS /api/speech-recognition/streaming/record-and-transcript HTTP/1.1" 200 -
Start recording from: default with samplerate 44100
Transcripted from microphone stream (vosk):
Recorded message saved to stt_test.wav

It repeats this output until the client disconnects. I don't know where the bug is. From the research I have done, it looks more like an issue with the way Whisper is implemented.
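
One way to narrow this down might be to check whether the saved recording is effectively silent, since a "you" from Whisper even when nothing is said would be consistent with it being handed silence. A minimal sketch using only the standard library follows. The `wav_rms` helper is hypothetical, and it assumes `stt_test.wav` (the file named in the log above) is 16-bit PCM, which I have not verified.

```python
import array
import math
import sys
import wave


def wav_rms(path: str) -> float:
    """Return the RMS level (0.0 to 1.0) of a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()      # WAV data is little-endian
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0


if __name__ == "__main__":
    # Path taken from the log output; adjust if the file is saved elsewhere.
    print(f"RMS level: {wav_rms('stt_test.wav'):.5f}")
```

A value at or very near zero would suggest the recording step is capturing nothing, rather than Whisper mishearing real audio.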

Statford · 2024-01-29T06:25:23Z

Hi, I had the same problem and all I did was leave it for a week, reboot it, and it (for whatever reason) worked perfectly after that. I wish I could be more helpful than that, but I had the same problem with my installation of whisper.
#217

Cohee1207 added a commit that referenced this issue Nov 19, 2023

#187 Set condition_on_previous_text for whisper modules

61efbb9
