what is the best combination for low-latency? #4

Open
sergenti opened this issue Jul 20, 2023 · 1 comment

Comments

@sergenti

What's the best TTS + STT pair?
Is there one with medium-high quality and a short wait between responses?

@lugia19
Owner

lugia19 commented Jul 20, 2023

In terms of TTS, the only real option is ElevenLabs, and it's already about as latency-optimized as it can be.
In terms of STT, I'd say local Whisper (if your GPU is up to the task of running it).

The smaller the model, the faster the voice recognition runs, but the less accurate it is. For English, I've found the medium size works well enough.
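If you want to check the size/speed trade-off on your own hardware, here is a rough sketch for timing different Whisper model sizes. It assumes the `openai-whisper` package is installed; the helper names (`make_whisper_transcriber`, `mean_latency`, `compare_sizes`) and the `sample.wav` path are made up for illustration and are not part of this project.

```python
import time
from typing import Callable, Dict, Iterable


def make_whisper_transcriber(size: str) -> Callable[[str], str]:
    """Load a local Whisper model of the given size and return a transcribe function."""
    # Imported lazily so the timing helpers below work without whisper installed.
    import whisper  # assumption: the openai-whisper package

    model = whisper.load_model(size)  # e.g. "tiny", "base", "small", "medium"

    def transcribe(audio_path: str) -> str:
        # Fixing the language skips auto-detection on every call.
        return model.transcribe(audio_path, language="en")["text"]

    return transcribe


def mean_latency(transcribe: Callable[[str], str], audio_path: str, runs: int = 3) -> float:
    """Average wall-clock seconds per transcription, after one warm-up call."""
    transcribe(audio_path)  # warm-up: the first call is usually slower
    start = time.perf_counter()
    for _ in range(runs):
        transcribe(audio_path)
    return (time.perf_counter() - start) / runs


def compare_sizes(
    sizes: Iterable[str],
    audio_path: str,
    factory: Callable[[str], Callable[[str], str]] = make_whisper_transcriber,
) -> Dict[str, float]:
    """Map each model size to its mean latency on the same audio file."""
    return {size: mean_latency(factory(size), audio_path) for size in sizes}


# Example usage (hypothetical file name):
#   for size, seconds in compare_sizes(["small", "medium"], "sample.wav").items():
#       print(f"{size}: {seconds:.2f}s per transcription")
```

Run it on a short clip that's typical of what you'd actually say, then pick the largest size whose latency you can live with.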

2 participants