In terms of TTS, the only real option is ElevenLabs, but it's already about as latency-optimized as it can be.
In terms of STT, I'd say local Whisper (if your GPU is up to the task of running it).
The smaller the model, the faster the voice recognition runs, but the less accurate it is. For English, I've found the medium size works well enough.
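If you want to measure that size/speed tradeoff on your own hardware, here is a rough sketch. It assumes the open-source `openai-whisper` Python package (plus ffmpeg) is installed; `transcribe_timed` and `sample.wav` are names I made up for illustration, not part of any library.

```python
import time

# Model sizes shipped with the open-source `openai-whisper` package,
# listed smallest (fastest) to largest (most accurate).
WHISPER_SIZES = ["tiny", "base", "small", "medium", "large"]


def transcribe_timed(audio_path, size="medium", language="en"):
    """Transcribe one audio file; return (text, seconds spent transcribing).

    Hypothetical helper for comparing model sizes, not a library function.
    """
    if size not in WHISPER_SIZES:
        raise ValueError(f"unknown model size: {size!r}")

    # Imported lazily so the size check above works without the package.
    # Assumes `pip install openai-whisper` and ffmpeg on PATH.
    import whisper

    model = whisper.load_model(size)  # downloads the weights on first use

    # Time only the transcription, not the model load.
    start = time.perf_counter()
    result = model.transcribe(audio_path, language=language)
    elapsed = time.perf_counter() - start

    return result["text"].strip(), elapsed


# Example usage (placeholder file name):
# for size in ("small", "medium"):
#     text, seconds = transcribe_timed("sample.wav", size)
#     print(f"{size}: {seconds:.1f}s -> {text}")
```

Running the same clip through two sizes should show roughly how much latency you trade for accuracy on your GPU. In a real voice loop you would load the model once and reuse it rather than reloading on every call.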
Best TTS + STT pair?
Is there one that has medium-high quality and a short wait between responses?