
v0.1.109

@ajar98 released this 19 May 23:19
· 337 commits to main since this release

Optimizations:

  • Refactors StreamingConversation into a pipeline of producer-consumer workers: transcription, agent response, and synthesis are now decoupled into their own async processes. Shoutout to @jnak for helping us out with the refactor. Upshots:
    • The LLM call no longer blocks the processing of new transcripts
    • Playing the output audio runs concurrently with both response generation and synthesis, so while each sentence is being played, the next response is already being generated and synthesized. For synthesizers with latencies > 1s, there is no longer a delay between the sentences of a response.
    • Resource management: synthesizers no longer need a dedicated thread, so e.g. a single telephony server can now support twice as many concurrent phone calls
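The pipeline shape described above can be sketched with asyncio queues. This is an illustrative stand-in, not vocode's actual implementation: the worker wiring and the transcribe/respond/synthesize stand-ins are hypothetical, but they show how a slow stage (like the LLM call) stops blocking the stages upstream of it.

```python
import asyncio

async def worker(process, in_q: asyncio.Queue, out_q: asyncio.Queue):
    # Each stage consumes from its input queue and produces to the next,
    # so a slow stage never blocks the stages feeding it.
    while True:
        item = await in_q.get()
        if item is None:            # sentinel: propagate shutdown downstream
            await out_q.put(None)
            break
        await out_q.put(await process(item))

async def main(audio_chunks):
    q_audio_in, q_transcripts, q_responses, q_audio_out = (
        asyncio.Queue() for _ in range(4)
    )

    async def transcribe(chunk):      # stand-in for the transcriber
        return f"transcript({chunk})"

    async def respond(transcript):    # stand-in for the (slow) agent/LLM call
        await asyncio.sleep(0.01)
        return f"response({transcript})"

    async def synthesize(text):       # stand-in for the synthesizer
        return f"audio({text})"

    workers = [
        asyncio.create_task(worker(transcribe, q_audio_in, q_transcripts)),
        asyncio.create_task(worker(respond, q_transcripts, q_responses)),
        asyncio.create_task(worker(synthesize, q_responses, q_audio_out)),
    ]
    for chunk in audio_chunks:
        await q_audio_in.put(chunk)
    await q_audio_in.put(None)

    played = []
    while (item := await q_audio_out.get()) is not None:
        played.append(item)          # playback stage: drain the final queue
    await asyncio.gather(*workers)
    return played

print(asyncio.run(main(["hi", "bye"])))
```

Because each worker runs as its own task, the synthesizer can be working on sentence N+1 while sentence N is still being played, which is where the per-sentence latency win comes from.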

Contribution / Code cleanliness:

  • Simple tests that assert StreamingConversation works across all supported Python versions: run this locally with make test
  • Typechecking with mypy: run this locally with make typecheck

Features:

  • Adds support for the ElevenLabs optimize_streaming_latency parameter
  • Adds the Twilio to and from numbers to the CallConfig in the ConfigManager (h/t @Nikhil-Kulkarni)
  • AssemblyAI buffering (solves vocodedev/vocode-react-sdk#6) (h/t @m-ods)
  • Option to record Twilio calls (h/t @shahafabileah)
  • Adds a mute_during_speech parameter to Transcribers to prevent speaker output from feeding back into the microphone: see note in #16
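The idea behind mute_during_speech, as described in the notes, is to drop microphone input while the bot's own audio is playing so the transcriber never hears it. A minimal sketch of that gating logic follows; the class and method names here are hypothetical, not vocode's API.

```python
class MutableTranscriber:
    """Toy transcriber that discards input while muted (hypothetical names)."""

    def __init__(self):
        self.muted = False
        self.transcribed = []

    def mute(self):
        # Called when output audio starts playing
        self.muted = True

    def unmute(self):
        # Called when output audio finishes
        self.muted = False

    def send_audio(self, chunk):
        # Discard input while muted, so the bot's own speech leaking from
        # the speaker into the microphone is never transcribed
        if not self.muted:
            self.transcribed.append(chunk)

t = MutableTranscriber()
t.send_audio("user: hello")
t.mute()                                    # bot starts speaking
t.send_audio("bot speech leaking into mic") # dropped
t.unmute()                                  # bot finished
t.send_audio("user: goodbye")
print(t.transcribed)
```

The trade-off, noted in the linked issue, is that anything the user says while the bot is speaking is also dropped, so this disables barge-in style interruptions.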