Releases: vocodedev/vocode-core
v0.1.114a2
poetry version prerelease (#680)
v0.1.114a1
poetry version prerelease (#665)
0.1.114a0
Full Changelog: v0.1.113...v0.1.114a0
0.1.113
Super excited to announce a new release after a while - this release is special, in that it marks a change for Vocode. Going forward, we will be working on Vocode Core as our priority, and no longer gating functionality behind our Hosted API - the foundation for the API will be available here in vocode-core. Our team will be building features to benefit the whole community, and it'll all be open source.
Highlights
👥 Conversation Mechanics
- Better endpointing (agnostic of transcribers)
- Better interruption handling
- Guide
🕵️ Agents
- ✨NEW✨ Anthropic-based Agent
- Supports all Claude 3 Models
- OpenAI GPT-4o Support
- Azure OpenAI revamp
💪 Actions
- ✨NEW✨ External Actions - Guide
- Improved Call Transfer
- ✨NEW✨ Wait Actions (IVR Navigation)
- ✨NEW✨ Phrase triggers for actions (instead of function calls) - Guide
🗣️ Synthesizers
- ElevenLabs
- ✨NEW✨ Websocket-based Client
- Updated RESTful client
- ✨NEW✨ PlayHT Synthesizer “v2” with PlayHT On-Prem Support
- Rime Mist support
✍️ Transcribers
- ✨NEW✨ Deepgram built-in endpointing
📞 Telephony
- Twilio
- Stronger interruption handling by clearing audio queues
- Vonage
- Koala Noise Suppression
🎉 DevEx / Miscellaneous
- ✨NEW✨ Loguru for improved logging formatting - Guide
- Some new utilities to make setting up loguru in your projects fast and easy 😉
- Sentry for Metric / Error Collection - Guide
- Clean handling of content filters in ChatGPT agents
- Redis Message Queue for tracking mid-call events across different instances
Thanks so much to the folks who worked on this! @arpagon, @DanteNoguez, @rjheeta, @skirdey, @ajar98, @adnaans, @Kian1354, @srhinos, @VladCuciureanu
Full Changelog: v0.1.111...v0.1.113
0.1.111
🚀 Highlights since 0.1.110:
Action agents
Uses the OpenAI function calls API to take actions during a call: see https://docs.vocode.dev/action-agents for docs!
- We currently support a few actions out of the box - sending an email (via Nylas) and transferring a phone call to another number (h/t @sethgw ). We'd love to see more PRs adding more integrations to make Vocode agents more powerful!
Streaming MP3
The ElevenLabs synthesizer now can stream mp3 chunk by chunk! This will greatly improve the performance of ElevenLabs - but it's currently behind an experimental flag since we're still messing around with it:
ElevenLabsSynthesizerConfig.from_output_device(output_device, ..., experimental_streaming=True)
Other highlights
- Vector Database support: connect your Pinecone and have the bot query your knowledge base to inform its responses
- Support for llama.cpp agents: 6c726e7
- Other integrations: Gladia, Vertex AI
🌆 On the horizon:
- ElevenLabs / Play.ht Input Streaming: https://twitter.com/elevenlabsio/status/1688638033980014592
- More work on sentence splitting: #338
- More releases! We plan to publish the package more often so folks can try out the stuff we experiment with - if we're not sure the version is super stable we'll publish a pre-release and announce it on Discord.
Full Changelog: v0.1.110...v0.1.111
New Contributors
- @zaptrem made their first contribution in #172
- @m-ods made their first contribution in #177
- @yantao0527 made their first contribution in #136
- @reuben made their first contribution in #189
- @AlanLiu96 made their first contribution in #216
- @osilverstein made their first contribution in #242
- @khryniewicz made their first contribution in #247
- @KShah707 made their first contribution in #263
- @Van0SS made their first contribution in #266
- @sethgw made their first contribution in #275
- @ramatronics made their first contribution in #270
- @applebaconsoda123 made their first contribution in #295
- @wwzeng1 made their first contribution in #306
- @divst3r made their first contribution in #325
- @bjquinn made their first contribution in #328
- @arpagon made their first contribution in #335
0.1.110
🚀 Features:
- digits parameter in OutboundCall to send DTMF tones to a phone call before the call is picked up
- Azure OpenAI support for ChatGPTAgent
- Tracing docs: https://docs.vocode.dev/tracing
- Refactors Agents as workers (PR) - now, user implemented agents have full access to the output queue, which means they can send responses into the conversation without being specifically prompted. e.g. "Are you still there?"
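To illustrate the agents-as-workers idea above, here is a minimal sketch in plain asyncio. It is not the vocode-core implementation: the IdleCheckAgent class, its queues, and the idle_seconds parameter are hypothetical names chosen for this example. It only shows how a worker that owns a reference to the output queue can speak without being prompted by a new transcript.

```python
import asyncio


class IdleCheckAgent:
    """Hypothetical agent worker (not a vocode-core class)."""

    def __init__(self, output_queue, idle_seconds):
        self.input_queue = asyncio.Queue()   # transcripts arrive here
        self.output_queue = output_queue     # responses go straight to the conversation
        self.idle_seconds = idle_seconds

    async def run(self):
        while True:
            try:
                transcript = await asyncio.wait_for(
                    self.input_queue.get(), timeout=self.idle_seconds
                )
            except asyncio.TimeoutError:
                # No input for a while: speak up without being prompted.
                await self.output_queue.put("Are you still there?")
                continue
            if transcript is None:  # sentinel used by this sketch to stop the worker
                return
            await self.output_queue.put(f"You said: {transcript}")


async def demo():
    output_queue = asyncio.Queue()
    agent = IdleCheckAgent(output_queue, idle_seconds=0.05)
    task = asyncio.create_task(agent.run())
    await agent.input_queue.put("hello")
    first = await output_queue.get()
    second = await output_queue.get()  # produced after the idle timeout, with no new input
    await agent.input_queue.put(None)
    await task
    return [first, second]
```

Running asyncio.run(demo()) yields one prompted response followed by one unprompted check-in, which is the behavior the refactor makes possible for user-implemented agents.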
🌅 On the horizon:
- Benchmarking app to time various transcribers, agents, and synthesizers
- Support for taking actions in a conversation: see wip PR
v0.1.109
Optimizations:
- Refactors StreamingConversation as a pipeline of consumer-producer workers - now transcription / agent response / synthesis are decoupled into their own async processes. Shoutout to @jnak for helping us out with the refactor. Upshots:
- The LLM call no longer blocks the processing of new transcripts
- Playing the output audio runs concurrently with both generating the responses and synthesizing audio, so while each sentence is being played, the next response is being generated and synthesized - for synthesizers with latencies > 1s, there is no longer a delay between each sentence of a response.
- Resource management: synthesizers no longer need a dedicated thread, so e.g. a single telephony server can now support double the number of concurrent phone calls
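The consumer-producer pipeline described above can be sketched with asyncio queues. This is an illustrative sketch only, not the StreamingConversation code: worker, fake_agent, fake_synthesizer, and run_pipeline are hypothetical names, and the sleeps stand in for real LLM and synthesis latency.

```python
import asyncio


async def worker(inbox, outbox, transform):
    """Consume items from inbox, apply transform, and produce into outbox."""
    while True:
        item = await inbox.get()
        if item is None:  # sentinel: pass shutdown downstream and stop
            await outbox.put(None)
            return
        await outbox.put(await transform(item))


async def fake_agent(transcript):
    await asyncio.sleep(0.01)  # stands in for LLM latency
    return f"response to {transcript}"


async def fake_synthesizer(text):
    await asyncio.sleep(0.01)  # stands in for synthesis latency
    return f"audio<{text}>"


async def run_pipeline(transcripts):
    transcript_q = asyncio.Queue()
    response_q = asyncio.Queue()
    audio_q = asyncio.Queue()
    tasks = [
        asyncio.create_task(worker(transcript_q, response_q, fake_agent)),
        asyncio.create_task(worker(response_q, audio_q, fake_synthesizer)),
    ]
    # Enqueueing transcripts does not wait for the agent to finish earlier ones.
    for transcript in transcripts:
        await transcript_q.put(transcript)
    await transcript_q.put(None)

    played = []
    while True:
        chunk = await audio_q.get()
        if chunk is None:
            break
        played.append(chunk)  # "playback" overlaps with upstream work
    await asyncio.gather(*tasks)
    return played
```

Because each stage only talks to its neighbors through queues, the agent can work on the next transcript while the synthesizer is still handling the previous response, and ordering is preserved end to end.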
Contribution / Code cleanliness:
- Simple tests that assert StreamingConversation works across all supported Python versions: run this locally with make test
- Typechecking with mypy: run this locally with make typecheck
Features:
- ElevenLabs optimize_streaming_latency parameter
- Adds the Twilio to and from numbers to the CallConfig in the ConfigManager (h/t @Nikhil-Kulkarni)
- AssemblyAI buffering (solves vocodedev/vocode-react-sdk#6) (h/t @m-ods)
- Option to record Twilio calls (h/t @shahafabileah)
- Adds mute_during_speech parameter to Transcribers as a solution to speaker feedback into microphone: see note in #16
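As a rough illustration of what muting during speech means, the sketch below drops incoming microphone audio while the bot is speaking so the transcriber never hears the bot's own output. The SketchTranscriber class and its mute, unmute, and send_audio methods are assumptions made for this example and may not match the actual Transcriber interface; only the mute_during_speech parameter name comes from the release notes.

```python
class SketchTranscriber:
    """Hypothetical transcriber used only to illustrate mute_during_speech."""

    def __init__(self, mute_during_speech=False):
        self.mute_during_speech = mute_during_speech
        self.is_muted = False
        self.received = []  # audio chunks that would be sent for transcription

    def mute(self):
        # Called when the bot starts speaking (assumption for this sketch).
        self.is_muted = True

    def unmute(self):
        # Called when the bot stops speaking (assumption for this sketch).
        self.is_muted = False

    def send_audio(self, chunk: bytes):
        # Drop microphone audio while muted so speaker output is not transcribed.
        if self.mute_during_speech and self.is_muted:
            return
        self.received.append(chunk)


transcriber = SketchTranscriber(mute_during_speech=True)
transcriber.send_audio(b"user-1")
transcriber.mute()
transcriber.send_audio(b"bot-echo")  # dropped: bot is speaking
transcriber.unmute()
transcriber.send_audio(b"user-2")
```

With mute_during_speech left at False, the same sequence would keep all three chunks, which is where the feedback problem arises.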