Added Deepgram Synthesizer. #618

Closed
wants to merge 25 commits
Commits
4ed9a2f
Added Deepgram Synthesizer.
ZeeshanLone Jul 6, 2024
6598324
Fix action worker twilio sid capture (#619)
rjheeta Jul 8, 2024
6198f7b
add livekit docs (#621)
Kian1354 Jul 8, 2024
18761bb
phrase trigger matcher returns agent config instead of type (#622)
adnaans Jul 9, 2024
48cba3e
adds twilio dtmf action (#623)
ajar98 Jul 9, 2024
ad1adc8
[Bug #628] correct coding errors in the google synthesiser (#629)
jstahlbaum-fibernetics Jul 12, 2024
4c196e1
[DOW-119] creates AudioPipeline abstraction (#625)
ajar98 Jul 12, 2024
e1cf228
update script (#635)
ajar98 Jul 12, 2024
77d6593
convert logger.error to logger.warning (#636)
ajar98 Jul 15, 2024
32f0cb4
update docs (#639)
ajar98 Jul 15, 2024
3dc1d49
[ESUP-55] adds # and * support and also ability to press multiple but…
ajar98 Jul 16, 2024
6dee7c5
Upgrade cartesia to 1.0.7 and add support for continuations (#646)
sauhardjain Jul 18, 2024
1ac820f
Typo in the word using (#647)
petertimwalker Jul 19, 2024
821fe26
Remove unnecessary quotation marks (#644)
tashbenbetov Jul 19, 2024
a971ebc
Fix the error in the URL (#643)
tashbenbetov Jul 19, 2024
eaafc1b
support additional headers in external actions requester (#661)
ajar98 Jul 24, 2024
d8f8aca
Custom provider errors and add StreamingConversation to transcriber a…
adnaans Jul 29, 2024
11e8d24
Improve Cartesia Synthesizer error handling (#663)
sauhardjain Jul 29, 2024
3bc8f8d
Update agents.mdx (#664)
ajar98 Jul 29, 2024
adf0d87
poetry version prerelease (#665)
ajar98 Jul 30, 2024
2ecc631
fix pinecone lint (#679)
ajar98 Aug 7, 2024
c1148e3
Include cartesia's voice controls on docs + update synthesizer (#674)
sauhardjain Aug 7, 2024
6e6f37a
poetry version prerelease (#680)
ajar98 Aug 7, 2024
153ebf6
Added Deepgram Synthesizer.
ZeeshanLone Jul 6, 2024
aab5b46
Merge branch 'deepgram_synthesizer' of https://github.com/ZeeshanLone…
ZeeshanLone Aug 10, 2024
6 changes: 6 additions & 0 deletions apps/livekit/.env.example
@@ -0,0 +1,6 @@
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
LIVEKIT_WS_URL=your_livekit_ws_url
OPENAI_API_KEY=your_openai_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
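
The example file above lists every credential the LiveKit app expects. As a sanity check before boot, the file can be parsed and validated with the standard library alone; a minimal sketch (the helper names `load_env_file` and `missing_vars` are invented for illustration, not part of vocode-core):

```python
REQUIRED_VARS = [
    "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "LIVEKIT_WS_URL",
    "OPENAI_API_KEY", "DEEPGRAM_API_KEY", "ELEVENLABS_API_KEY",
]

def load_env_file(path=".env"):
    """Minimal .env parser: one KEY=value per line; blank lines,
    '#' comments, and surrounding double quotes are ignored."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"')
    return values

def missing_vars(env):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Failing fast with a clear list of missing keys is friendlier than letting the first API call error out mid-conversation.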
2 changes: 1 addition & 1 deletion docs/agents.mdx
@@ -29,6 +29,6 @@ agent behavior:
- `language` sets the agent language (for more context see [Multilingual Agents](/multilingual))
- `initial_message` controls the agent's first utterance.
- `initial_message_delay` adds a delay to the initial message from when the call begins
- - `ask_if_human_present_on_idle` allows the agent to speak when there is more than 4s of silence on the call
+ - `ask_if_human_present_on_idle` allows the agent to speak when there is more than 15s of silence on the call
- `llm_temperature` controls the behavior of the underlying language model. Values can range from 0 to 1, with higher
values leading to more diverse and creative results. Lower values generate more consistent outputs.
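
The `llm_temperature` setting above is the standard softmax temperature from language-model sampling. A quick stdlib sketch (not Vocode code, just the underlying math) shows why lower values give more consistent outputs: dividing the logits by a small temperature concentrates probability mass on the highest-scoring token.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then softmax. Lower temperatures
    sharpen the distribution; higher ones flatten it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

With logits `[2.0, 1.0, 0.5]`, a temperature of 0.2 puts almost all probability on the first token, while 1.0 spreads it out noticeably.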
2 changes: 1 addition & 1 deletion docs/configuring-number.mdx
@@ -16,7 +16,7 @@ by modifying:

### Voice

- First, let's create a new voice via [ElevenLabs]("https://elevenlabs.io) and grab the voice ID.
+ First, let's create a new voice via [ElevenLabs](https://elevenlabs.io) and grab the voice ID.

```
voice = vocode_client.voices.create_voice(
4 changes: 2 additions & 2 deletions docs/external-actions.mdx
@@ -207,12 +207,12 @@ Vocode expects responses from the user’s API in JSON in the following format:

```python
Response {
- result: Any
+ result: Dict[str, Any]
agent_message: Optional[str] = None
}
```

- - `result` is a payload containing the result of the action on the user’s side, and can be in any format
+ - `result` is a payload containing the result of the action on the user’s side, and can have any schema
- `agent_message` optionally contains a message that will be synthesized into audio and sent back to the phone call (see [Configuring the External Action](/external-actions#configuring-the-external-action) above for more info)
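
For illustration, here is a hypothetical payload conforming to that schema, with a minimal client-side shape check. The field names inside `result` (`booked`, `slot`) are invented for this example and not part of the API; only the outer `result`/`agent_message` structure comes from the docs above.

```python
import json

def parse_action_response(raw: str):
    """Validate the minimal shape described above: a JSON object with a
    dict-valued 'result' and an optional string 'agent_message'."""
    data = json.loads(raw)
    if not isinstance(data.get("result"), dict):
        raise ValueError("'result' must be a JSON object")
    msg = data.get("agent_message")
    if msg is not None and not isinstance(msg, str):
        raise ValueError("'agent_message' must be a string or null")
    return data

# Hypothetical response from a user's API (field names invented)
raw = json.dumps({
    "result": {"booked": True, "slot": "2024-08-01T10:00:00Z"},
    "agent_message": "Your meeting is booked.",
})
```

A check like this on the user's side catches schema drift before the response ever reaches the agent.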

In the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below, the user’s API could return back a JSON response that looks like:
2 changes: 1 addition & 1 deletion docs/hosted-quickstart.mdx
@@ -81,5 +81,5 @@ If you'd prefer to hit our API directly, take a look at our [API Reference](/api

# Hosted Walkthrough

- Once you have Vocode installed, we suggest going through the [Hosted Walkthrough](/getting-number) which will
+ Once you have Vocode installed, we suggest going through the [Hosted Walkthrough](/walkthrough_intro) which will
show you how to start interacting with the API.
Binary file added docs/images/livekit_keys.png
1 change: 1 addition & 0 deletions docs/mint.json
@@ -65,6 +65,7 @@
"open-source/python-quickstart",
"open-source/react-quickstart",
"open-source/telephony",
"open-source/livekit-webrtc",
"open-source/turn-based-conversation"
]
},
45 changes: 45 additions & 0 deletions docs/open-source/livekit-webrtc.mdx
@@ -0,0 +1,45 @@
---
title: "Using WebRTC with LiveKit"
description: "Deploy your Vocode Agents using WebRTC"
---

# Overview

[WebRTC](https://webrtc.org/) is an alternative to websockets for real-time P2P communication. Vocode Agents are compatible with both WebRTC and websockets, enabling developers to pick
the stack best suited for their application.

To connect Vocode agents to WebRTC, Vocode uses [LiveKit](https://livekit.io/), an open source platform for building on WebRTC. For background on how LiveKit
works, see their [documentation](https://docs.livekit.io/home/get-started/intro-to-livekit/).

In this guide, we'll be walking through how to connect a Vocode Agent to the [LiveKit Agents Playground](https://agents-playground.livekit.io/).

# Walkthrough: hooking up a Vocode Agent to a LiveKit Room

## Setting up your LiveKit Server

First, you'll want to set up a LiveKit Server for your Agent. For simplicity, we are using LiveKit's hosted offering, but it can also be self-hosted, since LiveKit is open source!

In our LiveKit dashboard, we first generate our websocket URL, API key, and API secret.

![Setup](/images/livekit_keys.png)

## Deploying your Vocode agent to a LiveKit Room

Once you have your LiveKit Server credentials, you can hook them up to Vocode via the `LiveKitConversation` abstraction. Using the starter code in
[vocode-core/apps/livekit/app.py](https://github.com/vocodedev/vocode-core/blob/main/apps/livekit/app.py), you can quickly deploy a Vocode Agent that accepts
new job requests.

Fill in your credentials in `.env`:

```bash
LIVEKIT_SERVER_URL=wss://your-livekit-ws-url.livekit.cloud
LIVEKIT_API_KEY="KEY"
LIVEKIT_API_SECRET="SECRET"
```

Then run `poetry run python app.py dev`.

Now you can connect to the [Agents Playground](https://agents-playground.livekit.io/) to interact with your agent. With LiveKit, you can connect Vocode
agents to any web application using LiveKit's [React Components](https://docs.livekit.io/reference/components/react/) library.
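
Under the hood, the API key and secret from your dashboard are used to mint LiveKit access tokens: short-lived HS256 JWTs whose issuer is the API key and whose `video` claim grants room permissions. A rough stdlib-only sketch of that token shape follows, assuming the layout described in LiveKit's authentication docs; in real code, prefer LiveKit's server SDKs, which handle grants and encoding for you.

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def livekit_access_token(api_key, api_secret, identity, room, ttl=3600):
    """Sketch of a LiveKit access token: an HS256-signed JWT with the
    API key as issuer and a 'video' grant allowing the identity to
    join the given room (assumed claim layout; see LiveKit's docs)."""
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    claims = {
        "iss": api_key,
        "sub": identity,
        "nbf": now,
        "exp": now + ttl,
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    sig = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)
```

The Agents Playground does this signing for you; a sketch like this is mainly useful for understanding what the API secret protects.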
48 changes: 40 additions & 8 deletions docs/open-source/using-synthesizers.mdx
@@ -13,13 +13,14 @@ Vocode currently supports the following synthesizers:

1. Azure (Microsoft)
2. Google
- 3. Eleven Labs
- 4. Rime
- 5. Play.ht
- 6. GTTS (Google Text-to-Speech)
- 7. Stream Elements
- 8. Bark
- 9. Amazon Polly
+ 3. Cartesia
+ 4. Eleven Labs
+ 5. Rime
+ 6. Play.ht
+ 7. GTTS (Google Text-to-Speech)
+ 8. Stream Elements
+ 9. Bark
+ 10. Amazon Polly

These synthesizers are defined using their respective configuration classes, which are subclasses of the `SynthesizerConfig` class.

@@ -83,7 +84,38 @@ synthesizer_config=PlayHtSynthesizerConfig.from_telephone_output_device(
...
```

- ### Example 2: Using Azure in StreamingConversation locally
+ ### Example 2: Using Cartesia's streaming synthesizer

We support Cartesia's [low-latency streaming API](https://docs.cartesia.ai/api-reference/endpoints/stream-speech-websocket) over WebSockets. Use the `CartesiaSynthesizer` with the `CartesiaSynthesizerConfig` to enable this feature.

#### Telephony

```python
synthesizer_config=CartesiaSynthesizerConfig.from_telephone_output_device(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id=os.getenv("CARTESIA_VOICE_ID"),
)
```

In this example, the `CartesiaSynthesizerConfig.from_telephone_output_device()` method creates a configuration object for the Cartesia synthesizer with the sampling rate and audio encoding used in telephony. (The analogous `from_output_device()` method instead takes a `speaker_output` object and extracts the `sampling_rate` and `audio_encoding` from that output device.)

#### Controlling Speed & Emotions

You can set the `speed` and `emotion` parameters in the `CartesiaSynthesizerConfig` object to control the speed and emotions of the agent's voice! See [this page](https://docs.cartesia.ai/user-guides/voice-control) for more details.

```python
CartesiaSynthesizerConfig(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id=os.getenv("CARTESIA_VOICE_ID"),
experimental_voice_controls={
"speed": "slow",
"emotion": "positivity: high"
}
)
```
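
Since `experimental_voice_controls` is a plain dict, typos in its values are easy to make. A hypothetical client-side check is sketched below; the accepted vocabulary is defined by Cartesia's voice-control docs, and the value sets here are assumptions for illustration, not an authoritative list.

```python
# Assumed vocabularies, modeled on Cartesia's voice-control guide
ALLOWED_SPEEDS = {"slowest", "slow", "normal", "fast", "fastest"}
ALLOWED_EMOTIONS = {"anger", "positivity", "surprise", "sadness", "curiosity"}
ALLOWED_LEVELS = {"lowest", "low", "high", "highest"}

def validate_voice_controls(controls: dict) -> dict:
    """Reject obviously malformed 'speed' / 'emotion' values before
    they reach the synthesizer. Returns the controls unchanged."""
    speed = controls.get("speed")
    if speed is not None and speed not in ALLOWED_SPEEDS:
        raise ValueError(f"unknown speed: {speed!r}")
    emotion = controls.get("emotion")
    if emotion is not None:
        # Emotions are written as "name" or "name: level"
        name, _, level = emotion.partition(":")
        if name.strip() not in ALLOWED_EMOTIONS:
            raise ValueError(f"unknown emotion: {name.strip()!r}")
        if level and level.strip() not in ALLOWED_LEVELS:
            raise ValueError(f"unknown level: {level.strip()!r}")
    return controls
```

Validating eagerly turns a silent mispronunciation of the config into a loud error at startup.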

### Example 3: Using Azure in StreamingConversation locally

```python
from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
4 changes: 2 additions & 2 deletions docs/walkthrough_intro.mdx
@@ -6,8 +6,8 @@ description: "Setting up a simple receptionist agent"
Welcome to the Vocode API! We've got a lot of powerful features that we're going to illustrate
by setting up a receptionist agent that can take calls and book calendar appointments.

- We'll cover how to do it step-by-step entirely via API or you could also follow along usig our
- [Dashboard]("https://dashboard.vocode.dev).
+ We'll cover how to do it step-by-step entirely via API or you could also follow along using our
+ [Dashboard](https://dashboard.vocode.dev).

In particular, we'll go through the following steps:

4 changes: 2 additions & 2 deletions playground/streaming/agent/chat.py
@@ -197,7 +197,7 @@ async def sender():

await asyncio.gather(receiver(), sender())
if actions_worker is not None:
- actions_worker.terminate()
+ await actions_worker.terminate()


async def agent_main():
@@ -233,7 +233,7 @@ async def agent_main():
try:
await run_agent(agent, interruption_probability=0, backchannel_probability=0)
except KeyboardInterrupt:
- agent.terminate()
+ await agent.terminate()


if __name__ == "__main__":