New READMEs #29

Open · wants to merge 5 commits into main
131 changes: 131 additions & 0 deletions basics/uninterruptable/README.md
@@ -0,0 +1,131 @@
# Uninterruptable Agent

A voice assistant that demonstrates non-interruptible speech using LiveKit's voice agents, useful for delivering information that must be heard in full.

## Overview

**Uninterruptable Agent** - A voice-enabled assistant configured to complete its responses without being interrupted by user speech, demonstrating the `allow_interruptions=False` configuration option.

## Features

- **Simple Configuration**: Single parameter controls interruption behavior
- **Voice-Enabled**: Built using LiveKit's voice capabilities with support for:
- Speech-to-Text (STT) using Deepgram
- Large Language Model (LLM) using OpenAI GPT-4o
- Text-to-Speech (TTS) using OpenAI
- Voice Activity Detection (VAD) disabled during agent speech

## How It Works

1. User connects to the LiveKit room
2. Agent automatically starts speaking a long test message
3. User attempts to interrupt by speaking
4. Agent continues speaking without stopping
5. Only after the agent finishes can the user's input be processed
6. Subsequent responses are also uninterruptible

## Prerequisites

- Python 3.10+
- `livekit-agents`>=1.0
- LiveKit account and credentials
- API keys for:
- OpenAI (for LLM and TTS capabilities)
- Deepgram (for speech-to-text)

## Installation

1. Clone the repository

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Create a `.env` file in the parent directory with your API credentials:
```
LIVEKIT_URL=your_livekit_url
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
OPENAI_API_KEY=your_openai_key
DEEPGRAM_API_KEY=your_deepgram_key
```

## Running the Agent

```bash
python uninterruptable.py dev
```

The agent will immediately start speaking a long message. Try interrupting to observe the non-interruptible behavior.

## Architecture Details

### Key Configuration

The critical setting that makes this agent uninterruptible:

```python
Agent(
instructions="...",
stt=deepgram.STT(),
llm=openai.LLM(model="gpt-4o"),
tts=openai.TTS(),
allow_interruptions=False # This prevents interruptions
)
```

### Behavior Comparison

| Setting | User Speaks While Agent Talks | Result |
|---------|------------------------------|---------|
| `allow_interruptions=True` (default) | Agent stops mid-sentence | User input processed immediately |
| `allow_interruptions=False` | Agent continues speaking | User input queued until agent finishes |
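The queueing behavior in the second row of the table can be sketched without LiveKit at all. The following is a minimal, hypothetical simulation (all names invented here, not part of the library): user input that arrives while the agent is "speaking" is buffered and only processed once the speech coroutine completes.

```python
import asyncio

async def demo() -> list[str]:
    events: list[str] = []
    pending: asyncio.Queue[str] = asyncio.Queue()

    async def agent_speaks():
        # With allow_interruptions=False, speech runs to completion.
        events.append("agent: start")
        await asyncio.sleep(0.05)
        events.append("agent: done")

    async def user_speaks():
        await asyncio.sleep(0.01)  # user interjects mid-speech
        await pending.put("Stop!")

    await asyncio.gather(agent_speaks(), user_speaks())
    # Only after the agent finishes is the queued input processed.
    while not pending.empty():
        events.append(f"processed: {pending.get_nowait()}")
    return events

print(asyncio.run(demo()))  # → ['agent: start', 'agent: done', 'processed: Stop!']
```

The interjection lands mid-sleep, but nothing consumes the queue until the speech task has returned, mirroring the "queued until agent finishes" row above.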

### Testing Approach

The agent automatically generates a long response on entry to facilitate testing:
```python
self.session.generate_reply(user_input="Say something somewhat long and boring so I can test if you're interruptable.")
```

## Use Cases

### When to Use Uninterruptible Agents

1. **Legal Disclaimers**: Must be read in full without interruption
2. **Emergency Instructions**: Critical safety information
3. **Tutorial Steps**: Sequential instructions that shouldn't be skipped
4. **Terms and Conditions**: Required complete playback


## Implementation Patterns

### Selective Non-Interruption

```python
# Make only critical messages uninterruptible by passing
# allow_interruptions per utterance rather than toggling agent state
async def say_critical(self, message: str):
    await self.session.say(message, allow_interruptions=False)
```

## Important Considerations

- **User Experience**: Non-interruptible agents can be frustrating if overused
- **Message Length**: Keep uninterruptible segments reasonably short
- **Clear Indication**: Consider informing users when interruption is disabled
- **Fallback Options**: Provide alternative ways to skip or pause if needed

## Example Interaction

```
Agent: [Starts long message] "I'm going to tell you a very long and detailed story about..."
User: "Stop!" [Agent continues]
Agent: "...and that's why the chicken crossed the road. The moral of the story is..."
User: "Hey, wait!" [Agent still continues]
Agent: "...patience is a virtue." [Finally finishes]
User: "Finally! Can you hear me now?"
Agent: "Yes, I can hear you now. How can I help?"
```
File renamed without changes.
89 changes: 89 additions & 0 deletions pipeline-stt/keyword-detection/README.md
@@ -0,0 +1,89 @@
## Overview

**Keyword Detection Agent** - A voice-enabled agent that monitors user speech for predefined keywords and logs when they are detected.

## Features

- **Real-time Keyword Detection**: Monitors speech for specific keywords as users talk
- **Custom STT Pipeline**: Intercepts the speech-to-text pipeline to detect keywords
- **Logging System**: Logs detected keywords with proper formatting
- **Voice-Enabled**: Built using voice capabilities with support for:
- Speech-to-Text (STT) using Deepgram
- Large Language Model (LLM) using OpenAI
- Text-to-Speech (TTS) using OpenAI
- Voice Activity Detection (VAD) using Silero

## How It Works

1. User connects to the LiveKit room
2. Agent greets the user and starts a conversation
3. As the user speaks, the custom STT pipeline monitors for keywords
4. When keywords like "Shane", "hello", "thanks", or "bye" are detected, they are logged
5. The agent continues normal conversation while monitoring in the background
6. All speech continues to be processed by the LLM for responses

## Prerequisites

- Python 3.10+
- `livekit-agents`>=1.0
- LiveKit account and credentials
- API keys for:
- OpenAI (for LLM and TTS capabilities)
- Deepgram (for speech-to-text)

## Installation

1. Clone the repository

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Create a `.env` file in the parent directory with your API credentials:
```
LIVEKIT_URL=your_livekit_url
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
OPENAI_API_KEY=your_openai_key
DEEPGRAM_API_KEY=your_deepgram_key
```

## Running the Agent

```bash
python keyword_detection.py console
```

The agent will start a conversation and monitor for keywords in the background. Try using words like "hello", "thanks", or "bye" in your speech and watch them appear in the logs.

## Architecture Details

### Main Classes

- **KeywordDetectionAgent**: Custom agent class that extends the base Agent with keyword detection
- **stt_node**: Overridden method that intercepts the STT pipeline to monitor for keywords

### Keyword Detection Pipeline

The agent overrides the `stt_node` method to create a custom processing pipeline:
1. Receives the parent STT stream
2. Monitors final transcripts for keywords
3. Logs detected keywords
4. Passes all events through unchanged for normal processing
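The matching in steps 2–3 reduces to a case-insensitive scan of each final transcript. A standalone sketch of that logic follows; the function name is hypothetical, and whole-word matching via `\b` is an assumption (the actual agent may use plain substring checks):

```python
import re

KEYWORDS = ["Shane", "hello", "thanks", "bye"]

def detect_keywords(transcript: str, keywords: list[str] = KEYWORDS) -> list[str]:
    """Return every keyword that appears as a whole word, case-insensitively."""
    found = []
    for kw in keywords:
        if re.search(rf"\b{re.escape(kw)}\b", transcript, re.IGNORECASE):
            found.append(kw)
    return found

print(detect_keywords("Hello Shane, thanks for calling"))  # → ['Shane', 'hello', 'thanks']
```

Because the STT node only inspects transcripts and passes events through unchanged, this check adds no latency to the response pipeline.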

### Current Keywords

The agent monitors for these keywords (case-insensitive):
- "Shane"
- "hello"
- "thanks"
- "bye"

### Logging Output

When keywords are detected, you'll see log messages like:
```
INFO:keyword-detection:Keyword detected: 'hello'
INFO:keyword-detection:Keyword detected: 'thanks'
```
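The `LEVEL:name:message` shape above matches Python's default `logging` format. A minimal reproduction that needs no running agent, routing the same logger name to an in-memory stream:

```python
import io
import logging

# Route the logger to an in-memory stream using the default-style format.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("keyword-detection")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep output out of the root handler

logger.info("Keyword detected: 'hello'")
print(stream.getvalue())  # INFO:keyword-detection:Keyword detected: 'hello'
```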
```diff
@@ -9,14 +9,14 @@

 load_dotenv(dotenv_path=Path(__file__).parent.parent / '.env')

-logger = logging.getLogger("listen-and-respond")
+logger = logging.getLogger("keyword-detection")
 logger.setLevel(logging.INFO)

-class SimpleAgent(Agent):
+class KeywordDetectionAgent(Agent):
     def __init__(self) -> None:
         super().__init__(
             instructions="""
-            You are a helpful agent.
+            You are a helpful agent that detects keywords in user speech.
             """,
             stt=deepgram.STT(),
             llm=openai.LLM(),
@@ -28,7 +28,7 @@ async def on_enter(self):
         self.session.generate_reply()

     async def stt_node(self, text: AsyncIterable[str], model_settings: Optional[dict] = None) -> Optional[AsyncIterable[rtc.AudioFrame]]:
-        keywords = ["Shane", "hello", "thanks"]
+        keywords = ["Shane", "hello", "thanks", "bye"]
         parent_stream = super().stt_node(text, model_settings)

         if parent_stream is None:
@@ -53,7 +53,7 @@ async def entrypoint(ctx: JobContext):
     session = AgentSession()

     await session.start(
-        agent=SimpleAgent(),
+        agent=KeywordDetectionAgent(),
         room=ctx.room
     )
```
85 changes: 85 additions & 0 deletions pipeline-stt/transcriber/README.md
@@ -0,0 +1,85 @@
# Transcriber Agent

A speech-to-text logging agent that transcribes user speech and saves it to a file using LiveKit's voice agents.

## Overview

**Transcriber Agent** - A voice-enabled agent that listens to user speech, transcribes it using Deepgram STT, and logs all transcriptions with timestamps to a local file.

## Features

- **Real-time Transcription**: Converts speech to text as users speak
- **Persistent Logging**: Saves all transcriptions to `user_speech_log.txt` with timestamps
- **Voice-Enabled**: Built using LiveKit's voice capabilities with support for:
- Speech-to-Text (STT) using Deepgram
- Minimal agent configuration without LLM or TTS
- **Event-Based Processing**: Uses the `user_input_transcribed` event for efficient transcript handling
- **Automatic Timestamping**: Each transcription entry includes date and time

## How It Works

1. User connects to the LiveKit room
2. Agent starts listening for speech input
3. Deepgram STT processes the audio stream in real-time
4. When a final transcript is ready, it triggers the `user_input_transcribed` event
5. The transcript is appended to `user_speech_log.txt` with a timestamp
6. The process continues for all subsequent speech

## Prerequisites

- Python 3.10+
- `livekit-agents`>=1.0
- LiveKit account and credentials
- API keys for:
- Deepgram (for speech-to-text)

## Installation

1. Clone the repository

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Create a `.env` file in the parent directory with your API credentials:
```
LIVEKIT_URL=your_livekit_url
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
DEEPGRAM_API_KEY=your_deepgram_key
```

## Running the Agent

```bash
python transcriber.py console
```

The agent will start listening for speech and logging transcriptions to `user_speech_log.txt` in the current directory.

## Architecture Details

### Main Components

- **AgentSession**: Manages the agent lifecycle and event handling
- **user_input_transcribed Event**: Fired when Deepgram completes a transcription
- **Transcript Object**: Contains the transcript text and finality status

### Log File Format

Transcriptions are saved in the following format:
```
[2024-01-15 14:30:45] Hello, this is my first transcription
[2024-01-15 14:30:52] Testing the speech to text functionality
```

### Minimal Agent Configuration

This agent uses a minimal configuration without LLM or TTS:
```python
Agent(
instructions="You are a helpful assistant that transcribes user speech to text.",
stt=deepgram.STT()
)
```
File renamed without changes.