LiveKit-powered chat agent for the DianaChat platform.
DianaChat Agent is a Python-based AI agent that enables real-time voice and text interactions through the DianaChat platform. It transcribes speech with Deepgram, generates responses with OpenAI GPT-4, and speaks them back using OpenAI TTS.
- Python: 3.11+
- LiveKit: Real-time communication
- OpenAI GPT-4: Language model
- Deepgram: Speech-to-text
- OpenAI TTS: Text-to-speech
- Redis: Response caching
- pytest: Testing framework
- DianaChat Frontend (separate repository)
  - Web and mobile clients
  - LiveKit integration
  - Real-time communication
- DianaChat Agent (this repository)
  - LiveKit Multimodal Agent
  - Async I/O operations
  - Error handling and recovery
  - Response caching
- Speech-to-Text (Deepgram)
  - Nova-2 model
  - Real-time transcription
  - Language detection
  - Error recovery
- Language Model (OpenAI)
  - GPT-4 Turbo
  - Context management
  - Token optimization
  - Safety filters
- Text-to-Speech (OpenAI)
  - Shimmer voice
  - Stream processing
  - Error handling
dianachat-agent/
├── src/dianachat_agent/
│ ├── agents/ # Agent implementations
│ ├── api/ # API endpoints
│ ├── config/ # Configuration
│ ├── models/ # Data models
│ ├── services/ # Core services
│ └── utils/ # Utilities
├── tests/ # Test suites
├── alembic/ # Database migrations
├── .env.example # Environment template
├── pyproject.toml # Project metadata
└── README.md # Documentation
- Python Environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
- Environment Setup
cp .env.example .env
# Edit .env with your credentials
# LiveKit
LIVEKIT_URL=wss://your-livekit-server
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
# OpenAI
OPENAI_API_KEY=your_openai_key
OPENAI_MODEL=gpt-4-turbo-preview
OPENAI_VOICE=shimmer
OPENAI_TEMPERATURE=0.7
# Deepgram
DEEPGRAM_API_KEY=your_deepgram_key
DEEPGRAM_MODEL=nova-2
DEEPGRAM_LANGUAGE=en-US
DEEPGRAM_TIER=enhanced
# Agent Settings
ENABLE_VIDEO=false
MAX_RESPONSE_TOKENS=400
ENABLE_RESPONSE_CACHING=true
CACHE_TTL_SECONDS=3600
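As a rough illustration of how the agent settings above might be read at startup, here is a minimal sketch. The `AgentSettings` class, the `load_settings` function, and the fallback values (copied from the example `.env` above) are assumptions for illustration, not the repository's actual configuration code.

```python
import os
from dataclasses import dataclass


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings such as 'true', '1', 'yes', or 'on'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class AgentSettings:
    """Hypothetical container for the agent settings shown in the .env example."""

    enable_video: bool
    max_response_tokens: int
    enable_response_caching: bool
    cache_ttl_seconds: int
    openai_temperature: float


def load_settings(env=None) -> AgentSettings:
    """Read settings from a mapping; defaults to the process environment."""
    env = os.environ if env is None else env
    # Fallback values mirror the example .env and are an assumption.
    return AgentSettings(
        enable_video=_as_bool(env.get("ENABLE_VIDEO", "false")),
        max_response_tokens=int(env.get("MAX_RESPONSE_TOKENS", "400")),
        enable_response_caching=_as_bool(env.get("ENABLE_RESPONSE_CACHING", "true")),
        cache_ttl_seconds=int(env.get("CACHE_TTL_SECONDS", "3600")),
        openai_temperature=float(env.get("OPENAI_TEMPERATURE", "0.7")),
    )
```

Parsing every value once, at startup, means a malformed entry such as `MAX_RESPONSE_TOKENS=abc` fails immediately with a `ValueError` rather than in the middle of a conversation.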
# Run all tests
pytest
# Run specific test file
pytest tests/test_agent.py
# Run with coverage
pytest --cov=dianachat_agent
The agent implements multiple layers of error handling:
- Service Errors
  - STT transcription failures
  - LLM timeouts or errors
  - TTS synthesis issues
  - Network connectivity problems
- Recovery Strategies
  - Automatic retries for transient failures
  - Graceful degradation (text-only mode)
  - User-friendly error messages
  - Detailed error logging
- Monitoring
  - Error rate tracking
  - Response latency monitoring
  - Token usage metrics
  - Cost optimization alerts
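The retry and graceful-degradation strategies listed above could be combined roughly as follows. This is a sketch under stated assumptions: `TransientError`, `with_retries`, and the `fallback` parameter are hypothetical names, and the agent's real error types and retry policy may differ.

```python
import asyncio
import logging
import random

logger = logging.getLogger("dianachat_agent.retry")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


async def with_retries(operation, *, attempts=3, base_delay=0.5, fallback=None):
    """Await operation() up to `attempts` times, then use fallback() if one is given."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except TransientError as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                break
            # Exponential backoff with a little jitter to avoid synchronized retries.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay / 10))
    if fallback is not None:
        # Degrade gracefully, for example by answering in text-only mode.
        return fallback()
    raise TransientError(f"operation failed after {attempts} attempts")
```

A TTS call, for instance, could be wrapped with a `fallback` that returns the plain text response, so the user still gets an answer when speech synthesis stays unavailable.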
- Fork the repository
- Create a feature branch
git checkout -b feat/your-feature
- Make changes following our standards:
- Type annotations
- PEP 8 style
- Comprehensive tests
- Clear documentation
- Submit a pull request
The agent supports multiple deployment strategies:
- Production
  - Main branch
  - Zero-downtime updates
  - Automatic rollbacks
  - Full monitoring
- Staging
  - Staging branch
  - Integration testing
  - Performance testing
  - Cost monitoring
DianaChat Agent uses retrieval-augmented generation (RAG) to enhance responses with relevant context from your knowledge base.
The RAG system consists of three main components:
- Document Processing
  - Splits documents into paragraphs
  - Generates embeddings using OpenAI's text-embedding-3-small model
  - Stores vectors in an Annoy index for efficient similarity search
  - Saves paragraph text in a pickle file for retrieval
- RAG Service
  - Manages vector database operations
  - Performs real-time embedding generation
  - Handles similarity search and context retrieval
  - Integrates with both voice and text pipelines
- Agent Integration
  - Enriches user messages with relevant context
  - Uses before_llm_cb hook for voice pipeline
  - Directly enriches messages in text processing
  - Gracefully handles RAG failures
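The paragraph-splitting and pickle-storage steps of document processing might look roughly like this. It is a minimal sketch: the function names are hypothetical, splitting on blank lines is an assumed definition of "paragraph", and keying paragraphs by position is an assumption about how text is matched back to vector ids.

```python
import pickle
import re
from pathlib import Path


def split_paragraphs(text: str) -> list[str]:
    """Split text on blank lines and drop empty chunks."""
    chunks = re.split(r"\n\s*\n", text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def save_paragraphs(paragraphs: list[str], path: Path) -> None:
    """Store paragraphs keyed by position, so the key can double as a vector id."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as handle:
        pickle.dump(dict(enumerate(paragraphs)), handle)


def load_paragraphs(path: Path) -> dict[int, str]:
    """Load the id-to-paragraph mapping written by save_paragraphs."""
    with path.open("rb") as handle:
        return pickle.load(handle)
```

Because unpickling can execute arbitrary code, a pickle file like this should only be loaded when it was produced by your own build step.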
- Environment Variables
# OpenAI API (required for embeddings)
OPENAI_API_KEY=your_api_key
# RAG Configuration (optional, shown with defaults)
RAG_MODEL=text-embedding-3-small
RAG_EMBEDDINGS_DIMENSION=1536
RAG_INDEX_PATH=src/dianachat_agent/rag/data/vdb_data
RAG_DATA_PATH=src/dianachat_agent/rag/data/paragraphs.pkl
- Document Preparation
# Create docs directory if it doesn't exist
mkdir -p docs
# Add your documents to docs/
# Supported formats: .txt, .md, .rst, .py
- Build Vector Database
# From project root
python -m dianachat_agent.rag.create_vector
- Examine Embeddings
# Interactive mode
python -m dianachat_agent.rag.examine_embeddings
# List all embeddings
python -m dianachat_agent.rag.examine_embeddings --list
# Direct query
python -m dianachat_agent.rag.examine_embeddings --query "your query"
- RAG Pipeline
The RAG system automatically:
- Processes each user message
- Finds relevant context from your documents
- Enriches the prompt with this context
- Works for both voice and text interactions
- Vector Storage
- Uses Annoy (Approximate Nearest Neighbors) for efficient similarity search
- Angular distance metric for comparing embeddings
- In-memory index for fast retrieval
- Persistent storage for both vectors and text
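To make the angular metric concrete, the sketch below computes it by hand and runs an exact, brute-force nearest-neighbor search. It is a pure-Python stand-in for illustration only: Annoy itself performs approximate search over a forest of trees, and the `nearest` helper here is hypothetical. Annoy's angular distance is the Euclidean distance between the normalized vectors, which equals sqrt(2 * (1 - cos(u, v))).

```python
import math


def angular_distance(u, v):
    """Return sqrt(2 * (1 - cos(u, v))), the distance between the normalized vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("angular distance is undefined for zero vectors")
    # Clamp to [-1, 1] to absorb floating-point rounding before the square root.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.sqrt(2.0 * (1.0 - cosine))


def nearest(query, vectors, k=3):
    """Return the ids of the k vectors closest to query (exact search, not Annoy's)."""
    ranked = sorted(vectors, key=lambda item_id: angular_distance(query, vectors[item_id]))
    return ranked[:k]
```

The metric ignores vector length, so two embeddings that point in the same direction are at distance 0 regardless of magnitude, and orthogonal embeddings are at distance sqrt(2).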
- Context Enrichment
# Example enriched message format
Here is some relevant context:
[Retrieved relevant text from your documents]
User message: [Original user message]
Please use this context to inform your response.
- Error Handling
- Graceful degradation if RAG service fails
- Fallback to raw message if no relevant context
- Automatic recovery on service initialization
- Detailed logging for troubleshooting
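The enriched message format and the fallback behavior described above can be sketched as follows. The function names and the `retrieve` callable are hypothetical; the template text follows the example format shown earlier, and the real agent may assemble the prompt differently.

```python
import logging

logger = logging.getLogger("dianachat_agent.rag")


def enrich_message(user_message: str, context_chunks: list[str]) -> str:
    """Wrap the user message with retrieved context, or return it unchanged if none."""
    chunks = [chunk.strip() for chunk in context_chunks if chunk and chunk.strip()]
    if not chunks:
        # Fallback to the raw message when no relevant context was found.
        return user_message
    context = "\n\n".join(chunks)
    return (
        "Here is some relevant context:\n"
        f"{context}\n"
        f"User message: {user_message}\n"
        "Please use this context to inform your response."
    )


def safe_enrich(user_message: str, retrieve) -> str:
    """Enrich via retrieve(user_message); on any RAG failure, log and use the raw message."""
    try:
        return enrich_message(user_message, retrieve(user_message))
    except Exception:
        logger.exception("RAG enrichment failed; continuing without context")
        return user_message
```

Catching the failure at this boundary keeps a broken index or an embedding API outage from interrupting the conversation; the agent simply answers without the extra context.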
- Updating Knowledge Base
- Add new documents to docs/
- Run create_vector.py to rebuild the index
- No need to restart the agent
- Monitoring
- Check logs for RAG-related messages
- Monitor embedding API usage
- Review context relevance using examine_embeddings
- Adjust similarity thresholds if needed
- Memory Usage
- Annoy index stays in memory
- Paragraph text stored on disk
- Lazy loading of components
- API Costs
- One embedding per user message
- Uses efficient embedding model
- Caches embeddings where possible
- Response Time
- Fast vector similarity search (on the order of milliseconds)
- Async embedding generation
- Minimal impact on response time
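One way to cache embeddings so that repeated messages do not trigger a second API call is sketched below. `EmbeddingCache` and the injected `embed` callable are hypothetical; the in-memory dictionary with least-recently-used eviction is an assumption, and the project may instead rely on Redis or another store.

```python
import hashlib


class EmbeddingCache:
    """Small in-memory cache keyed by a hash of the stripped message text."""

    def __init__(self, embed, max_entries=1024):
        self._embed = embed  # hypothetical callable: str -> list[float]
        self._max_entries = max_entries
        self._entries: dict[str, list[float]] = {}
        self.misses = 0

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()

    def get(self, text: str) -> list[float]:
        key = self._key(text)
        if key in self._entries:
            # Re-insert the hit so it counts as the most recently used entry.
            self._entries[key] = self._entries.pop(key)
            return self._entries[key]
        self.misses += 1
        vector = self._embed(text)
        if len(self._entries) >= self._max_entries:
            # Dicts preserve insertion order, so the first key is the least recently used.
            self._entries.pop(next(iter(self._entries)))
        self._entries[key] = vector
        return vector
```

Tracking `misses` gives a simple signal for the embedding API usage mentioned under monitoring: each miss corresponds to one embedding request.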
- API keys in environment variables
- Input sanitization
- Rate limiting
- Access control
- Audit logging
- Regular security updates
[Insert License Information]