Syri - AI Voice Assistant

An open-source AI voice assistant that uses:

- OpenAI Whisper for speech-to-text
- A web browser-based agent for AI response generation (using Claude 3.7 Sonnet)
- OpenAI TTS for text-to-speech

This project enables a fully conversational AI experience similar to Siri, but using powerful AI models, a web browser agent, and high-quality audio APIs.

Setup Instructions

Step 1: Prerequisites

API keys:
- Sign up for OpenAI to get an API key (used for both STT and TTS)
- Sign up for Portkey to get API keys for Claude 3.7 Sonnet
Install PortAudio (required for audio recording):
- Debian/Ubuntu: apt install portaudio19-dev
- macOS: brew install portaudio sox
macOS users only: install MPV for audio streaming:
- brew install mpv
Chrome browser:
- Google Chrome must be installed, because the web agent launches and controls Chrome
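
The command-line prerequisites above can be sanity-checked with a PATH lookup. The following is a minimal sketch, not part of the project: the helper name `missing_tools` and the binary names passed to it are illustrative assumptions. Chrome is left out on purpose, since it is often installed as an app bundle that does not appear on PATH.

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # Hypothetical binary names; adjust them for your platform.
    missing = missing_tools(["mpv", "sox"])
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All checked tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it is `shutil.which` from the standard library.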

Step 2: Install Python Dependencies

uv sync

Step 3: Configure Environment Variables

Copy the template file to create your own environment file:
```
cp .envtemplate .env
```

Edit the .env file and replace the placeholder values with your actual API keys:

OPENAI_API_KEY=your_actual_openai_key_here
PORTKEY_API_KEY=your_actual_portkey_key_here
PORTKEY_VIRTUAL_KEY_ANTHROPIC=your_actual_portkey_virtual_key_here
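
Before starting the assistant, it can help to confirm that all three keys are actually set. The sketch below assumes the variables are already present in the process environment; how the project itself loads the .env file is not shown here, and the helper name `missing_keys` is hypothetical.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "PORTKEY_API_KEY",
    "PORTKEY_VIRTUAL_KEY_ANTHROPIC",
)


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```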

Optional TTS configuration:

SYRI_TTS_VOICE=coral     # Options include: alloy, coral, echo, fable, onyx, nova, shimmer
SYRI_TTS_SPEED=1.2       # Speech speed multiplier
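
One way these optional settings might be read is sketched below. The function name `read_tts_settings` and the fallback defaults (`coral`, `1.0`) are assumptions for illustration, not the project's actual behavior. The clamp reflects the 0.25 to 4.0 speed range accepted by the OpenAI speech endpoint.

```python
import os


def read_tts_settings(env=None):
    """Return (voice, speed) from the environment, with fallback defaults."""
    env = os.environ if env is None else env
    # Assumed default voice; an empty value also falls back to it.
    voice = env.get("SYRI_TTS_VOICE", "coral").strip() or "coral"
    try:
        speed = float(env.get("SYRI_TTS_SPEED", "1.0"))
    except ValueError:
        speed = 1.0  # fall back when the value is not a number
    # Clamp to the 0.25-4.0 range accepted by the OpenAI speech endpoint.
    speed = min(max(speed, 0.25), 4.0)
    return voice, speed
```

Clamping rather than raising keeps a mistyped speed from stopping the assistant at startup; a stricter setup could reject out-of-range values instead.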

Usage

You can run the assistant using either of these methods:

Method 1: Using the runner script (recommended)

uv run run.py

This script performs pre-checks and starts the assistant in an inactive listening state.

- To start listening, press Enter or run ./scripts/start_listening.sh (useful for automation).
- Speak your request.
- Press Enter again or run ./scripts/stop_listening.sh when done.
- The AI transcribes your speech, processes it through the web agent, and responds both in text (console) and through speech.
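
The start/stop flow above amounts to a two-state toggle. The sketch below models it under that assumption; the class name `ListeningToggle` is hypothetical, and how run.py and the helper scripts actually signal the assistant is not described in this README.

```python
class ListeningToggle:
    """Tracks whether the assistant is currently recording."""

    def __init__(self):
        self.listening = False  # the assistant starts in an inactive state

    def toggle(self):
        """Flip the state and report which action was taken."""
        self.listening = not self.listening
        return "start" if self.listening else "stop"


if __name__ == "__main__":
    toggle = ListeningToggle()
    print(toggle.toggle())  # first Enter press: begin recording
    print(toggle.toggle())  # second Enter press: stop and process
```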

Context

This is a proof-of-concept sketch. A newer version uses OpenAI voice agents and has moved to a different repository.

How It Works

When you speak to Syri:

- Your voice is recorded using PyAudio
- The recording is transcribed to text using OpenAI Whisper
- The transcribed text is sent to a web agent that runs Chrome browser automation
- The web agent uses Claude 3.7 Sonnet through Portkey to generate responses
- The response is converted to speech using OpenAI TTS
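
The stages above can be sketched as a small pipeline in which each stage is an injected callable. This is an illustrative outline only: `VoicePipeline` and its field names are hypothetical, and the stand-in lambdas take the place of the real PyAudio, Whisper, web agent, and TTS calls, which are not shown in this README.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains the stages; each stage is an injected callable."""

    record: Callable[[], bytes]         # microphone capture -> raw audio
    transcribe: Callable[[bytes], str]  # speech-to-text
    ask_agent: Callable[[str], str]     # web agent -> reply text
    speak: Callable[[str], None]        # text-to-speech playback

    def run_once(self) -> str:
        audio = self.record()
        text = self.transcribe(audio)
        reply = self.ask_agent(text)
        print(reply)       # text response on the console
        self.speak(reply)  # spoken response
        return reply


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without a microphone or API keys.
    spoken = []
    pipeline = VoicePipeline(
        record=lambda: b"fake-audio",
        transcribe=lambda audio: "what time is it",
        ask_agent=lambda text: f"You asked: {text}",
        speak=spoken.append,
    )
    pipeline.run_once()
```

Passing the stages in as callables keeps the control flow testable without audio hardware or network access.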

License

MIT
