A Gradio-based application for end-to-end voice translation. It uses pyannote for speaker diarization, NVIDIA Canary-1b-v2 for translation, and Piper-TTS for voice synthesis.
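
The three stages named above can be pictured as a simple chain. The following is only a minimal sketch of that flow, not the repository's actual code: `Segment`, `run_pipeline`, and the three stage callables are hypothetical names, and the stub stages stand in for pyannote, Canary-1b-v2, and Piper-TTS so the example runs without any models installed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """One diarized stretch of speech (hypothetical structure)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


def run_pipeline(
    audio_path: str,
    diarize: Callable[[str], List[Segment]],
    translate: Callable[[str, Segment], str],
    synthesize: Callable[[str, str], bytes],
) -> List[bytes]:
    """Chain the three stages; each callable stands in for a real model."""
    clips = []
    for segment in diarize(audio_path):
        # Translate the speech found in this speaker segment.
        text = translate(audio_path, segment)
        # Synthesize the translated text, keeping track of the speaker.
        clips.append(synthesize(text, segment.speaker))
    return clips


if __name__ == "__main__":
    # Stub stages (assumptions) so the sketch runs on its own.
    fake_diarize = lambda path: [Segment("SPEAKER_00", 0.0, 1.5)]
    fake_translate = lambda path, seg: f"translated {seg.start}-{seg.end}"
    fake_synthesize = lambda text, speaker: f"{speaker}:{text}".encode()
    print(run_pipeline("input.wav", fake_diarize, fake_translate, fake_synthesize))
```

Passing the stages in as callables keeps the sketch independent of any particular model API; the real application may wire the stages together differently.
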
- NVIDIA GPU with CUDA Toolkit 12.8 installed.
- Python 3.10 (ensure it's added to your system's PATH).
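
Before installing, it can help to confirm these prerequisites from the interpreter you plan to use. The sketch below is an assumption-laden convenience, not part of the project: `check_prerequisites` is a hypothetical helper, and finding `nvidia-smi` or `nvcc` on the PATH only suggests that a driver and a CUDA Toolkit are present; it does not verify version 12.8.

```python
import shutil
import sys


def check_prerequisites(required=(3, 10)):
    """Return a dict of simple environment checks (hypothetical helper)."""
    return {
        # Compare only major.minor against the required Python version.
        "python_version_ok": sys.version_info[:2] == required,
        # nvidia-smi ships with the NVIDIA driver; it says nothing about
        # which CUDA Toolkit version is installed.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # nvcc ships with the CUDA Toolkit.
        "nvcc_found": shutil.which("nvcc") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'missing or mismatched'}")
```

If `nvcc_found` is reported as OK, running `nvcc --version` in a terminal shows the installed toolkit release.
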
 
- Clone the repository:

      git clone https://github.com/Juste-Leo2/VoiceToVoice-Translation.git
      cd VoiceToVoice-Translation

- Create and activate a virtual environment:

      python -m venv .venv
      # On Windows
      .venv\Scripts\activate
      # On Linux / macOS
      # source .venv/bin/activate

- Install dependencies using uv:

      python -m pip install --upgrade pip
      pip install uv
      uv pip install -r requirements.txt --no-deps --index-strategy unsafe-best-match
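
After the installation steps, a quick check can confirm that the virtual environment is active and that the main packages were installed into it. This is a minimal sketch: `in_virtualenv` and `installed_versions` are hypothetical helpers, and the distribution names listed are assumptions based on the tools named in the description; `requirements.txt` is the authoritative list.

```python
import sys
from importlib import metadata


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; check requirements.txt for the real list.
    assumed = ["gradio", "pyannote.audio", "nemo_toolkit", "piper-tts"]
    print("virtual environment active:", in_virtualenv())
    for name, version in installed_versions(assumed).items():
        print(f"{name}: {version or 'not installed'}")
```

Because the install command uses `--no-deps`, only the packages listed in `requirements.txt` are installed, so a package reported as missing here may simply be named differently in that file.
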
 
- Run the application:

      python app.py

- Open the local URL provided in the terminal (e.g., http://127.0.0.1:7860) in your browser.
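
Loading the models may take a while, so the URL might not respond immediately after `python app.py` starts. As a rough sketch, the hypothetical helper below polls the address until it answers; the default URL is the example address from above, and the actual one is whatever the terminal prints.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url="http://127.0.0.1:7860", timeout=30.0, interval=1.0):
    """Poll url until it answers or timeout seconds pass; return True on success."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Not reachable yet (e.g. connection refused); retry after a pause.
            time.sleep(interval)
    return False
```

Calling `wait_for_app()` from a second terminal returns `True` once the page is reachable, or `False` if nothing answers within the timeout.
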