# Voice Transcribe

Effortless speech-to-text transcription for the Ubuntu desktop.

Voice Transcribe is a simple, keyboard-driven tool for turning spoken words into text. Powered by OpenAI's Whisper model for accurate speech recognition, it listens for a global hotkey and automatically inserts the transcribed text into the active application.
## Features

- Global hotkey: trigger transcription with a customizable hotkey combination (Shift + Windows key by default).
- Accurate speech recognition: powered by the Whisper model for high-quality transcription.
- Automatic text insertion: transcribed text is automatically inserted into the active application at the current cursor position.
- Ubuntu desktop integration: works seamlessly with the Ubuntu desktop environment.
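The hotkey-driven flow above boils down to three steps: record audio, transcribe it, and type the result into the focused window. A minimal sketch of that pipeline, with hypothetical function names (the real `hotkey_listener.py` may structure this differently):

```python
# Illustrative record -> transcribe -> insert pipeline. The three steps are
# injected as callables so the flow itself stays testable without audio or X.
def run_pipeline(record, transcribe, insert):
    """Run one hotkey-triggered transcription cycle."""
    audio = record()          # e.g. capture microphone audio
    text = transcribe(audio)  # e.g. a Whisper model call
    insert(text)              # e.g. typing via xdotool into the focused window
    return text
```

In the real application, `record` would capture microphone audio, `transcribe` would call the Whisper model, and `insert` would shell out to `xdotool`.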
## Requirements

- Ubuntu Linux (tested on Ubuntu Desktop)
- Python 3.9+
- Conda (for virtual environment management)
- Docker (optional, for containerized deployment)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/your-username/voice-transcribe.git
   cd voice-transcribe
   ```
2. Set up the Conda environment:

   Install dependencies using the provided Conda environment:

   ```bash
   conda env create -f environment.yml
   conda activate voice_transcribe
   ```
3. Install system dependencies:

   Ensure `xdotool` is installed to allow text insertion:

   ```bash
   sudo apt-get install xdotool
   ```
4. Run the application:

   ```bash
   python hotkey_listener.py
   ```

   This starts the application, which listens for the hotkey combination and performs transcription.
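The text-insertion step relies on `xdotool` typing into whichever window has focus. A minimal sketch of how that call can be made from Python; `build_type_command` and `insert_text` are hypothetical helpers, not the project's actual code:

```python
import subprocess

def build_type_command(text: str) -> list[str]:
    """xdotool invocation that types `text` at the current cursor position."""
    # --clearmodifiers releases held modifier keys (e.g. the hotkey itself)
    # before typing; `--` ends option parsing so text beginning with '-' is
    # typed literally instead of being read as a flag.
    return ["xdotool", "type", "--clearmodifiers", "--", text]

def insert_text(text: str) -> None:
    """Type `text` into the focused window (requires xdotool on PATH)."""
    subprocess.run(build_type_command(text), check=True)
```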
## Docker deployment

You can also run the application in a Docker container:
1. Build the Docker image:

   ```bash
   docker build -t voice_transcribe:latest .
   ```
2. Run the Docker container:

   ```bash
   docker run -d \
     --name voice_transcribe \
     --device /dev/snd \
     -v /tmp/.X11-unix:/tmp/.X11-unix \
     -e DISPLAY=$DISPLAY \
     -v $XAUTHORITY:/root/.Xauthority \
     --network host \
     --privileged \
     voice_transcribe:latest
   ```
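The container needs access to the host's sound device (`/dev/snd`) and X display for recording and text insertion to work. A quick way to confirm those flags took effect is a small preflight check run inside the container; this is a hypothetical helper, not part of the project:

```python
import os

def missing_requirements(env=os.environ, snd_path="/dev/snd"):
    """Return a list of container requirements that are not satisfied."""
    missing = []
    if not os.path.exists(snd_path):
        # Provided by the `--device /dev/snd` flag
        missing.append("ALSA device /dev/snd (pass --device /dev/snd)")
    if not env.get("DISPLAY"):
        # Provided by the `-e DISPLAY=$DISPLAY` flag
        missing.append("DISPLAY variable (pass -e DISPLAY=$DISPLAY)")
    return missing
```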
## Development

- Conda virtual environment: dependencies are managed with Conda for consistent development environments (see `environment.yml`).
- Docker containerization: the application can be containerized and deployed using Docker (see `Dockerfile`).
## Contributing

Contributions are welcome! Please see the `CONTRIBUTING.md` file for guidelines on how to contribute to this project.
## License

[Insert license information, e.g., MIT License]
## Acknowledgments

- Whisper model: this project uses the Whisper model for speech recognition, developed by OpenAI.
Note: replace `your-username` with your GitHub username, and update the script name or license information accordingly.