Speech to Text is a simple web application that allows users to transcribe audio files into text using the Facebook Wav2Vec2 model. This project is built using Flask and leverages the Hugging Face Transformers library.
- Transcribe audio files in formats like .wav, .mp3, and .flac.
- Display the transcribed text on the web interface.
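The format check and the transcription step could look roughly like the sketch below. It is not the project's actual code: `allowed_file`, `load_asr`, and `transcribe_file` are hypothetical names, and the `facebook/wav2vec2-base-960h` checkpoint is an assumption, since this README does not say which Wav2Vec2 checkpoint the app loads. Decoding a file path through the Transformers pipeline also assumes ffmpeg is available on the system.

```python
from pathlib import Path

# Formats listed in this README.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac"}


def allowed_file(filename: str) -> bool:
    """Return True if the filename ends in a supported audio extension."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def load_asr(model_name: str = "facebook/wav2vec2-base-960h"):
    """Build a speech-recognition pipeline (the checkpoint name is an assumption)."""
    # Imported lazily so the helpers in this sketch work without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_name)


def transcribe_file(path: str, asr) -> str:
    """Transcribe one audio file with a pipeline-like callable.

    `asr` is any callable that takes a file path and returns {"text": ...},
    which is the shape the Transformers ASR pipeline returns.
    """
    if not allowed_file(path):
        raise ValueError(f"Unsupported audio format: {path}")
    return asr(path)["text"]
```

With the dependencies installed, `transcribe_file("sample.wav", load_asr())` would return the transcript as a string; `sample.wav` is a placeholder file name.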
Make sure you have Python and pip installed on your system.
- Clone the repository:
  git clone https://github.com/yourusername/speech-to-text.git
  cd speech-to-text
- Create a virtual environment (optional but recommended):
  python -m venv venv
- Activate the virtual environment:
  - On Windows:
    venv\Scripts\activate
  - On macOS/Linux:
    source venv/bin/activate
- Install the required packages:
  pip install -r requirements.txt
- Run the Flask application:
  python app.py
- Open your web browser and navigate to http://127.0.0.1:5000/.
- Upload an audio file and click the "Transcribe" button to see the transcribed text.
- Choose an audio file (supported formats: .wav, .mp3, .flac).
- Click the "Transcribe" button to get the transcribed text.
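On the server side, the upload-and-transcribe flow described above might be handled along these lines. This is a minimal sketch under stated assumptions, not the contents of `app.py`: the `/transcribe` route, the `file` form field, and the names `handle_upload` and `create_app` are all hypothetical, and the sketch returns JSON for brevity where the real app renders the text on the page.

```python
import os
import tempfile

# Formats listed in this README.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac"}


def handle_upload(filename, data, transcribe):
    """Validate an uploaded file, transcribe it, and return (payload, status)."""
    ext = os.path.splitext(filename or "")[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return {"error": "Supported formats: .wav, .mp3, .flac"}, 400
    # Write the bytes to a temporary file so the model can read it from disk.
    with tempfile.NamedTemporaryFile(suffix=ext, delete=False) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        return {"text": transcribe(path)}, 200
    finally:
        # Always clean up the temporary file, even if transcription fails.
        os.remove(path)


def create_app(transcribe):
    """Build a Flask app with a hypothetical POST /transcribe route."""
    # Imported lazily so handle_upload can be used without Flask installed.
    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/transcribe", methods=["POST"])
    def transcribe_route():
        upload = request.files.get("file")  # "file" field name is an assumption
        if upload is None:
            return {"error": "No file uploaded"}, 400
        return handle_upload(upload.filename, upload.read(), transcribe)

    return app
```

Keeping the validation and transcription in `handle_upload`, separate from the Flask view, lets that logic be exercised with a stand-in `transcribe` callable without loading the model or starting the server.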