Speech-to-Text Transcription Tool v2

Description

This repository provides a tool for converting audio files to text. It uses machine learning models such as OpenAI's Whisper to transcribe speech. Key features include:

- Audio Format Conversion: Convert audio files between formats (e.g., MP3 to WAV).
- Segment Extraction: Extract specific segments from audio files for targeted transcription.
- Speaker Identification: Identify and label different speakers in the audio.

Features

- Easy-to-use Interface: Simplified commands for audio processing and transcription.
- High Accuracy: Uses models such as OpenAI's Whisper for transcription.
- Customizable Parameters: Adjust settings such as language, model size, and number of speakers.
- Memory Efficient: Handles large audio files by processing them in segments.
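
As a rough illustration of processing a long recording in segments, the sketch below splits a total duration into fixed-length time windows. The name `chunk_windows` and the 600-second default are assumptions made for this example and are not part of this repository's code; the tool's actual chunking strategy may differ.

```python
def chunk_windows(total_seconds, chunk_seconds=600):
    """Yield (start, end) windows in seconds that cover the whole recording.

    Hypothetical helper for illustration only; not part of this repository.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    start = 0
    while start < total_seconds:
        # The last window is shortened so it never runs past the end.
        end = min(start + chunk_seconds, total_seconds)
        yield (start, end)
        start = end
```

Each window could then be extracted and transcribed on its own, so that only one piece of the recording needs to be held in memory at a time.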

Installation

Clone the repository:

```
git clone https://github.com/manuelarguelles/speechtotext_v2.git
cd speechtotext_v2
```

Install the necessary packages:
```
pip install -r requirements.txt
```

Usage

Converting Audio Format

```
from audio_converter import convert_to_wav

# Convert an MP3 file to WAV format
convert_to_wav('path/to/input.mp3', 'path/to/output.wav')
```
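
For readers curious about what a conversion step like this can look like, here is a minimal sketch that shells out to the `ffmpeg` command-line tool. It is not the implementation of `convert_to_wav`, which may use a different library. The helper names `build_ffmpeg_command` and `convert_with_ffmpeg` are hypothetical, and the 16 kHz mono defaults are an assumption (a common choice for speech models). `ffmpeg` must be installed separately.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(input_path, output_path, sample_rate=16000, channels=1):
    """Return the ffmpeg argument list for converting an audio file to WAV.

    Hypothetical helper for illustration only; not part of this repository.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(input_path),    # input file in any format ffmpeg can decode
        "-ac", str(channels),     # number of output audio channels
        "-ar", str(sample_rate),  # output sample rate in Hz
        str(output_path),
    ]


def convert_with_ffmpeg(input_path, output_path):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    if not Path(input_path).is_file():
        raise FileNotFoundError(input_path)
    subprocess.run(build_ffmpeg_command(input_path, output_path), check=True)
```

Building the argument list in a separate function keeps the command easy to inspect and test without running `ffmpeg`.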



Extracting Audio Segment

```
from audio_extractor import extract_audio_segment

# Extract a segment from an audio file
extract_audio_segment('path/to/input.wav', start_time=120, end_time=720, output_path='path/to/output_segment.wav')
```
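
The idea behind segment extraction can be sketched with Python's standard `wave` module. This is an illustrative stand-in, not the code behind `extract_audio_segment`: the function name `extract_wav_segment` is hypothetical, it only handles uncompressed WAV input, and it assumes that start and end times are given in seconds.

```python
import wave


def extract_wav_segment(input_path, start_time, end_time, output_path):
    """Copy the audio between start_time and end_time (seconds) to a new WAV file.

    Hypothetical helper for illustration only; not part of this repository.
    Returns the number of frames written.
    """
    if start_time < 0 or end_time <= start_time:
        raise ValueError("expected 0 <= start_time < end_time")
    with wave.open(input_path, "rb") as src:
        rate = src.getframerate()
        # Clamp both positions so a window past the end yields a shorter file.
        start_frame = min(int(start_time * rate), src.getnframes())
        end_frame = min(int(end_time * rate), src.getnframes())
        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)
        with wave.open(output_path, "wb") as dst:
            # Keep the source format so the segment plays back unchanged.
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(rate)
            dst.writeframes(frames)
    return end_frame - start_frame
```

Under the seconds assumption, a window from 120 to 720 covers ten minutes of audio.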

Transcribing Audio

```
from transcriber import AudioTranscriber

# Create an instance of the AudioTranscriber
transcriber = AudioTranscriber('path/to/audio.wav', model_size='tiny', num_speakers=1)

# Transcribe the audio file
segments = transcriber.transcribe()

# Save the transcription to a text file
transcriber.save_transcription(segments, 'path/to/transcription.txt')
```
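
The structure of the `segments` value returned by `transcribe()` is not documented here. Purely as an assumption, the sketch below treats each segment as a dictionary with `start`, `end`, and `text` keys (the shape Whisper uses for its own segments) plus an optional `speaker` label, and shows one way such data could be rendered as text. The names `format_timestamp` and `format_segments` are hypothetical and are not part of this repository.

```python
def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments):
    """Turn a list of segment dicts into one line of text per segment.

    Assumed segment shape: {"start": float, "end": float, "text": str,
    "speaker": str (optional)}. This is an illustration, not the repo's format.
    """
    lines = []
    for seg in segments:
        speaker = seg.get("speaker", "SPEAKER 1")
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {speaker}: {seg['text'].strip()}")
    return "\n".join(lines)
```

If the real segments use different keys, only the dictionary lookups in `format_segments` would need to change.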

