yonas-g/Afaan-Oromo-Speech-to-Text-Dataset

Afaan Oromo Speech to Text Dataset

This repository contains preprocessed audio MFCC features and transcripts. The dataset is split into train, dev, and test sets.

Dataset split sizes (number of clips)

Train 979
Dev 122
Test 123
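
The counts above work out to roughly an 80/10/10 split. A minimal sketch that checks the arithmetic, using only the numbers listed in this README (the variable names are illustrative):

```python
# Clip counts per split, copied from the list above.
split_sizes = {"train": 979, "dev": 122, "test": 123}

total = sum(split_sizes.values())

# Print each split with its share of the whole dataset.
for name, count in split_sizes.items():
    print(f"{name:<5} {count:>5}  {count / total:6.1%}")
print(f"total {total:>5}")
```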

Folder structure

\train:
    \mfcc:
    \transcript:
\dev:
    \mfcc:
    \transcript:
\test:
    \mfcc:
    \transcript:
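
The README does not state the file formats or naming scheme inside `mfcc` and `transcript`. As a rough sketch, assuming each MFCC file and its transcript share the same base name (an assumption, not something confirmed here), the pairs for one split could be collected like this. `pair_by_stem` is a hypothetical helper; it only matches paths and does not load any file:

```python
from pathlib import Path


def pair_by_stem(root, split):
    """Return sorted (mfcc_path, transcript_path) pairs for one split.

    Assumption: matching files share a base name (stem), e.g.
    <split>/mfcc/clip1.<ext> and <split>/transcript/clip1.<ext>.
    Files without a counterpart are skipped.
    """
    split_dir = Path(root) / split
    mfcc = {p.stem: p for p in (split_dir / "mfcc").iterdir() if p.is_file()}
    text = {p.stem: p for p in (split_dir / "transcript").iterdir() if p.is_file()}

    # Keep only the stems present in both folders.
    common = sorted(mfcc.keys() & text.keys())
    return [(mfcc[stem], text[stem]) for stem in common]
```

Calling `pair_by_stem(".", "train")` from the repository root would then return the paired paths for the training set, provided the naming assumption holds. How each file is read depends on its actual format.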

Overall dataset statistics

Total clips: 1,224
Total words: 17,559
Total characters: 116,439
Total duration: 03:11:13
Min clip length: 1 sec
Max clip length: 59 sec
Unique words: 5,040
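
A few per-clip averages follow from these totals. The sketch below derives them, assuming the total duration is written as HH:MM:SS (the README does not say so explicitly). `parse_hms` is a hypothetical helper introduced for illustration:

```python
def parse_hms(text):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Totals copied from the statistics above.
total_clips = 1224
total_words = 17559
total_seconds = parse_hms("03:11:13")  # assumed to be HH:MM:SS

mean_clip_seconds = total_seconds / total_clips
mean_words_per_clip = total_words / total_clips
words_per_minute = total_words / (total_seconds / 60)

print(f"mean clip length:    {mean_clip_seconds:.2f} s")
print(f"mean words per clip: {mean_words_per_clip:.2f}")
print(f"speaking rate:       {words_per_minute:.1f} words/min")
```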

Dataset Source: https://data.mendeley.com/datasets/hnvkvj589y/1

Girma, Birhanu Shimelis; Senbatu, Dereje Hinsermu (2022), “Afaan Oromoo Text-to-Speech Dataset”, Mendeley Data, V1, doi: 10.17632/hnvkvj589y.1
