Spoken Language Classification

This repository contains models pretrained on the VoxLingua107 dataset for spoken (audio-based) language classification. The dataset (and therefore the models) covers 107 different languages. Four models are provided (see below).

Usage

git clone https://github.com/RicherMans/SpokenLanguageClassifiers
cd SpokenLanguageClassifiers
pip install -r requirements.txt
python3 predict.py AUDIOFILE

The models (see below) can also be modified. Currently four models have been pretrained, all of which are selected with the --model MODELNAME parameter.

By default the script prints the top N results (N=5; it can be changed with --N NUMBER).
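
For example, assuming the argument order below (check the script's help output for the exact interface), selecting the CNN10 back-end and printing the top three languages would look like:

python3 predict.py --model CNN10 --N 3 AUDIOFILE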

Models

Four models were pretrained and can be chosen as the back-end:

  1. CNN6 (default): A six-layer CNN that uses attention for temporal aggregation.
  2. CNN10: A ten-layer CNN that uses mean and max pooling for temporal aggregation (a short sketch of both aggregation styles follows this list).
  3. MobileNetV2: A MobileNet implementation for audio classification.
  4. CNNVAD: A model that performs voice activity detection (VAD) and language classification simultaneously. The VAD model is taken from GPV and Data-driven GPVAD, and training fine-tunes both the VAD and language-classification models. The back-end here is the default CNN6.
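
The practical difference between the CNN6 and CNN10 aggregation styles is how a sequence of per-frame embeddings is collapsed into a single clip-level vector. The snippet below is not taken from this repository; it is a minimal PyTorch sketch of the two strategies, with all names and shapes chosen purely for illustration.

import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    # Attention-style temporal aggregation (conceptually what CNN6 uses).
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one scalar score per time step

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim) per-frame embeddings
        weights = torch.softmax(self.score(x), dim=1)  # (batch, time, 1)
        return (weights * x).sum(dim=1)                # (batch, dim)

def mean_max_pool(x: torch.Tensor) -> torch.Tensor:
    # Mean + max temporal aggregation (conceptually what CNN10 uses).
    return x.mean(dim=1) + x.max(dim=1).values         # (batch, dim)

# Dummy per-frame features: 2 clips, 100 frames, 512-dim embeddings.
frames = torch.randn(2, 100, 512)
print(AttentionPool(512)(frames).shape)  # torch.Size([2, 512])
print(mean_max_pool(frames).shape)       # torch.Size([2, 512])

Whether the real CNN10 sums or concatenates the mean- and max-pooled vectors is an assumption here; the point is only the contrast between fixed pooling and learned attention weights.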

Since I don't have access to other datasets for cross-dataset evaluation, I provide the current performance on my held-out cross-validation dataset:

Model        Precision (%)  Recall (%)  Accuracy (%)
CNN6         81.7           84.4        83.6
CNN10        89.9           90.9        90.8
MobileNetV2  80.0           80.1        79.3
CNNVAD       81.0           82.4        82.9
