MiniASR

A mini, simple, and fast end-to-end automatic speech recognition toolkit.



Intro

Why Mini?

  • Minimal Training
    Self-supervised pre-trained models + minimal fine-tuning.
  • Simple and Flexible ⚙️
    Easy to understand and customize.
  • Colab Compatible 🧪
    Train your model directly on Google Colab.

ASR Pipeline

  • Preprocessing (run_preprocess.py)
    • Find all audio files and transcriptions.
    • Generate vocabularies (character/word/subword/code-switched).
  • Training (run_asr.py)
    • Dataset (miniasr/data/dataset.py)
      • Tokenizer for text data (miniasr/data/text.py)
    • DataLoader (miniasr/data/dataloader.py)
    • Model (miniasr/model/base_asr.py)
      • Feature extractor
      • Data augmentation
      • End-to-end CTC ASR
  • Testing (run_asr.py)
    • CTC greedy/beam decoding
    • Performance measures: error rates, RTF, latency
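
The sketch below illustrates what CTC greedy decoding does: take the most probable token at every frame, merge consecutive repeats, and drop blanks. It is a self-contained illustration rather than MiniASR's actual decoder; the blank index and the (time, vocab) log-probability shape are assumptions.

import torch

def ctc_greedy_decode(log_probs: torch.Tensor, blank: int = 0) -> list:
    """Greedy CTC decoding of a (time, vocab) log-probability matrix."""
    best = log_probs.argmax(dim=-1).tolist()  # most likely token per frame
    tokens, prev = [], None
    for t in best:
        # Collapse consecutive repeats, then remove blank symbols.
        if t != prev and t != blank:
            tokens.append(t)
        prev = t
    return tokens

# Example: 6 frames over a 5-symbol vocabulary (index 0 assumed to be the blank).
hyp = ctc_greedy_decode(torch.randn(6, 5).log_softmax(dim=-1))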

Instructions

Requirements

  • Python 3.6+
  • Install sox on your OS
  • Install the latest s3prl (v0.4 or later):
git clone https://github.com/s3prl/s3prl.git
cd s3prl
pip install -e ./
cd ..
  • Install MiniASR via pip (from the root of this repository):
pip install -e ./
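
After installation, a quick sanity check is that both packages import cleanly:

# Quick sanity check: both editable installs should import without errors.
import s3prl
import miniasr
print("s3prl and miniasr imported successfully")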


Pre-trained ASR

You can directly use pre-trained ASR models in your own applications. (under construction 🚧)

import torch

from miniasr.utils import load_from_checkpoint
from miniasr.data.audio import load_waveform

# Option 1: Loading from a checkpoint
model, args, tokenizer = load_from_checkpoint('path/to/ckpt', 'cuda')
# Option 2: Loading from torch.hub (TODO)
model = torch.hub.load('vectominist/MiniASR', 'ctc_eng').to('cuda')

# Load waveforms and recognize!
waves = [load_waveform('path/to/waveform').to('cuda')]
hyps = model.recognize(waves)

Preprocessing

  • For already implemented corpora, please see egs/.
  • To prepare your own custom dataset, please see miniasr/preprocess.
miniasr-preprocess

Options:

  --corpus Corpus name.
  --path Path to dataset.
  --set Which subsets to process.
  --out Output directory.
  --gen-vocab Specify whether to generate vocabulary files.
  --char-vocab-size Character vocabulary size.
  --word-vocab-size Word vocabulary size.
  --subword-vocab-size Subword vocabulary size.
  --gen-subword Specify whether to generate subword vocabulary.
  --subword-mode {unigram,bpe} Subword training mode.
  --char-coverage Character coverage.
  --seed Set random seed.
  --njobs Number of workers.
  --log-file Logging file.
  --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL} Logging level.
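
As a rough, conceptual illustration of the vocabulary-generation step (not the toolkit's actual implementation), a character vocabulary can be built by counting characters in the transcriptions and keeping the most frequent ones; the special symbols and their order below are assumptions.

from collections import Counter

def build_char_vocab(transcriptions, vocab_size=100):
    """Count characters over all transcriptions and keep the most frequent."""
    counter = Counter()
    for text in transcriptions:
        counter.update(text)
    vocab = ["<pad>", "<unk>"]  # illustrative special symbols
    vocab += [ch for ch, _ in counter.most_common(vocab_size - len(vocab))]
    return {ch: idx for idx, ch in enumerate(vocab)}

# Toy example with two transcriptions.
print(build_char_vocab(["hello world", "speech recognition"], vocab_size=20))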

Training & Testing

See examples in egs/.

miniasr-asr

Options:

  --config Training configuration file (.yaml).
  --test Specify testing mode.
  --ckpt Checkpoint for testing.
  --test-name Name for the testing results.
  --cpu Use CPU only.
  --seed Set random seed.
  --njobs Number of workers.
  --log-file Logging file.
  --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL} Logging level.
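
Testing reports error rates among its performance measures; the snippet below is the standard word error rate computation via edit distance, shown as a generic reference rather than MiniASR's own evaluation code.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance between the two word sequences.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return dist[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("the cat sat", "the cat sat down"))  # one insertion over three words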

TODO List

  • torch.hub support
  • Releasing pre-trained ASR models

Citation

@misc{chang2021miniasr,
  title={{MiniASR}},
  author={Chang, Heng-Jui},
  year={2021},
  url={https://github.com/vectominist/MiniASR}
}