Paper | Benchmarks | Install | Usage
We introduce a method that improves the performance of automatic speech recognition (ASR) engines such as Whisper in practical settings. Unlike prior methods, which usually require both speech data and its transcription for decoding, our method uses only jargon as the decoding context. To do so, it first stores the jargon in a trie for efficient storage and traversal. It then steers Whisper's decoding toward the jargon by adjusting the probabilities of generated tokens with the trie. To further improve performance, the method also applies prompting, using the jargon as the prompt context; final tokens are generated by combining prompting and decoding. Experimental results on Japanese and English datasets show that the proposed method improves the performance of Whisper, especially on domain-specific data. The method is simple yet effective and can be deployed with any encoder-decoder ASR engine in practice.
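To illustrate the decoding side of the idea, here is a minimal sketch of trie-based logit biasing, assuming a Hugging Face Whisper tokenizer. The names used here (JargonTrie, bias_logits, the example jargon terms, and the bonus value) are illustrative assumptions, not the repository's actual API; see the recipes for the real implementation.

import torch
from transformers import WhisperTokenizer


class JargonTrie:
    """Stores tokenized jargon terms in a trie for prefix lookup."""

    def __init__(self):
        self.children = {}  # token id -> child trie node

    def insert(self, token_ids):
        node = self
        for tid in token_ids:
            node = node.children.setdefault(tid, JargonTrie())

    def next_tokens(self, prefix_ids):
        """Return token ids that extend `prefix_ids` toward some jargon term."""
        node = self
        for tid in prefix_ids:
            node = node.children.get(tid)
            if node is None:
                return set()
        return set(node.children)


def bias_logits(logits, trie, generated_ids, bonus=2.0, max_prefix=8):
    """Add `bonus` to the logits of tokens that continue a jargon match.

    Every suffix of the generated ids (up to `max_prefix` tokens) is tried
    as a candidate jargon prefix, so a match can start mid-utterance.
    """
    for start in range(max(0, len(generated_ids) - max_prefix), len(generated_ids) + 1):
        for tid in trie.next_tokens(generated_ids[start:]):
            logits[tid] += bonus
    return logits


tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-small")
trie = JargonTrie()
for term in ["myocarditis", "bradykinesia"]:  # hypothetical jargon list
    trie.insert(tokenizer.encode(term, add_special_tokens=False))

This sketch covers only the logit-adjustment half of the method; the full approach additionally passes the jargon as a decoding prompt and combines the two signals when generating the final tokens.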
pip install -r requirements.txt
Run inference.py in each recipe; each recipe corresponds to one benchmark dataset.
For example:
python recipes/jnas/inference.py
Fine-tune the Whisper ASR model, with and without Jargon Injection, by running train.py in each recipe.
For example:
python recipes/hgp-600/train.py