cltl-asr

Speech-to-text service for Leolani. This repository is a component of the Leolani framework. For usage of the component within the framework, see the instructions there.

Description

This package contains multiple implementations for converting spoken language into written text.

Getting started

Prerequisites

This repository uses Python >= 3.9

Be sure to run in a virtual Python environment (e.g. conda, venv, or mkvirtualenv).

Installation

In the root directory of this repo, run:
```
pip install -e .
```

Implementations

There are various implementations included in cltl.asr. Depending on which implementation is used, different dependencies are required. To include them during installation, specify the required extras with pip, for example:
```
pip install "cltl.asr[whisper]"
```

For the available options refer to setup.py.
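
Since each implementation relies on its own optional dependencies, it can help to check which of them are importable before choosing an implementation. The following is a minimal sketch, not part of cltl.asr: the mapping from extra name to module name is an assumption made for illustration (only the `whisper` extra is mentioned above), so adjust it to what setup.py actually declares.

```python
from importlib.util import find_spec

# Hypothetical mapping from pip extra to the top-level module it is assumed
# to provide; check setup.py for the real extras and their dependencies.
ASSUMED_EXTRAS = {
    "whisper": "whisper",
    "speechbrain": "speechbrain",
}


def installed_extras(extras=ASSUMED_EXTRAS):
    """Return {extra_name: bool}, True when the mapped module can be found."""
    status = {}
    for extra, module in extras.items():
        try:
            status[extra] = find_spec(module) is not None
        except (ImportError, ValueError):
            # Raised for dotted names whose parent package is missing,
            # or for already-imported modules without a usable spec.
            status[extra] = False
    return status


if __name__ == "__main__":
    for extra, present in installed_extras().items():
        print(f"{extra}: {'installed' if present else 'missing'}")
```

A missing entry only means the module was not found in the current environment; reinstalling with the matching extra should resolve it.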

Whisper ASR

Whisper C++ ASR

This implementation uses whisper.cpp, a C++ implementation of Whisper that runs natively and provides improved performance over the Python implementation of Whisper.

To use this implementation, a whisper.cpp server needs to be set up. To set up the server, follow the instructions in the whisper.cpp README and the server-specific README.

For the server setup on macOS on an Apple Silicon device, refer to this setup script.
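
Once a server is running, it can be useful to test it independently of cltl.asr. The sketch below sends a WAV file to the server using only the standard library. It rests on assumptions that are not stated in this README: that the server listens on `127.0.0.1:8080`, exposes an `/inference` endpoint accepting a multipart `file` field and a `response_format` field, and answers with a JSON object containing a `text` key, as described for the whisper.cpp example server. The helper names are hypothetical, and how cltl.asr itself talks to the server may differ.

```python
import io
import json
import urllib.request
import uuid
import wave

# Assumed address and endpoint of the whisper.cpp example server.
SERVER_URL = "http://127.0.0.1:8080/inference"


def pcm16_to_wav(pcm: bytes, sampling_rate: int, channels: int = 1) -> bytes:
    """Wrap raw 16-bit little-endian PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(sampling_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


def build_multipart(wav: bytes, fields: dict):
    """Encode the WAV data and extra form fields as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    parts.append(
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="file"; filename="audio.wav"\r\n'
         "Content-Type: audio/wav\r\n\r\n").encode() + wav + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(wav: bytes, url: str = SERVER_URL) -> str:
    """POST the WAV data to the (assumed) endpoint and return the transcript."""
    body, content_type = build_multipart(wav, {"response_format": "json"})
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["text"]
```

With a server running, `transcribe(open("test.wav", "rb").read())` should return the recognized text. whisper.cpp generally expects 16 kHz, 16-bit audio, so other formats may need converting first.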

Google ASR

Speechbrain ASR

Usage

To use this repository as a package in a different project and a different virtual environment, you can install a published version from PyPI:
```
pip install cltl.asr
```

or, for the latest snapshot, run:
```
pip install git+https://github.com/leolani/cltl-asr.git@main
```

Then you can import it in a Python script as:
```
import numpy as np
import soundfile as sf
from importlib_resources import path
from cltl.asr.speechbrain_asr import SpeechbrainASR

# Load the pretrained Speechbrain model
asr = SpeechbrainASR("speechbrain/asr-transformer-transformerlm-librispeech")

# Read the audio as 16-bit integer samples together with its sampling rate
with path("resources", "test.wav") as wav:
    speech_array, sampling_rate = sf.read(wav, dtype=np.int16)
# Convert the audio samples to text
transcript = asr.speech_to_text(speech_array, sampling_rate)
```
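
Code that consumes a transcript only needs an object with the `speech_to_text(audio, sampling_rate)` call used above, so it can be exercised without downloading a model. The following is a minimal sketch under that assumption. `SpeechToText`, `FixedTranscriptASR` and `transcribe_and_normalize` are hypothetical names introduced for illustration and are not part of cltl.asr, which may define its own base class.

```python
from typing import Any, Protocol


class SpeechToText(Protocol):
    """Hypothetical type for anything offering the call used above."""

    def speech_to_text(self, audio: Any, sampling_rate: int) -> str: ...


class FixedTranscriptASR:
    """Stand-in that ignores the audio and returns a preset transcript."""

    def __init__(self, transcript: str):
        self._transcript = transcript

    def speech_to_text(self, audio: Any, sampling_rate: int) -> str:
        return self._transcript


def transcribe_and_normalize(asr: SpeechToText, audio: Any, sampling_rate: int) -> str:
    """Example consumer: transcribe, then collapse whitespace and lowercase."""
    return " ".join(asr.speech_to_text(audio, sampling_rate).split()).lower()
```

In a test, `transcribe_and_normalize(FixedTranscriptASR("Hello  World"), [], 16000)` runs instantly, and a real implementation such as `SpeechbrainASR` can be passed in its place.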

Examples

Take a look at the provided example scripts to get an idea of how to run and use this package. Each example has a comment at the top of the script describing its behaviour.

To run the example scripts:

Change your current directory to ./examples/
Run an example (e.g. python test_speechbrain_asr.py)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

To DO

Fix logging
Fix config of language, save/play audio file, audio directory
Check if we can switch voices via different APIs
Check implementation middle layer

License

Distributed under the MIT License. See LICENSE for more information.
