PyTorch Dataset for Speech and Music audio

AMAAI-Lab/AudioLoader

AudioLoader

AudioLoader is a collection of PyTorch datasets built on torchaudio. It covers datasets that are not yet available in torchaudio.

Currently supported datasets:

  1. Speech
    1. Multilingual LibriSpeech (MLS)
    2. TIMIT
    3. SpeechCommands v2 (12 classes)
  2. Automatic Music Transcription (AMT)
    1. MAPS
    2. MusicNet
    3. MAESTRO
  3. Music Source Separation (MSS)
    1. FastMUSDB
    2. MusdbHQ

Example code

A complete example is available in this repository. The following pseudocode shows the general idea of how to add AudioLoader to your existing code.

import pytorch_lightning as pl
from AudioLoader.speech import TIMIT
from torch.utils.data import DataLoader

# AudioLoader helps you set up supported datasets
dataset = TIMIT('./YourFolder',
                split='train',
                groups='all',
                download=True)
train_loader = DataLoader(dataset,
                          batch_size=4)

# Pass the data loader to your own model and trainer
# (MyModel is a placeholder for your LightningModule)
model = MyModel()
trainer = pl.Trainer()
trainer.fit(model, train_loader)
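
Audio clips usually differ in length, so a DataLoader with a batch size larger than 1 may need a custom collate function before samples can be stacked. The sketch below shows one common approach, zero-padding with `torch.nn.utils.rnn.pad_sequence`. It is not taken from AudioLoader: `ToyAudioDataset` and `pad_collate` are hypothetical names, and the `(waveform, label)` sample layout is an assumption, so check what the AudioLoader dataset you use actually returns and adapt the unpacking accordingly.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class ToyAudioDataset(Dataset):
    """Hypothetical stand-in that yields (waveform, label) pairs of varying length."""

    def __init__(self, lengths):
        self.lengths = lengths

    def __len__(self):
        return len(self.lengths)

    def __getitem__(self, idx):
        # 1-D mono waveform filled with ones, so zero padding is easy to spot
        return torch.ones(self.lengths[idx]), idx


def pad_collate(batch):
    """Zero-pad every waveform to the longest clip in the batch."""
    waveforms, labels = zip(*batch)
    lengths = torch.tensor([w.shape[0] for w in waveforms])
    padded = pad_sequence(list(waveforms), batch_first=True)  # (batch, max_len)
    return padded, lengths, torch.tensor(labels)


loader = DataLoader(ToyAudioDataset([5, 3, 8, 2]),
                    batch_size=4,
                    collate_fn=pad_collate)
padded, lengths, labels = next(iter(loader))
```

Returning the original lengths alongside the padded batch lets a model mask out the padded samples later.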

Installation

pip install git+https://github.com/KinWaiCheuk/AudioLoader.git

News & Changelog

version 0.0.3 (10 Sep 2021):

  1. Replace broken links with working links for MAPS and TIMIT
  2. Remove the silence indicators in the phonemic labels for TIMIT
