attention

Accent recognition, for great justice.

Scripts

build_config.py

  1. Parses a directory containing .mov and .wav files.
  2. Builds a config file of the form: {language, count}.
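The two steps above could be sketched roughly as follows. This is not the actual build_config.py: it assumes files are named `<language><number>.<ext>` (for example `english12.mov`) and that the config holds one `language count` pair per line. Both conventions, and the function names `count_languages` and `write_config`, are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: files are named "<language><number>", e.g. "english12.mov".
NAME_PATTERN = re.compile(r"^([a-z]+)\d+$")
AUDIO_SUFFIXES = {".mov", ".wav"}


def count_languages(directory):
    """Count .mov/.wav files per language prefix found in directory."""
    counts = Counter()
    for path in Path(directory).iterdir():
        if path.suffix.lower() not in AUDIO_SUFFIXES:
            continue  # skip anything that is not a recording
        match = NAME_PATTERN.match(path.stem.lower())
        if match:
            counts[match.group(1)] += 1
    return counts


def write_config(counts, config_path):
    """Write one "language count" line per language, sorted by name."""
    lines = [f"{lang} {n}" for lang, n in sorted(counts.items())]
    Path(config_path).write_text("\n".join(lines) + "\n")
```

Sorting the languages keeps the generated file stable between runs, which makes it easier to diff.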

get_dataset.py

  1. Parses a config file generated by build_config.py.
  2. Downloads each recording (via FTP) and converts it to .wav (via ffmpeg).
  3. Uses multiprocessing to parallelize the work.
  4. Puts everything (.wav) into a single directory (/data).
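A minimal sketch of this download-and-convert pipeline, under stated assumptions: the config is taken to hold `language count` lines, remote files are assumed to be named `<language><n>.mov`, and the server URL is a placeholder. The helper names (`build_jobs`, `fetch_and_convert`, `download_all`) are hypothetical and not taken from get_dataset.py.

```python
import subprocess
import tempfile
import urllib.request
from multiprocessing import Pool
from pathlib import Path


def ffmpeg_command(src, dst):
    """Build the ffmpeg call; the output format follows dst's extension."""
    return ["ffmpeg", "-y", "-loglevel", "error", "-i", str(src), str(dst)]


def build_jobs(config_text, base_url, out_dir):
    """Expand "language count" lines into (url, wav_path) pairs."""
    jobs = []
    for line in config_text.splitlines():
        if not line.strip():
            continue
        lang, count = line.split()
        for i in range(1, int(count) + 1):
            name = f"{lang}{i}"  # assumed remote naming scheme
            jobs.append((f"{base_url}/{name}.mov", Path(out_dir) / f"{name}.wav"))
    return jobs


def fetch_and_convert(job):
    """Download one recording to a temp dir, then convert it to .wav."""
    url, wav_path = job
    with tempfile.TemporaryDirectory() as tmp:
        raw_path = Path(tmp) / url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, str(raw_path))  # urllib accepts ftp:// URLs
        subprocess.run(ffmpeg_command(raw_path, wav_path), check=True)
    return wav_path


def download_all(jobs, workers=4):
    """Run fetch_and_convert over all jobs in a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(fetch_and_convert, jobs)
```

To use it, read the config text, call `build_jobs` with the real server URL and output directory, and pass the result to `download_all`. On platforms that spawn worker processes, that call should sit under an `if __name__ == "__main__":` guard.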

feature_extraction.py

  1. Parses the files in /data (from get_dataset.py).
  2. Extracts features (MFCC, et al.) from each file.
  3. Writes them as serialized numpy arrays to /processed.
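The read, extract, and serialize loop might look like the sketch below. To stay dependency-light it computes a plain log-magnitude spectrogram with numpy as a stand-in for MFCCs, so it is not the feature set the script actually uses. It also assumes 16-bit PCM input, and the frame length and hop size are arbitrary choices.

```python
import wave
from pathlib import Path

import numpy as np


def read_wav_mono(path):
    """Read a 16-bit PCM .wav file and return (samples, sample_rate)."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float64)
    if channels > 1:
        # Average the channels down to mono.
        samples = samples.reshape(-1, channels).mean(axis=1)
    return samples, rate


def log_spectrogram(samples, frame_len=400, hop=160):
    """Return a (n_frames, frame_len // 2 + 1) log-magnitude spectrogram."""
    if len(samples) < frame_len:
        samples = np.pad(samples, (0, frame_len - len(samples)))
    n_frames = 1 + (len(samples) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([
        samples[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-10)


def extract_all(data_dir, out_dir):
    """Write one .npy feature file to out_dir per .wav file in data_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for wav_path in sorted(Path(data_dir).glob("*.wav")):
        samples, _rate = read_wav_mono(wav_path)
        out_path = out_dir / (wav_path.stem + ".npy")
        np.save(out_path, log_spectrogram(samples))
        written.append(out_path)
    return written
```

Each saved array can be read back with `np.load`. Swapping in real MFCCs would only change `log_spectrogram`; the directory walk and the `.npy` output stay the same.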

Notebooks

Audiolab.ipynb

Prototyping environment for spectrograms and signal vectors.

Config Files

dataset.conf

Per-language count of source files (complete).

Data Files

  1. /data (.wav encoded audio)
  2. speech_archive_meta.tsv: complementary dataset with additional information about the speakers in each recording.

Todo:

  1. Extract features, store in database (sqlite).
  2. Parse speech_archive_meta.tsv, put into database.
  3. Do ML, hope for the best.
  4. Get different features, return to step 1.
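For step 2, one possible approach is to load the TSV into sqlite with only the standard library. The sketch below assumes the file has a header row and stores every column as text. The table name `speakers` and the function names are hypothetical, and nothing here reflects the real column layout of speech_archive_meta.tsv.

```python
import csv
import sqlite3


def quote_ident(name):
    """Quote a column or table name for use in a SQL statement."""
    return '"' + name.replace('"', '""') + '"'


def load_tsv(conn, tsv_path, table="speakers"):
    """Create `table` from the TSV header and insert every row as text."""
    with open(tsv_path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh, delimiter="\t")
        header = next(reader)
        columns = ", ".join(f"{quote_ident(name)} TEXT" for name in header)
        conn.execute(f"CREATE TABLE IF NOT EXISTS {quote_ident(table)} ({columns})")
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(
            f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})",
            # Skip malformed rows whose width differs from the header.
            (row for row in reader if len(row) == len(header)),
        )
    conn.commit()
```

Deriving the columns from the header avoids hard-coding a schema before the file has been inspected. Typed columns can be introduced later, once the fields that matter are known.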