Training your own custom audio classification models

Jyotika Singh edited this page May 20, 2022 · 2 revisions

Training and Classifying Audio files

Audio classification models can be trained, tested, and used to classify audio files with pyAudioProcessing. Please see feature options and classifier model options for more information.

A sample dataset of spoken location names, containing spoken instances of "london" and "boston", can be found here.

Training

There are two ways to pass in the training data.

  1. Using locations of files in a dictionary format as the input file_names.

  2. Passing in a folder_path containing sub-folders and audio. Please refer to the section on Training and Testing Data structuring to use your own data instead.

from pyAudioProcessing.run_classification import train

# Training
train(
	file_names={
		"music": [<path to audio>, <path to audio>, ..],
		"speech": [<path to audio>, <path to audio>, ..]
	},
	feature_names=["gfcc", "spectral", "chroma"],
	classifier="svm",
	classifier_name="svm_test_clf"
)

Alternatively, to train on a directory of audio files organized according to the structuring guidelines, use the following:

train(
	folder_path="../data", # path to dir
	feature_names=["gfcc", "spectral", "chroma"],
	classifier="svm",
	classifier_name="svm_test_clf"
)

The calls above log the files analyzed and the hyperparameter tuning results for recall, precision, and F1 score, along with the final confusion matrix.
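The two input styles carry the same information: a folder of sub-folders can be expressed as a dictionary from class label to file paths. As a minimal sketch, assuming each sub-folder name is the class label and that the audio files end in .wav (both are assumptions, and the helper name build_file_names is illustrative, not part of pyAudioProcessing), the dictionary form could be built from a folder like this:

```python
from pathlib import Path


def build_file_names(folder_path, extension=".wav"):
    """Map each sub-folder name (assumed to be the class label) to its audio files.

    Hypothetical helper, not part of pyAudioProcessing.
    """
    file_names = {}
    for class_dir in sorted(Path(folder_path).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level
        paths = sorted(
            str(p) for p in class_dir.iterdir()
            if p.is_file() and p.suffix.lower() == extension
        )
        if paths:  # ignore sub-folders without matching audio files
            file_names[class_dir.name] = paths
    return file_names
```

The result could then be passed to train as file_names, for example train(file_names=build_file_names("../data"), ...) with the remaining arguments unchanged.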

Classification

To classify audio samples with the classifier you created above:

from pyAudioProcessing.run_classification import classify

# Classify a single file
results = classify(
	file="<path to audio>",
	feature_names=["gfcc", "spectral", "chroma"],
	classifier="svm",
	classifier_name="svm_test_clf"
)

# Classify multiple files with known labels and locations
results = classify(
	file_names={
		"music": [<path to audio>, <path to audio>, ..],
		"speech": [<path to audio>, <path to audio>, ..]
	},
	feature_names=["gfcc", "spectral", "chroma"],
	classifier="svm",
	classifier_name="svm_test_clf"
)

# or you can specify a folder path as described in the training section.

If you pass logfile=True into the function call, the calls above log the name of the file where the classification results are saved, along with details about the testing files and the classifier used.
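Because classification reuses the classifier saved during training, the feature_names, classifier, and classifier_name arguments presumably need to match between the two calls. One way to keep them in sync is to define them once. The sketch below does that; clf_settings is an illustrative name, and the train and classify calls are left as comments because they need real audio paths.

```python
# Settings shared by training and classification
# (values taken from the examples on this page).
clf_settings = {
    "feature_names": ["gfcc", "spectral", "chroma"],
    "classifier": "svm",
    "classifier_name": "svm_test_clf",
}

# With pyAudioProcessing installed, the same settings can be unpacked into both calls:
#   from pyAudioProcessing.run_classification import classify, train
#   train(folder_path="../data", **clf_settings)
#   results = classify(file="<path to audio>", **clf_settings)
```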

Command-line examples

If you cloned the project via git, you can also train and classify from the command line. The following examples use the gfcc, spectral, and chroma features with an svm classifier. Sample data can be found here. Please refer to the section on Training and Testing Data structuring to use your own data instead.

Training:

python pyAudioProcessing/run_classification.py -f "data_samples/training" -clf "svm" -clfname "svm_clf" -t "train" -feats "gfcc,spectral,chroma"

Classifying:

python pyAudioProcessing/run_classification.py -f "data_samples/testing" -clf "svm" -clfname "svm_clf" -t "classify" -feats "gfcc,spectral,chroma" -logfile "../classifier_results"

Classification results get saved in ../classifier_results_svm_clf.json.
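The same commands can also be assembled from Python, for example to script several runs. The sketch below uses only the flags shown in the commands above. The helper names build_command, results_path, and run are illustrative, and the results file pattern (<logfile>_<clfname>.json) is inferred from the example above rather than taken from the library's documentation.

```python
import subprocess
import sys


def build_command(folder, task, clf="svm", clfname="svm_clf",
                  feats=("gfcc", "spectral", "chroma"), logfile=None):
    """Assemble the run_classification.py command line as a list of arguments."""
    cmd = [
        sys.executable, "pyAudioProcessing/run_classification.py",
        "-f", folder,
        "-clf", clf,
        "-clfname", clfname,
        "-t", task,
        "-feats", ",".join(feats),
    ]
    if logfile is not None:
        cmd += ["-logfile", logfile]
    return cmd


def results_path(logfile, clfname):
    """Results file name, assuming the <logfile>_<clfname>.json pattern."""
    return f"{logfile}_{clfname}.json"


def run(cmd):
    """Run a command from the root of the cloned repository, raising on failure."""
    subprocess.run(cmd, check=True)


train_cmd = build_command("data_samples/training", "train")
classify_cmd = build_command("data_samples/testing", "classify",
                             logfile="../classifier_results")
```

Calling run(train_cmd) followed by run(classify_cmd) from the repository root would then mirror the two commands above.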