GitHub - techmo-pl/vamp-wavelet-fft: Feature extraction based on wavelet and FFT

Techmo Sp. z o.o. module for audio feature extraction

How to use

⚠️ Prefix the command with the ! character if you install the module from a Jupyter notebook

pip install techmo-wavelet 

# import the feature extraction functions
from techmo.feature_extraction import calculate_wavelet_fft, calculate_fft_wavelet

# install numpy first if it is not already installed in your environment
import numpy as np 

# the signal must be a 1-D array read from a WAV file, e.g. by using SoundFile. Here we generate a random signal instead
signal = np.random.uniform(-1.0, 1.0, 16000)

# Here's an example of how to use `calculate_wavelet_fft` function
features = calculate_wavelet_fft(signal)

# Here's an example of how to use `calculate_fft_wavelet` function
features = calculate_fft_wavelet(signal)
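
The snippet above feeds the functions a random signal. For a real recording, the following is a minimal sketch of loading a WAV file into a 1-D array. It uses Python's standard-library wave module rather than SoundFile, the helper name load_wav_mono is hypothetical, and the 16-bit PCM input, the channel averaging and the scaling to [-1.0, 1.0) are assumptions modelled on the random-signal example, not documented requirements of the module.

```python
import wave

import numpy as np


def load_wav_mono(path):
    """Read a 16-bit PCM WAV file and return (signal, sample_rate).

    Hypothetical helper, not part of techmo-wavelet.
    The signal is a 1-D float64 array scaled to [-1.0, 1.0).
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # Interpret the bytes as little-endian int16 and scale to floats.
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float64) / 32768.0

    # Assumption: multi-channel audio is averaged down to one channel.
    if n_channels > 1:
        samples = samples.reshape(-1, n_channels).mean(axis=1)
    return samples, sample_rate
```

The returned array could then be passed as the signal argument in place of the random signal.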

The code implements 2 functions to extract features:

The calculate_wavelet_fft function implements an algorithm consisting of the following stages:

If the number of samples N is greater than or equal to 4800, the signal is divided into int(N/2400) segments of int(N/int(N/2400)) samples each. Each segment yields 60 features, so the output holds 60*int(N/2400) values in total,
Segments are processed by the Hann window,
Segments are normalized separately,
Each segment is processed by the Wavelet Transform (WT),
Each WT subband is subjected to the Fast Fourier Transform (FFT),
The FFT spectra are the inputs to the triangular filters,
The logarithms of filter outputs are computed to obtain the feature sub-vectors of length 60 for each segment.
The sub-vectors are stacked into the final feature matrix, a numpy ndarray of shape (int(N/2400), 60).
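
The segment arithmetic in the first stage can be checked with a short sketch. The helper name segment_layout is hypothetical and is not part of the module. The README does not say what happens for signals shorter than the threshold, so treating them as a single segment is an assumption.

```python
def segment_layout(n_samples, base=2400, n_features=60):
    """Return (n_segments, segment_length, feature_matrix_shape).

    Hypothetical helper that mirrors the arithmetic described above;
    base=2400 corresponds to calculate_wavelet_fft.
    """
    if n_samples >= 2 * base:
        n_segments = int(n_samples / base)
    else:
        # Assumption: short signals are handled as one segment.
        n_segments = 1
    segment_length = int(n_samples / n_segments)
    return n_segments, segment_length, (n_segments, n_features)


# The 16000-sample signal from the usage example:
print(segment_layout(16000))  # (6, 2666, (6, 60))
```

Under these assumptions, the 16000-sample example signal would give 6 segments of 2666 samples and a feature matrix of shape (6, 60).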

The calculate_fft_wavelet function implements an algorithm consisting of the following stages:

If the number of samples N is greater than or equal to 9600, the signal is divided into int(N/4800) segments of int(N/int(N/4800)) samples each. Each segment yields 60 features, so the output holds 60*int(N/4800) values in total,
Segments are processed by the Hann window,
Segments are normalized separately,
Speech segments are processed by the Fast Fourier Transform (FFT),
The complex spectra are subjected to Wavelet Transform (WT),
Absolute values of WT are calculated,
The computed magnitudes are the inputs to the triangular filters,
The logarithms of filter outputs are computed to obtain the feature sub-vectors of length 60 for each segment.
The sub-vectors are stacked into the final feature matrix, a numpy ndarray of shape (int(N/4800), 60).
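
The first four stages of this pipeline (segmentation, Hann window, per-segment normalization, FFT) can be sketched with numpy alone. This is an illustration and not the module's code: the function name windowed_spectra is hypothetical, and the README does not specify the normalization, the handling of leftover samples or the behaviour for short signals, so peak normalization, dropping the remainder and the single-segment fallback are all assumptions. The wavelet and triangular filter stages are omitted because their parameters are not given.

```python
import numpy as np


def windowed_spectra(signal, base=4800):
    """Return the complex FFT of each windowed, normalized segment.

    Hypothetical sketch; base=4800 corresponds to calculate_fft_wavelet.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n = signal.size
    # Assumption: signals shorter than 2 * base form a single segment.
    n_segments = int(n / base) if n >= 2 * base else 1
    seg_len = int(n / n_segments)

    # Assumption: samples that do not fill the last segment are dropped.
    segments = signal[: n_segments * seg_len].reshape(n_segments, seg_len)

    # Apply a Hann window to every segment.
    segments = segments * np.hanning(seg_len)

    # Assumption: "normalized separately" means per-segment peak normalization.
    peak = np.max(np.abs(segments), axis=1, keepdims=True)
    peak[peak == 0.0] = 1.0  # leave silent segments unchanged
    segments = segments / peak

    # One complex spectrum per segment.
    return np.fft.fft(segments, axis=1)


rng = np.random.default_rng(0)
spectra = windowed_spectra(rng.uniform(-1.0, 1.0, 16000))
print(spectra.shape)  # (3, 5333)
```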
