features

github-actions[bot] edited this page May 25, 2025 · 1 revision

Speech Features Extraction

See feature_extraction.py for more detail.

Speech features are extracted from the signal using four parameters: sample_rate, frame_ms (frame length in milliseconds), stride_ms (stride between frames in milliseconds) and num_feature_bins.
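As a rough illustration of how the millisecond parameters map to sample counts, here is a minimal sketch. The helper names `frame_params` and `num_frames` are hypothetical and not taken from feature_extraction.py, and the frame count assumes no padding, so the repository may produce a different number of frames.

```python
def frame_params(sample_rate, frame_ms, stride_ms):
    """Convert millisecond settings into sample counts (hypothetical helper)."""
    frame_length = int(round(sample_rate * frame_ms / 1000))
    frame_step = int(round(sample_rate * stride_ms / 1000))
    return frame_length, frame_step


def num_frames(num_samples, frame_length, frame_step):
    """Number of full frames, assuming the signal is not padded."""
    if num_samples < frame_length:
        return 0
    return 1 + (num_samples - frame_length) // frame_step


# One second of 16 kHz audio with 25 ms frames and a 10 ms stride.
frame_length, frame_step = frame_params(sample_rate=16000, frame_ms=25, stride_ms=10)
T = num_frames(num_samples=16000, frame_length=frame_length, frame_step=frame_step)
print(frame_length, frame_step, T)  # 400 160 98
```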

Speech features have the shape (B, T, num_feature_bins, num_channels) and contain from 1 to 4 channels:

1. Spectrogram, Log Mel Spectrogram, Log Gammatone Spectrogram or MFCCs
2. TODO: Delta features: like librosa.feature.delta, computed from the features extracted on channel 1.
3. TODO: Delta-delta features: like librosa.feature.delta with order=2, computed from the features extracted on channel 1.
4. TODO: Pitch features: like librosa.core.piptrack, computed from the signal.
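To make the channel layout concrete, the sketch below stacks a base feature and two derived channels into the (B, T, num_feature_bins, num_channels) shape. It is only an illustration: the delta channels are marked TODO above, and `simple_delta` is a hypothetical stand-in based on `np.gradient`, which is not numerically identical to librosa.feature.delta.

```python
import numpy as np


def simple_delta(features, axis=1):
    """Finite-difference estimate along the time axis (hypothetical stand-in)."""
    return np.gradient(features, axis=axis)


B, T, F = 2, 50, 80  # batch size, frames, num_feature_bins (example values)
rng = np.random.default_rng(0)
base = rng.standard_normal((B, T, F)).astype(np.float32)  # channel 1

delta = simple_delta(base)    # would become channel 2
delta2 = simple_delta(delta)  # would become channel 3

# Stack the channels on a new last axis.
features = np.stack([base, delta, delta2], axis=-1)
print(features.shape)  # (2, 50, 80, 3)
```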

Implementation in TensorFlow Keras layers

Spectrogram
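The computation behind a spectrogram can be sketched in plain NumPy as framing, windowing and a real FFT. This is not the repository's Keras layer; the function name, the Hann window, the FFT length of 512 and the absence of padding are all assumptions made for illustration.

```python
import numpy as np


def power_spectrogram(signal, frame_length=400, frame_step=160, fft_length=512):
    """Return a (T, fft_length // 2 + 1) power spectrogram of a 1-D signal.

    Assumes len(signal) >= frame_length and drops trailing samples
    that do not fill a whole frame.
    """
    n = 1 + (len(signal) - frame_length) // frame_step
    # Row i of idx holds the sample indices of frame i.
    idx = np.arange(frame_length)[None, :] + frame_step * np.arange(n)[:, None]
    frames = signal[idx] * np.hanning(frame_length)
    spectrum = np.fft.rfft(frames, n=fft_length, axis=-1)
    return np.abs(spectrum) ** 2


# One second of a 1 kHz sine sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 1000 * t)

spec = power_spectrogram(signal)
peak_bin = int(spec.mean(axis=0).argmax())
print(spec.shape)                     # (98, 257)
print(peak_bin * sample_rate / 512)   # 1000.0
```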

Log Mel Spectrogram
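A log mel spectrogram applies a bank of triangular mel-spaced filters to a power spectrogram and takes the logarithm. The NumPy sketch below assumes the common 2595 * log10(1 + f / 700) mel scale, a natural logarithm and a small `eps` floor; the repository's layer may use different conventions, and all function names here are hypothetical.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_mel_bins, num_fft_bins, sample_rate):
    """Triangular filters with shape (num_fft_bins, num_mel_bins)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2, num_fft_bins)
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), num_mel_bins + 2)
    hz_edges = mel_to_hz(mel_edges)
    lower, center, upper = hz_edges[:-2], hz_edges[1:-1], hz_edges[2:]
    rising = (fft_freqs[:, None] - lower) / (center - lower)
    falling = (upper - fft_freqs[:, None]) / (upper - center)
    # Each filter rises from lower to center, falls to upper, and is zero elsewhere.
    return np.maximum(0.0, np.minimum(rising, falling))


def log_mel(power_spec, sample_rate, num_mel_bins, eps=1e-10):
    """Map (..., num_fft_bins) power values to (..., num_mel_bins) log energies."""
    weights = mel_filterbank(num_mel_bins, power_spec.shape[-1], sample_rate)
    return np.log(power_spec @ weights + eps)


# Random stand-in for a (T, num_fft_bins) power spectrogram.
rng = np.random.default_rng(0)
power_spec = rng.random((98, 257))
features = log_mel(power_spec, sample_rate=16000, num_mel_bins=80)
print(features.shape)  # (98, 80)
```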

MFCCs
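MFCCs are conventionally obtained by applying a DCT-II along the mel axis of a log mel spectrogram and keeping the first few coefficients. The sketch below uses an orthonormal DCT-II built as a matrix in NumPy; the scaling convention and the function names are assumptions and may differ from the repository's layer.

```python
import numpy as np


def dct_ii_matrix(num_inputs, num_outputs):
    """Orthonormal DCT-II basis with shape (num_inputs, num_outputs)."""
    n = np.arange(num_inputs)[:, None]
    k = np.arange(num_outputs)[None, :]
    basis = np.cos(np.pi * (n + 0.5) * k / num_inputs)
    scale = np.full(num_outputs, np.sqrt(2.0 / num_inputs))
    scale[0] = np.sqrt(1.0 / num_inputs)
    return basis * scale


def mfcc_from_log_mel(log_mel, num_coefficients):
    """Project (..., num_mel_bins) log mel features onto the first DCT-II bases."""
    return log_mel @ dct_ii_matrix(log_mel.shape[-1], num_coefficients)


# A flat log mel input has energy only in coefficient 0.
log_mel = np.ones((98, 80))
mfcc = mfcc_from_log_mel(log_mel, num_coefficients=13)
print(mfcc.shape)  # (98, 13)
```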

Log Gammatone Spectrogram