This repository is a study of digital signal processing (DSP) techniques for audio signals
DSP is used here to create features for machine learning (ML):
- We look at creating features for traditional ML, for example an SVM classifier
- We also look at creating features for deep learning techniques such as MLP, CNN, RNN and LSTM
For traditional ML feature engineering, we look at time-domain, frequency-domain and time-frequency features
Time Domain features:
- A-D-S-R model: Attack-Decay-Sustain-Release model for audio
- Amplitude envelope (AE)
- Root-mean-square Energy (RMS-E)
- Zero-crossing rate (ZCR)
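As an illustration of the frame-level time-domain features listed above, here is a minimal NumPy sketch. The function names, the frame size of 1024 and the hop length of 512 are assumptions made for this example and are not taken from the repository's code.

```python
import numpy as np

def frame_signal(signal, frame_size=1024, hop_length=512):
    """Split a 1-D signal into overlapping frames (the tail is zero-padded)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, int(np.ceil((len(signal) - frame_size) / hop_length)))
    padded_len = (n_frames - 1) * hop_length + frame_size
    padded = np.pad(signal, (0, padded_len - len(signal)))
    idx = np.arange(frame_size)[None, :] + hop_length * np.arange(n_frames)[:, None]
    return padded[idx]

def amplitude_envelope(frames):
    """AE: maximum absolute amplitude in each frame."""
    return np.max(np.abs(frames), axis=1)

def rms_energy(frames):
    """RMS-E: square root of the mean squared amplitude in each frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))

def zero_crossing_rate(frames):
    """ZCR: fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

# Example input (assumed): one second of a 440 Hz tone at 22050 Hz
sr = 22050
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)

frames = frame_signal(tone)
ae = amplitude_envelope(frames)
rms = rms_energy(frames)
zcr = zero_crossing_rate(frames)
```

Each function returns one value per frame, so the three arrays can be stacked into a feature matrix for a classifier such as an SVM.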
Frequency domain and time-frequency features:
- FFT - Fast Fourier Transform
- STFT - Short Time Fourier Transform
- MFCC - Mel Frequency Cepstral Coefficients
- MFCC extraction: in the usual formulation, window the signal, apply the DFT, map the resulting spectrum onto the Mel scale with a filterbank, take the log of the filterbank energies, and then apply the DCT.
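The MFCC pipeline can be sketched in plain NumPy. This is a minimal illustration of the commonly used order (window, DFT, mel filterbank, log, DCT), not the repository's implementation. The helper names, the Hann window, the mel formula variant and the default parameters (`n_fft=1024`, `hop_length=512`, `n_mels=40`, `n_mfcc=13`) are all assumptions. The windowed DFT in steps 1 and 2 is also what an STFT computes.

```python
import numpy as np

def hz_to_mel(hz):
    # One common mel formula (assumed here); other variants exist
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fbank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, center, hi = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fbank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def dct_ii(x, n_out):
    """Orthonormal DCT-II along the last axis, keeping the first n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    scale = np.full((n_out, 1), np.sqrt(2.0 / n))
    scale[0, 0] = np.sqrt(1.0 / n)
    return x @ (basis * scale).T

def mfcc(signal, sr, n_fft=1024, hop_length=512, n_mels=40, n_mfcc=13):
    """Return an (n_frames, n_mfcc) array of MFCCs."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < n_fft:
        signal = np.pad(signal, (0, n_fft - len(signal)))
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    idx = np.arange(n_fft)[None, :] + hop_length * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)                  # 1. window each frame
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2          # 2. DFT, then power spectrum
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sr).T  # 3. mel filterbank
    log_mel = np.log(mel_energy + 1e-10)                      # 4. log (offset avoids log(0))
    return dct_ii(log_mel, n_mfcc)                            # 5. DCT, keep low-order terms

# Example input (assumed): one second of a 440 Hz tone at 22050 Hz
sr = 22050
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
coeffs = mfcc(tone, sr)
```

Audio libraries typically provide a tuned MFCC routine. Writing the steps out by hand is mainly useful for seeing where each stage of the description fits.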
For deep learning, we convert the MFCCs to images and use those images as model input
We build a CNN using TensorFlow Keras
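One possible way to turn an MFCC matrix into an image-like array for a CNN is sketched below. The function name `mfcc_to_image`, the fixed width of 64 frames and the min-max scaling are assumptions for illustration, and the repository may convert MFCCs to images differently. The sketch assumes coefficients along rows and time along columns, so transpose the input if your extractor returns the opposite layout.

```python
import numpy as np

def mfcc_to_image(mfcc, n_frames=64):
    """Convert an (n_coeffs, time) MFCC matrix to an (n_coeffs, n_frames, 1) float32 array in [0, 1]."""
    mfcc = np.asarray(mfcc, dtype=np.float32)
    fixed = mfcc[:, :n_frames]                      # crop clips that are too long
    lo, hi = fixed.min(), fixed.max()
    # Min-max scale to [0, 1]; a constant input maps to all zeros
    scaled = (fixed - lo) / (hi - lo) if hi > lo else np.zeros_like(fixed)
    pad = n_frames - scaled.shape[1]
    if pad > 0:
        scaled = np.pad(scaled, ((0, 0), (0, pad)))  # zero-pad clips that are too short
    return scaled[..., np.newaxis]                  # add a single channel axis

# Example with random stand-in data (not real MFCCs)
rng = np.random.default_rng(0)
image = mfcc_to_image(rng.normal(size=(13, 42)))
batch = np.stack([image, mfcc_to_image(rng.normal(size=(13, 100)))])
```

The resulting batch has shape `(batch, height, width, channels)`, which is the default channels-last layout that Keras `Conv2D` layers expect.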
To-Do: Build RNN, LSTM for DSP
For additional technical notes please see:
- 20-May-2021
- Attempt to train to recognize faulty valves
- Unseen data will be from another folder