A slow version of the Shazam algorithm implemented in Python.
In my opinion, this algorithm is extremely elegant. It starts with the simple question "How do you search audio?", but the answer must also handle noise, work from a short sample, and be lightning fast.
The algorithm meets these constraints by filtering, windowing, and FFT-ing the audio to generate a compact fingerprint.
Importantly, it doesn't use neural nets or "machine learning".
- Read in audio file
- Convert from stereo to mono
- Low-pass filter (Butterworth)
- Downsample
- Hamming window in 0.1s intervals
- FFT and sort into (logarithmic) bins
- Save "loudest" frequencies into spectrogram
- Create an ordering of the points
- For each "target" point, calculate its distance from the neighboring cluster
- Save these as keys in a dict pointing to the songID
- Count matches and normalize the results
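The spectrogram steps above can be sketched as follows, assuming numpy and scipy; the cutoff frequency, downsampling factor, and bin edges here are illustrative, not the exact values this project uses:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def fingerprint_spectrogram(samples, rate, window_s=0.1, cutoff_hz=5000, factor=4):
    """Low-pass, downsample, Hamming-window, FFT, keep the loudest frequency per bin."""
    # Butterworth low-pass to avoid aliasing before downsampling
    b, a = butter(4, cutoff_hz / (rate / 2), btype="low")
    filtered = filtfilt(b, a, samples)
    # Downsample by an integer factor
    down = filtered[::factor]
    new_rate = rate // factor
    win_len = int(window_s * new_rate)
    window = np.hamming(win_len)
    # Logarithmic bin edges in Hz (illustrative)
    edges = [0, 40, 80, 160, 320, 640, new_rate // 2]
    peaks = []
    for start in range(0, len(down) - win_len, win_len):
        chunk = down[start:start + win_len] * window
        mags = np.abs(np.fft.rfft(chunk))
        freqs = np.fft.rfftfreq(win_len, d=1 / new_rate)
        # Keep the loudest frequency in each logarithmic bin
        row = []
        for lo, hi in zip(edges, edges[1:]):
            mask = (freqs >= lo) & (freqs < hi)
            if mask.any():
                row.append(freqs[mask][np.argmax(mags[mask])])
        peaks.append(row)
    return np.array(peaks)
```

Fed a pure 440 Hz tone, every window's 320-640 Hz bin reports a peak near 440 Hz, which is the property the fingerprint relies on.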
- Use any window other than a rectangular one to reduce spectral leakage
- The frequency response of the human ear peaks around 3000 Hz. We split the spectrum into 6 frequency bins
- We take the magnitude of each frequency component, discarding phase
- Lookup cost cannot scale with the size of the database, so you can't just cross-correlate the sample against every song
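A quick numeric illustration of the spectral-leakage point above, assuming numpy: a tone that falls between FFT bins smears energy across the whole spectrum under a rectangular window, but far less under a Hamming window.

```python
import numpy as np

n = 1024
rate = 1000.0
t = np.arange(n) / rate
# A tone that does not land exactly on an FFT bin, so it leaks
sig = np.sin(2 * np.pi * 123.4 * t)

rect = np.abs(np.fft.rfft(sig))                 # rectangular (no window)
hamm = np.abs(np.fft.rfft(sig * np.hamming(n))) # Hamming window

def sidelobe_fraction(mags, peak_width=3):
    """Fraction of spectral energy outside the main lobe around the peak."""
    k = int(np.argmax(mags))
    main = slice(max(k - peak_width, 0), k + peak_width + 1)
    total = np.sum(mags ** 2)
    return 1 - np.sum(mags[main] ** 2) / total

# The Hamming window pushes far more energy into the main lobe
print(sidelobe_fraction(rect), sidelobe_fraction(hamm))
```

Less leakage means the "loudest frequency per bin" step picks out genuine peaks instead of smeared energy from neighboring tones.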
Run the tests with pytest
- Add about 50 songs to the database folder
- Convert songs to wav using ffmpeg
- Build database
- Obtain a 10-second sample (noise optional)
- Create frequencies array, spectrogram
- Count hits, make pairs
- Analyze hits
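The hit-counting and scoring steps can be sketched like this, assuming a database dict mapping fingerprint hashes to song IDs (the real keys are hashes of anchor/target point pairs; this toy version is simplified, and all names are illustrative):

```python
from collections import Counter

def match_sample(sample_hashes, database, song_hash_counts):
    """Count database hits per song, then normalize by each song's hash count.

    database: dict mapping fingerprint hash -> list of song IDs.
    song_hash_counts: dict mapping song ID -> total hashes stored for it.
    """
    hits = Counter()
    for h in sample_hashes:
        for song in database.get(h, []):
            hits[song] += 1
    # Normalize so long songs don't win just by having more stored hashes
    scores = {song: n / song_hash_counts[song] for song, n in hits.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy example: hash 999 is in the sample but not the database (noise)
db = {101: ["oblivion"], 202: ["oblivion", "jessiesgirl"], 303: ["jessiesgirl"]}
counts = {"oblivion": 10, "jessiesgirl": 40}
print(match_sample([101, 202, 999], db, counts))
# → [('oblivion', 0.2), ('jessiesgirl', 0.025)]
```

Unmatched hashes from noise simply score zero, which is why the algorithm degrades gracefully on noisy samples.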
The following output correctly identifies the "oblivion" sample with added noise:
[(('oblivion', 'lilypotter'), 0.02227589908749329),
(('ijustcalledtosayiloveyou', 'steviewonder'), 0.016269960831575777),
(('jessiesgirl', 'rickspringfield'), 0.011674641148325358),
(('canttakemyeyesoffofyou', 'frankievalli'), 0.010076185795035636),
(('ilikethat', 'janellemonae'), 0.009678668215253582)]
- https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
- http://coding-geek.com/how-shazam-works/
- 3Blue1Brown's video on the Fourier transform
- Lots and lots of Wikipedia