In [1]:

%matplotlib inline
import numpy, scipy, matplotlib.pyplot as plt, IPython.display as ipd
import librosa, librosa.display
import stanford_mir; stanford_mir.init()


Constant-Q Transform and Chroma

Constant-Q Transform

Unlike the Fourier transform, but similar to the mel scale, the constant-Q transform (Wikipedia) uses a logarithmically spaced frequency axis. For more information, read the original paper:

Let's load a file:

In [2]:

x, sr = librosa.load('audio/simple_piano.wav')
ipd.Audio(x, rate=sr)


To compute a constant-Q spectrogram, we will use librosa.cqt:

In [3]:

fmin = librosa.midi_to_hz(36)
hop_length = 512
C = librosa.cqt(x, sr=sr, fmin=fmin, n_bins=72, hop_length=hop_length)

Display the magnitude of the transform on a decibel scale:

In [4]:

logC = librosa.amplitude_to_db(numpy.abs(C))
plt.figure(figsize=(15, 5))
librosa.display.specshow(logC, sr=sr, x_axis='time', y_axis='cqt_note', fmin=fmin, cmap='coolwarm')

Out[4]:

<matplotlib.axes._subplots.AxesSubplot at 0x115336dd8>

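Note how each frequency bin corresponds to one MIDI pitch number. This is because librosa.cqt uses 12 bins per octave by default, so the 72 bins requested above cover MIDI pitches 36 through 107.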
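The bin-to-pitch correspondence can be checked by hand. The following is a minimal sketch in plain Python, not a librosa call: it assumes 12 bins per octave and the usual tuning of A4 = 440 Hz, and the helper names `midi_to_hz`, `hz_to_midi`, and `cqt_center_frequencies` are written here for illustration only.

```python
import math

def midi_to_hz(midi, a4=440.0):
    """Convert a MIDI pitch number to Hz, assuming A4 (MIDI 69) = 440 Hz."""
    return a4 * 2.0 ** ((midi - 69) / 12.0)

def hz_to_midi(hz, a4=440.0):
    """Convert a frequency in Hz to a (possibly fractional) MIDI pitch number."""
    return 69 + 12 * math.log2(hz / a4)

def cqt_center_frequencies(fmin, n_bins, bins_per_octave=12):
    """Logarithmically spaced center frequencies (illustrative helper)."""
    return [fmin * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]

fmin = midi_to_hz(36)                       # same starting pitch as above
freqs = cqt_center_frequencies(fmin, n_bins=72)
midi_numbers = [round(hz_to_midi(f)) for f in freqs]

print(round(freqs[0], 2), midi_numbers[0], midi_numbers[-1])
# prints: 65.41 36 107
```

Each bin is one semitone above the previous one, so the ratio between neighboring center frequencies is constant and every 12 bins the frequency doubles.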

Chroma

A chroma vector (Wikipedia) (FMP, p. 123) is typically a 12-element feature vector indicating how much energy of each pitch class, {C, C#, D, D#, E, ..., B}, is present in the signal.
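To make the idea of a pitch class concrete, here is a small sketch that folds per-pitch energy into a 12-element vector by taking the MIDI pitch number modulo 12. The function `fold_to_chroma` and the energy values are hypothetical and only illustrate the folding step; librosa's chroma functions work on spectrograms and apply further processing.

```python
PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def fold_to_chroma(pitch_energy):
    """Sum energy over octaves.

    pitch_energy maps a MIDI pitch number to an energy value.
    Returns a 12-element list indexed by pitch class (0 = C).
    """
    chroma = [0.0] * 12
    for midi, energy in pitch_energy.items():
        chroma[midi % 12] += energy
    return chroma

# Made-up energies for a C major triad with the root doubled an octave higher.
pitch_energy = {48: 1.0, 52: 0.5, 55: 0.5, 60: 0.5}
chroma = fold_to_chroma(pitch_energy)

for name, value in zip(PITCH_CLASSES, chroma):
    if value > 0:
        print(name, value)
# prints:
# C 1.5
# E 0.5
# G 0.5
```

The two C notes (MIDI 48 and 60) land in the same element, which is what makes chroma insensitive to the octave in which a note is played.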

In [5]:

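# Chroma computed from a short-time Fourier transform.
# The signal is passed as y= because recent librosa versions require it as a keyword.
chromagram = librosa.feature.chroma_stft(y=x, sr=sr, hop_length=hop_length)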
plt.figure(figsize=(15, 5))
librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', hop_length=hop_length, cmap='coolwarm')

Out[5]:

<matplotlib.axes._subplots.AxesSubplot at 0x114f50898>

In [6]:

# Chroma computed from a constant-Q transform instead of an STFT.
chromagram = librosa.feature.chroma_cqt(y=x, sr=sr, hop_length=hop_length)
plt.figure(figsize=(15, 5))
librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', hop_length=hop_length, cmap='coolwarm')

Out[6]:

<matplotlib.axes._subplots.AxesSubplot at 0x115264780>

Chroma energy normalized statistics (CENS) (FMP, p. 375) are a variant of chroma features. The main idea of CENS features is that taking statistics over large windows smooths local deviations in tempo, articulation, and musical ornaments such as trills and arpeggiated chords. CENS features are best suited to tasks such as audio matching and similarity.
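The smoothing effect can be illustrated with a rough sketch of the usual CENS steps: normalize each chroma frame, quantize it, average over a window, and normalize again. This is not librosa's implementation. The function name `cens_sketch`, the quantization thresholds, and the window length are assumptions chosen for illustration, a plain moving average stands in for the smoothing window that real implementations typically use, and the downsampling step is omitted.

```python
import math

def cens_sketch(chroma_frames, win_len=5):
    """Simplified CENS-style smoothing (illustrative only).

    chroma_frames: list of 12-element lists, one per time frame.
    win_len: number of frames in the moving-average window (assumed value).
    """
    thresholds = (0.05, 0.1, 0.2, 0.4)  # assumed quantization steps

    # 1. L1-normalize each frame, then quantize to integer levels 0..4.
    quantized = []
    for frame in chroma_frames:
        total = sum(abs(v) for v in frame)
        normed = [v / total if total > 0 else 0.0 for v in frame]
        quantized.append([sum(v >= t for t in thresholds) for v in normed])

    # 2. Average over a window of frames, then 3. L2-normalize the result.
    half = win_len // 2
    out = []
    for t in range(len(quantized)):
        lo, hi = max(0, t - half), min(len(quantized), t + half + 1)
        window = quantized[lo:hi]
        avg = [sum(f[c] for f in window) / len(window) for c in range(12)]
        norm = math.sqrt(sum(v * v for v in avg))
        out.append([v / norm if norm > 0 else 0.0 for v in avg])
    return out

# Made-up input: a sustained C with a one-frame ornament on C# at frame 4.
frames = [[1.0] + [0.0] * 11 for _ in range(9)]
frames[4] = [0.0, 1.0] + [0.0] * 10

cens = cens_sketch(frames)
print(cens[4][0] > cens[4][1])
# prints: True
```

In the raw input, frame 4 contains only C#. After smoothing, C dominates that frame as well, which is the kind of robustness to short ornaments that the paragraph above describes.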

To compute CENS features, use librosa.feature.chroma_cens:

In [7]:

# CENS features; the signal is passed as y= for compatibility with recent librosa versions.
chromagram = librosa.feature.chroma_cens(y=x, sr=sr, hop_length=hop_length)
plt.figure(figsize=(15, 5))
librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', hop_length=hop_length, cmap='coolwarm')

Out[7]:

<matplotlib.axes._subplots.AxesSubplot at 0x11510b160>
