Skip to content

Latest commit

 

History

History
245 lines (142 loc) · 152 KB

kmeans.md

File metadata and controls

245 lines (142 loc) · 152 KB

In [1]:

import numpy, scipy, matplotlib.pyplot as plt, sklearn, librosa, mir_eval, IPython.display, urllib
plt.rcParams['figure.figsize'] = (14, 4)

← Back to Index

K-Means Clustering

Sometimes, an unsupervised learning technique is preferred. Perhaps you do not have access to adequate training data, or perhaps the training data's labels are not completely clear. Maybe you just want to quickly sort real-world, unseen, data into groups based on its feature similarity.

In such cases, clustering is a great option!

Download an audio file:

In [2]:

filename = '125_bounce.wav'
urllib.urlretrieve('http://audio.musicinformationretrieval.com/' + filename,
                   filename=filename)

Out[2]:

('125_bounce.wav', <httplib.HTTPMessage instance at 0x1138a6c20>)

Play the audio file:

In [3]:

IPython.display.Audio(filename)

Out[3]:

Your browser does not support the audio element.

Load the audio file into an array.

In [4]:

x, fs = librosa.load(filename)
print fs

22050

Plot audio signal:

In [5]:

librosa.display.waveplot(x, fs)

Out[5]:

<matplotlib.collections.PolyCollection at 0x115623690>

Onset Detection

Detect onsets:

In [6]:

onset_frames = librosa.onset.onset_detect(x, sr=fs, delta=0.04, wait=4)
onset_times = librosa.frames_to_time(onset_frames, sr=fs)
onset_samples = librosa.frames_to_samples(onset_frames)

Listen to detected onsets:

In [7]:

x_with_beeps = mir_eval.sonify.clicks(onset_times, fs, length=len(x))
IPython.display.Audio(x + x_with_beeps, rate=fs)

Out[7]:

Your browser does not support the audio element.

Feature Extraction

Let's compute the zero crossing rate and energy for each detected onset.

Plot the zero crossing rate:

In [8]:

def extract_features(x, fs):
    zcr = librosa.zero_crossings(x).sum()
    energy = scipy.linalg.norm(x)
    return [zcr, energy]

In [9]:

frame_sz = fs*0.090
features = numpy.array([extract_features(x[i:i+frame_sz], fs) for i in onset_samples])
print features.shape

(36, 2)

Feature Scaling

Scale the features (using the scale function) from -1 to 1.

In [10]:

min_max_scaler = sklearn.preprocessing.MinMaxScaler(feature_range=(-1, 1))
features_scaled = min_max_scaler.fit_transform(features)
print features_scaled.shape
print features_scaled.min(axis=0)
print features_scaled.max(axis=0)

(36, 2)
[-1. -1.]
[ 1.  1.]

Plot the features.

In [11]:

plt.scatter(features_scaled[:,0], features_scaled[:,1])
plt.xlabel('Zero Crossing Rate (scaled)')
plt.ylabel('Spectral Centroid (scaled)')

Out[11]:

<matplotlib.text.Text at 0x115603410>

Using K-Means

Time to cluster! Let's initialize the algorithm to find three clusters.

In [12]:

model = sklearn.cluster.KMeans(n_clusters=2)
labels = model.fit_predict(features_scaled)
print labels

[0 0 1 1 1 0 1 0 1 1 0 1 1 1 1 0 1 0 1 1 0 1 1 1 1 0 1 0 1 0 1 1 1 1 0 0]

Plot the results.

In [13]:

plt.scatter(features_scaled[labels==0,0], features_scaled[labels==0,1], c='b')
plt.scatter(features_scaled[labels==1,0], features_scaled[labels==1,1], c='r')
plt.xlabel('Zero Crossing Rate (scaled)')
plt.ylabel('Energy (scaled)')
plt.legend(('Class 0', 'Class 1'))

Out[13]:

<matplotlib.legend.Legend at 0x115c3ded0>

Listen to onsets assigned to Class 0:

In [14]:

x_with_beeps = mir_eval.sonify.clicks(onset_times[labels==0], fs, length=len(x))
IPython.display.Audio(x + x_with_beeps, rate=fs)

Out[14]:

Your browser does not support the audio element.

Class 1:

In [15]:

x_with_beeps = mir_eval.sonify.clicks(onset_times[labels==1], fs, length=len(x))
IPython.display.Audio(x + x_with_beeps, rate=fs)

Out[15]:

Your browser does not support the audio element.

Affinity Propagation

In scikit-learn, other clustering algorithms such as affinity propagation can cluster without defining the number of clusters beforehand.

All we need to do is swap out KMeans for AffinityPropagation:

In [16]:

model = sklearn.cluster.AffinityPropagation()
labels = model.fit_predict(features_scaled)
print labels

[0 0 2 1 2 0 1 0 2 2 0 1 2 1 2 0 1 0 2 2 0 1 2 1 2 0 1 0 2 0 2 1 2 2 0 0]

Plot features:

In [17]:

plt.scatter(features_scaled[labels==0,0], features_scaled[labels==0,1], c='b')
plt.scatter(features_scaled[labels==1,0], features_scaled[labels==1,1], c='r')
plt.scatter(features_scaled[labels==2,0], features_scaled[labels==2,1], c='y')
plt.xlabel('Zero Crossing Rate (scaled)')
plt.ylabel('Energy (scaled)')
plt.legend(('Class 0', 'Class 1', 'Class 2'))

Out[17]:

<matplotlib.legend.Legend at 0x115c81590>

Play a beep upon each frame in the same cluster:

Class 0:

In [18]:

x_with_beeps = mir_eval.sonify.clicks(onset_times[labels==0], fs, length=len(x))
IPython.display.Audio(x + x_with_beeps, rate=fs)

Out[18]:

Your browser does not support the audio element.

Class 1:

In [19]:

x_with_beeps = mir_eval.sonify.clicks(onset_times[labels==1], fs, length=len(x))
IPython.display.Audio(x + x_with_beeps, rate=fs)

Out[19]:

Your browser does not support the audio element.

Class 2:

In [20]:

x_with_beeps = mir_eval.sonify.clicks(onset_times[labels==2], fs, length=len(x))
IPython.display.Audio(x + x_with_beeps, rate=fs)

Out[20]:

Your browser does not support the audio element.

← Back to Index