Magnitude Scaling

Often, the raw amplitude of a signal in the time or frequency domain is less perceptually relevant to humans than the amplitude converted into other units, e.g. using a logarithmic scale.

For example, let's consider a pure tone whose amplitude grows linearly. Define the time variable:

In [3]:

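import numpy, librosa, librosa.display
import matplotlib.pyplot as plt
import IPython.display as ipd
T = 4.0      # duration in seconds
sr = 22050   # sampling rate in Hertz
t = numpy.linspace(0, T, int(T*sr), endpoint=False)  # sample times in seconds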

Create a signal whose amplitude grows linearly:

In [4]:

amplitude = numpy.linspace(0, 1, int(T*sr), endpoint=False) # time-varying amplitude
x = amplitude*numpy.sin(2*numpy.pi*440*t)

Listen:

In [5]:

ipd.Audio(x, rate=sr)

Plot the signal:

In [6]:

librosa.display.waveplot(x, sr=sr)  # waveplot was removed in librosa 0.10; use librosa.display.waveshow there

Out[6]:

<matplotlib.collections.PolyCollection at 0x111179198>

Now consider a signal whose amplitude grows exponentially, i.e. the logarithm of the amplitude is linear:

In [7]:

amplitude = numpy.logspace(-2, 0, int(T*sr), endpoint=False, base=10.0)
x = amplitude*numpy.sin(2*numpy.pi*440*t)

In [8]:

ipd.Audio(x, rate=sr)

In [9]:

librosa.display.waveplot(x, sr=sr)  # waveplot was removed in librosa 0.10; use librosa.display.waveshow there

Out[9]:

<matplotlib.collections.PolyCollection at 0x1111cdfd0>

Even though the amplitude grows exponentially, the increase in loudness sounds more gradual to us. This phenomenon is an example of the Weber-Fechner law (Wikipedia), which states that perceived intensity is roughly proportional to the logarithm of the stimulus intensity.
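To make the comparison concrete, here is a minimal sketch that expresses both amplitude envelopes in decibels. The helper `to_db` and its `floor` parameter are hypothetical names introduced only for this illustration; the envelopes are rebuilt with the same `numpy.linspace` and `numpy.logspace` calls used above.

```python
import numpy

T = 4.0      # duration in seconds
sr = 22050   # sampling rate in Hertz
n = int(T * sr)

# The two amplitude envelopes from the cells above.
linear_amp = numpy.linspace(0, 1, n, endpoint=False)
exp_amp = numpy.logspace(-2, 0, n, endpoint=False, base=10.0)

def to_db(a, floor=1e-10):
    """Hypothetical helper: amplitude to decibels, floored to avoid log10(0)."""
    return 20 * numpy.log10(numpy.maximum(a, floor))

linear_db = to_db(linear_amp)
exp_db = to_db(exp_amp)

# The exponential envelope rises by a constant number of decibels per second
# (40 dB over 4 seconds).
exp_rate = numpy.diff(exp_db) * sr
print(exp_rate.min(), exp_rate.max())

# The linear envelope gains only about 6 dB over the entire second half.
print(linear_db[-1] - linear_db[n // 2])
```

On a decibel scale, the exponential ramp is a straight line, while the linear ramp does most of its rising at the very beginning and then flattens out.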

Spectrogram Visualization: Linear Amplitude

Let's plot a magnitude spectrogram where the colorbar is a linear function of the spectrogram values, i.e. just plot the raw values.

In [10]:

x, sr = librosa.load('audio/latin_groove.mp3', duration=8)
ipd.Audio(x, rate=sr)

In [11]:

X = librosa.stft(x)
X.shape

Out[11]:

(1025, 345)
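The shape above can be reproduced with a little arithmetic. This sketch assumes the default `librosa.stft` parameters (`n_fft=2048`, a hop of `n_fft // 4`, centered frames), the default `librosa.load` sampling rate of 22050 Hz, and a file at least 8 seconds long; none of these are stated explicitly in the cells above.

```python
# Assumed defaults; the original cells do not set them explicitly.
sr = 22050
duration = 8
n_fft = 2048
hop_length = n_fft // 4

n_samples = int(duration * sr)

# One row per non-negative frequency bin of a real-valued FFT.
n_bins = 1 + n_fft // 2

# With centered frames, there is one frame per hop plus one.
n_frames = 1 + n_samples // hop_length

print((n_bins, n_frames))  # (1025, 345)
```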

Raw amplitude:

In [12]:

Xmag = abs(X)
librosa.display.specshow(Xmag, sr=sr, x_axis='time', y_axis='log')
plt.colorbar()

Out[12]:

<matplotlib.colorbar.Colorbar at 0x114f156a0>

Spectrogram Visualization: Log Amplitude

Now let's plot a magnitude spectrogram where the colorbar is a logarithmic function of the spectrogram values.

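The decibel (Wikipedia) is a logarithmic unit; for an amplitude ratio, the level in decibels is 20 times the base-10 logarithm of the ratio.

librosa.amplitude_to_db converts an amplitude spectrogram to decibels: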

In [13]:

Xdb = librosa.amplitude_to_db(Xmag)
librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='log')
plt.colorbar()

Out[13]:

<matplotlib.colorbar.Colorbar at 0x110fa1780>
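For intuition, the conversion can be sketched in plain NumPy. The function `amplitude_to_db_sketch` below is a hypothetical stand-in written for this note, not librosa's implementation; its parameter names and defaults (`ref`, `amin`, `top_db`) are chosen to mirror what `librosa.amplitude_to_db` documents, but treat that correspondence as an assumption.

```python
import numpy

def amplitude_to_db_sketch(S, ref=1.0, amin=1e-5, top_db=80.0):
    """Hypothetical sketch: 20*log10(|S| / ref), with a floor and a dynamic-range limit."""
    magnitude = numpy.abs(S)
    # Floor the magnitudes so that silence does not map to -infinity.
    db = 20.0 * numpy.log10(numpy.maximum(amin, magnitude))
    # Express the result relative to the reference amplitude.
    db = db - 20.0 * numpy.log10(numpy.maximum(amin, ref))
    # Keep only the top `top_db` decibels below the peak.
    if top_db is not None:
        db = numpy.maximum(db, db.max() - top_db)
    return db

example = numpy.array([1.0, 0.1, 0.01, 0.0])
example_db = amplitude_to_db_sketch(example)
print(example_db)
```

Each factor of 10 in amplitude corresponds to 20 dB, and the silent entry is clipped to 80 dB below the peak rather than mapped to negative infinity.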

One common variant is the $\log (1 + \lambda x)$ function, sometimes known as logarithmic compression (FMP, p. 125). This function behaves like $y = \lambda x$ when $\lambda x$ is small, and like $y = \log (\lambda x)$ when $\lambda x$ is large. The example below uses $\lambda = 10$ and a base-10 logarithm.

In [14]:

Xmag = numpy.log10(1+10*abs(X))
librosa.display.specshow(Xmag, sr=sr, x_axis='time', y_axis='log', cmap="gray_r")
plt.colorbar()

Out[14]:

<matplotlib.colorbar.Colorbar at 0x111058710>
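The two regimes of logarithmic compression can be checked numerically. This is a small sketch using the natural logarithm, for which the small-value approximation $\log(1 + \lambda x) \approx \lambda x$ holds directly; with `numpy.log10`, as in the cell above, the small-value behavior is scaled by $1 / \ln 10$. The variable names are chosen for this illustration only.

```python
import numpy

lam = 10.0

# Inputs chosen so that lam * x is much smaller, or much larger, than 1.
small = numpy.array([1e-5, 1e-4, 1e-3]) / lam
large = numpy.array([1e3, 1e4, 1e5]) / lam

compressed_small = numpy.log(1 + lam * small)
compressed_large = numpy.log(1 + lam * large)

# Small inputs: nearly linear, log(1 + lam*x) is close to lam*x.
print(compressed_small / (lam * small))

# Large inputs: nearly logarithmic, log(1 + lam*x) is close to log(lam*x).
print(compressed_large / numpy.log(lam * large))
```

Both printed ratios should be close to 1, which is why the compression leaves quiet components nearly untouched in shape while strongly reducing the dominance of loud ones.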

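Spectrogram Visualization: Perceptual Weighting

librosa.perceptual_weighting applies a frequency-dependent perceptual weighting to a power spectrogram and returns the result in decibels. It needs the center frequency of each spectrogram row: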

In [15]:

freqs = librosa.fft_frequencies(sr=sr)  # center frequency of each STFT bin (default n_fft=2048)

In [16]:

Xmag = librosa.perceptual_weighting(abs(X)**2, freqs)  # expects a power spectrogram, hence abs(X)**2
librosa.display.specshow(Xmag, sr=sr, x_axis='time', y_axis='log')
plt.colorbar()

Out[16]:

<matplotlib.colorbar.Colorbar at 0x11a184400>
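As an illustration of what a perceptual weighting curve looks like, the sketch below evaluates the standard A-weighting formula in plain NumPy. To my understanding, A-weighting is what `librosa.perceptual_weighting` applies by default, but that is an assumption here; `a_weighting_db` is a hypothetical helper written for this note, not a librosa function.

```python
import numpy

def a_weighting_db(freqs):
    """Hypothetical helper: standard A-weighting gain in dB for frequencies in Hz (f > 0)."""
    f2 = numpy.asarray(freqs, dtype=float) ** 2
    numerator = 12194.0**2 * f2**2
    denominator = (
        (f2 + 20.6**2)
        * numpy.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    # The +2.0 dB offset normalizes the curve to roughly 0 dB at 1 kHz.
    return 20.0 * numpy.log10(numerator / denominator) + 2.0

# Low frequencies are attenuated strongly; the curve is near 0 dB around 1 kHz.
for f in (100.0, 1000.0, 10000.0):
    print(f, round(float(a_weighting_db(f)), 1))
```

Adding such a curve (in dB) to each row of a decibel-scaled power spectrogram de-emphasizes frequencies to which human hearing is less sensitive.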
