Audio Mezzanine

Nicholas Frame edited this page May 21, 2021 · 1 revision

Scope

  • Testing CTA-5003 8.2 Sequential Track Playback.
    • Optimized for testing section 8.2.5 requirements.
    • May be extended in future to also serve as base for other tests.

Audio mezzanine generation

Reference audio mezzanine files are generated using a Python script (audiomezz.py).

  • Can be identically reproduced in the observation framework using the same parametrization of the random noise generation function.
    • Samples are generated from a uniform distribution using the NumPy PCG-64 BitGenerator.
    • By default, each character of the output filename is converted to its Unicode code value, the decimal values are concatenated, and the resulting digit string is converted to an integer that is used as the seed for the PRNG. A different text string can be provided using the -s or --seed parameter, and will be used instead of the filename.
    • For example: seed = 11610111511695494895504611997118 for an output mezzanine file named test_10_2.wav.
  • Not necessary to supply generated noise files to the test setup, as they can be recreated using a known seed and PRNG.
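
The seeding scheme described above can be sketched as follows. This is a minimal illustration, not the code in audiomezz.py: the function names seed_from_text and generate_noise, the [-1, 1) amplitude range, and the sample count are assumptions made for the example. Only the code value concatenation and the PCG-64 BitGenerator come from the description above.

```python
import numpy as np


def seed_from_text(text: str) -> int:
    """Concatenate the decimal Unicode code value of each character into one integer."""
    return int("".join(str(ord(ch)) for ch in text))


def generate_noise(seed: int, num_samples: int) -> np.ndarray:
    """Draw uniformly distributed samples from a PCG-64 generator seeded with `seed`.

    The [-1, 1) range is an assumption for this sketch.
    """
    rng = np.random.Generator(np.random.PCG64(seed))
    return rng.uniform(-1.0, 1.0, num_samples)


seed = seed_from_text("test_10_2.wav")
print(seed)  # 11610111511695494895504611997118

# Generating twice from the same seed yields identical sequences, which is
# why the noise files do not have to be shipped with the test setup.
noise_a = generate_noise(seed, 48000)
noise_b = generate_noise(seed, 48000)
print(np.array_equal(noise_a, noise_b))
```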

Audio signal design

Test signal

  • Bandlimited (lowpass) version of a pseudo random noise sequence.
  • Replicated into the left and right channels, with other channels filled with digital silence.
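
A rough sketch of how such a test signal could be assembled is shown below. The filter type, cutoff frequency, sample rate, channel count, and channel order are not specified above, so everything here is an assumption for illustration: a brick-wall FFT lowpass stands in for whatever filter the real script uses, 7 kHz and 48 kHz are placeholder values, and left and right are assumed to be channels 0 and 1.

```python
import numpy as np


def lowpass_fft(signal: np.ndarray, sample_rate: float, cutoff_hz: float) -> np.ndarray:
    """Brick-wall lowpass: zero every frequency bin above cutoff_hz (illustrative only)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


def to_multichannel(mono: np.ndarray, num_channels: int) -> np.ndarray:
    """Copy mono into channels 0 and 1 (assumed left/right); leave the rest silent."""
    out = np.zeros((len(mono), num_channels), dtype=mono.dtype)
    out[:, 0] = mono
    out[:, 1] = mono
    return out


SAMPLE_RATE = 48000   # hypothetical sample rate
CUTOFF_HZ = 7000.0    # hypothetical cutoff frequency

rng = np.random.default_rng(0)
noise = rng.uniform(-1.0, 1.0, SAMPLE_RATE)       # one second of noise
bandlimited = lowpass_fft(noise, SAMPLE_RATE, CUTOFF_HZ)
signal_6ch = to_multichannel(bandlimited, 6)      # e.g. a 6-channel layout
print(signal_6ch.shape)
```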

Why pseudo random noise?

  • Disregarding the band limitation, each sample (a short run of contiguous PCM values) is a unique sequence of random values that, with high probability, does not appear anywhere else in the test vector.
  • Given a sample from an unknown test signal at an unknown offset, it is possible to identify exactly where that sample was located in the signal.

Why the band limitation?

  • Modern codecs employ techniques that, for higher frequencies, do not attempt to replicate the waveform but rather just the energies.
  • The identification technique depends on the waveform, so band limiting the signal maximizes the portion of the signal that the codec reproduces faithfully.

Why only left and right channels?

  • Maximizes the probability that the waveform is reproduced faithfully, circumventing signal processing that virtualizes surround or height channels.

Identification of samples taken

Here, a sample is defined as 1024 contiguous PCM audio samples from the signal.

  • A sample is compared against every subsequence of 1024 PCM samples in the reference, and a measure of similarity is computed for each.
  • Similarity is at its maximum when the subsequence in the reference corresponds to the sample.
  • Cross correlation algorithms (typically FFT based) provide a very efficient implementation of this search.
    • The open source Python package SciPy provides implementations of cross correlation.
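
The search described above can be sketched with an FFT based cross correlation. To keep the example dependent on NumPy only, the correlation is written out with numpy.fft; SciPy's scipy.signal.correlate with method="fft" could be used instead. The function name locate_segment, the reference length, and the use of unfiltered noise are assumptions for illustration.

```python
import numpy as np


def locate_segment(reference: np.ndarray, segment: np.ndarray) -> int:
    """Return the offset in `reference` at which `segment` correlates best."""
    # Zero-pad to a power of two that is long enough to avoid circular wrap-around.
    n = len(reference) + len(segment) - 1
    nfft = 1 << (n - 1).bit_length()
    # Cross correlation in the frequency domain: R(f) * conj(S(f)).
    spectrum = np.fft.rfft(reference, nfft) * np.conj(np.fft.rfft(segment, nfft))
    corr = np.fft.irfft(spectrum, nfft)
    # Keep only lags where the segment lies fully inside the reference.
    valid = corr[: len(reference) - len(segment) + 1]
    return int(np.argmax(valid))


rng = np.random.default_rng(0)
reference = rng.uniform(-1.0, 1.0, 48000)   # stand-in for the reference signal
true_offset = 12345
segment = reference[true_offset : true_offset + 1024]

found = locate_segment(reference, segment)
print(found == true_offset)
```

Each lag of the correlation is the dot product of the segment with one 1024-sample subsequence of the reference, so the index of the peak is the offset at which the segment was taken.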

Test using microphone recording

  • Directly recording the audio output of the device under test (e.g. headphones, line-out, S/PDIF) is not always possible.
  • When the signal is played back on the device under test and the speaker output is recorded with a microphone, the following must be assumed:
    • noise added to the recorded observation,
    • phase shift with regard to the reference.
  • Where direct signal comparison would be challenging with the added noise and phase shift, identification of samples within the signal via cross correlation is robust against noise in the measurement.
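
The robustness against measurement noise can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of a real microphone path: the recording is simulated by adding Gaussian noise with the same power as the signal (about 0 dB SNR), phase shift is not modelled, and the function name locate_noisy_segment is hypothetical.

```python
import numpy as np


def locate_noisy_segment(reference: np.ndarray, observed: np.ndarray) -> int:
    """Return the lag with maximum cross correlation between reference and observation."""
    # mode="valid" evaluates only lags where `observed` fits fully inside `reference`.
    corr = np.correlate(reference, observed, mode="valid")
    return int(np.argmax(corr))


rng = np.random.default_rng(1)
reference = rng.uniform(-1.0, 1.0, 48000)   # stand-in for the reference signal
true_offset = 20000
clean = reference[true_offset : true_offset + 1024]

# Simulated recording: additive Gaussian noise as strong as the signal itself.
observed = clean + rng.normal(0.0, clean.std(), clean.shape)

# A direct sample-by-sample comparison fails, since the observation differs
# substantially from the clean segment...
print(np.allclose(observed, clean, atol=0.1))
# ...but the correlation peak still identifies where the segment came from.
print(locate_noisy_segment(reference, observed) == true_offset)
```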