Here are some notes, links, and resources to accompany my video, "Sine-Wave Speech Will Blow Your Mind Plus Teach You Something About Pattern Recognition and AI".
Sine-wave speech is synthesized by using a few (typically three or four) time-varying sinusoids that mimic the frequency and amplitude patterns of the resonance peaks of natural speech.
A formant is a concentration of acoustic energy around a particular frequency in the speech wave.
Formants are critical for the perception of vowel sounds. Different vowel sounds are distinguished largely by their formant frequencies.
Sine-wave speech is generated by running a formant tracker over an utterance to estimate its formant frequencies over time, and then synthesizing one sine wave per formant that follows that formant's center frequency and amplitude.
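The synthesis half of that process can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used in the video: it assumes a formant tracker has already produced one frequency and one amplitude value per frame for each formant, and the function name `synthesize_sine_wave_speech`, its parameters, and the example track values are all made up for illustration.

```python
import math

def synthesize_sine_wave_speech(formant_tracks, amplitude_tracks,
                                frame_rate=100, sample_rate=16000):
    """Sum one time-varying sinusoid per formant track.

    formant_tracks:   list of tracks, each a list of frequencies in Hz (one per frame)
    amplitude_tracks: same shape, linear amplitudes (one per frame)
    Returns a list of float samples.
    """
    n_frames = len(formant_tracks[0])
    samples_per_frame = sample_rate // frame_rate
    n_samples = n_frames * samples_per_frame
    out = [0.0] * n_samples

    for freqs, amps in zip(formant_tracks, amplitude_tracks):
        phase = 0.0  # running phase keeps the sinusoid continuous as frequency changes
        for n in range(n_samples):
            # Linearly interpolate frequency and amplitude between frames.
            pos = n / samples_per_frame
            i = min(int(pos), n_frames - 1)
            j = min(i + 1, n_frames - 1)
            frac = pos - i
            f = freqs[i] + (freqs[j] - freqs[i]) * frac
            a = amps[i] + (amps[j] - amps[i]) * frac
            out[n] += a * math.sin(phase)
            # Advance the phase by the instantaneous frequency.
            phase += 2.0 * math.pi * f / sample_rate
    return out

# Hypothetical three-frame tracks for three formants (illustrative values only,
# not measurements from any real utterance).
example_freqs = [
    [300.0, 500.0, 700.0],     # first formant glides upward
    [2200.0, 1800.0, 1200.0],  # second formant glides downward
    [3000.0, 2800.0, 2500.0],  # third formant
]
example_amps = [
    [0.5, 0.5, 0.5],
    [0.3, 0.3, 0.3],
    [0.2, 0.2, 0.2],
]
example_samples = synthesize_sine_wave_speech(example_freqs, example_amps)
```

Accumulating the phase sample by sample, rather than computing sin(2πft) directly, avoids clicks when the frequency changes between frames. To listen to the result, the samples could be scaled to 16-bit integers and written out with Python's standard `wave` module.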