DSP to parse audio signal into MIDI sequence #3
Comments
Here is a short-term FFT analyzer. If I understand correctly, you want blind separation of many sources mixed together. All I know is that for monophonic signals, time-domain methods are faster, more accurate, and lower-latency than FFT-based ones, while for polyphonic signals it all breaks down and you have to work in the frequency domain, which brings quite a lot of latency. Do you really need low latency? You might preprocess the songs.
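To put a rough number on the latency cost of going to the frequency domain, here is a small sketch. It assumes a crude rule of thumb that is not from this thread: the FFT bin spacing (`sample_rate / N`) should be no wider than the gap between two adjacent semitones at the lowest note of interest. The function names are made up for illustration, and real analyzers can do better than raw bin spacing with interpolation.

```python
def semitone_spacing_hz(freq_hz):
    """Distance in Hz from freq_hz to the next semitone above it (equal temperament)."""
    return freq_hz * (2.0 ** (1.0 / 12.0) - 1.0)


def fft_size_for_resolution(resolution_hz, sample_rate):
    """Smallest power-of-two FFT size whose bin spacing is <= resolution_hz."""
    n = 1
    while sample_rate / n > resolution_hz:
        n *= 2
    return n


def window_duration_ms(fft_size, sample_rate):
    """Length of the analysis window in milliseconds (a lower bound on latency)."""
    return 1000.0 * fft_size / sample_rate


if __name__ == "__main__":
    sample_rate = 44100
    # 82.41 Hz is the low E string of a guitar; 880 Hz is A5.
    for note_hz in (82.41, 880.0):
        size = fft_size_for_resolution(semitone_spacing_hz(note_hz), sample_rate)
        print(note_hz, size, window_duration_ms(size, sample_rate))
```

Under that assumption, low notes need windows of several hundred milliseconds, which is why the question of whether low latency is really needed matters so much here.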
That helps! :) I suspect lots of filtering/smoothing of the output will be required, and that it will be fairly tricky to get accurate readings at very low latency.
OK (stop me if I'm wrong) the inputs are:
Desired output:
For (a), Autotune claims to use autocorrelation methods (very basically, an FFT of an FFT, then peak detection) to detect pitch. There are rumors that it's actually time-domain, and in my experience you can get something like 10 ms latency for typical material. Note that onset/offset detection is not that easy either, since thresholds will inevitably be volume-dependent.
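For reference, a minimal sketch of what time-domain autocorrelation pitch detection can look like on a single monophonic frame. This is a plain textbook version, not Autotune's actual algorithm, and the function name, the `fmin`/`fmax` search range, and the frame size used in the example are all assumptions for illustration.

```python
import math


def detect_pitch_autocorr(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental of one frame in Hz, or return None if no pitch is found.

    The lag with the largest autocorrelation inside [1/fmax, 1/fmin] is taken
    as the period. No normalization or peak interpolation is done here.
    """
    n = len(frame)
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)
    energy = sum(x * x for x in frame)
    if energy == 0.0 or min_lag < 1 or max_lag <= min_lag:
        return None

    best_lag = None
    best_value = 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        acc = 0.0
        for i in range(n - lag):
            acc += frame[i] * frame[i + lag]
        if acc > best_value:
            best_lag, best_value = lag, acc

    if best_lag is None:
        return None
    return sample_rate / best_lag


if __name__ == "__main__":
    sample_rate = 8000
    # 512 samples at 8 kHz is a 64 ms frame of a 200 Hz sine.
    frame = [math.sin(2.0 * math.pi * 200.0 * i / sample_rate) for i in range(512)]
    print(detect_pitch_autocorr(frame, sample_rate))
```

The estimate is quantized to whole-sample lags, so it gets coarser as pitch rises. A silent frame returns `None`, but low-level noise would still yield some lag, which is one face of the volume-dependent threshold problem mentioned above.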
Sounds more or less right to me. 10 ms is probably okay. Frames are 16 ms, and the UI layer draws later in the frame, so it can be afforded the better part of the frame (most time is spent rendering the background scene). It's a pretty involved piece of work. Hopefully someone more qualified than me steps forward to have a go at it! :)
I will probably add a pitch detector to [...]. Unfortunately, the latency of the audio API (and buffer size) has a far higher impact than mere detection.
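A quick back-of-the-envelope for the buffer-size part, assuming the capture API hands over audio in fixed-size buffers and that each queued buffer adds one buffer's worth of delay. The function name and the `num_buffers` parameter are hypothetical, and actual driver and API overhead is not modeled.

```python
def buffer_latency_ms(buffer_frames, sample_rate, num_buffers=1):
    """Delay in milliseconds contributed by buffering alone."""
    return 1000.0 * buffer_frames * num_buffers / sample_rate


if __name__ == "__main__":
    # Compare a few common buffer sizes at 44.1 kHz against the ~10 ms
    # detection latency and the 16 ms frame time discussed above.
    for frames in (128, 256, 512, 1024):
        print(frames, round(buffer_latency_ms(frames, 44100), 2))
```

A single 512-frame buffer at 44.1 kHz already costs about 11.6 ms, so buffering alone can exceed the detection latency.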
Yeah, I suspect some headaches with the capture APIs. We'll see how it goes when we get there.
https://github.com/p0nce/dplug/blob/master/dsp/dplug/dsp/goldrabiner.d I've made a test program which outputs a WAV with pitch, voiced/unvoiced, and a crude resynthesized output with volume = 1. The thing to understand is that when there is no pitch (voicedness towards 0), the pitch output is wrong and shouldn't be used. It can be used for monophonic voice and probably other instruments.
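As a sketch of how per-frame pitch and voicedness could be turned into a MIDI-style note sequence, the snippet below ignores pitch whenever voicedness is low and uses two thresholds (hysteresis) so notes don't flicker on and off. It does not use the dplug API: the `(pitch_hz, voicedness)` frame format, the assumption that voicedness lies in [0, 1], and the threshold values are all made up for illustration. Only the frequency-to-note formula (A4 = 440 Hz = note 69) is standard.

```python
import math


def hz_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency, with A4 = 440 Hz = note 69."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))


def frames_to_note_events(frames, voiced_on=0.6, voiced_off=0.4):
    """Convert a list of (pitch_hz, voicedness) frames into note events.

    Returns a list of (frame_index, "on" | "off", midi_note) tuples.
    Pitch is only trusted when voicedness >= voiced_on (assumed threshold).
    """
    events = []
    current = None  # MIDI note currently sounding, if any
    for index, (pitch_hz, voicedness) in enumerate(frames):
        if current is None:
            if voicedness >= voiced_on and pitch_hz > 0:
                current = hz_to_midi(pitch_hz)
                events.append((index, "on", current))
        elif voicedness <= voiced_off:
            # Unvoiced: the pitch value is unreliable, so just end the note.
            events.append((index, "off", current))
            current = None
        elif voicedness >= voiced_on and pitch_hz > 0:
            note = hz_to_midi(pitch_hz)
            if note != current:
                events.append((index, "off", current))
                events.append((index, "on", note))
                current = note
    if current is not None:
        events.append((len(frames), "off", current))
    return events


if __name__ == "__main__":
    frames = [(123.0, 0.0), (440.0, 0.9), (441.0, 0.8),
              (440.0, 0.5), (493.88, 0.9), (50.0, 0.1)]
    for event in frames_to_note_events(frames):
        print(event)
```

The smoothing mentioned earlier in the thread would sit in front of this step, for example by requiring a new note to persist for a few frames before retriggering.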
Necessary to support vocals and 'pro' guitar.