Various configuration questions #15

skeggse · 2018-03-19T02:58:35Z

I realize you're working on a tutorial for configuring odas, but I'm attempting to understand it in the meantime. I'm wondering what the separated and postfiltered streams represent. From listening to them, it sounds like the separated streams should include a single channel for each tracked source, and the postfiltered stream might contain the noise corresponding to each source - is that an apt description?

I'm noticing that there's often significant overlap between two of the channels - that they're tracking what seems to be the same source. Would that be resolved by tuning the settings via something like ODAS Studio? I built a configuration for the PS3 Eye microphone array (like @Efreeto was looking to do in introlab/manyears#2; happy to open a pull-request to add the configuration 😄), so it's likely not well-tuned.

What's the relationship between hopSize and frameSize? The example general configuration on the wiki sets hopSize to 128, and frameSize to 256; I'd assumed hopSize was the chunk size for audio processing, and frameSize was the number of samples in a frame/chunk, but that'd be backwards for 16-bit audio.

FrancoisGrondin · 2018-03-19T19:45:23Z

Thank you for your questions. There is in fact a tutorial being written, but I'll try to answer your specific questions the best I can.

The separated streams correspond to the separation obtained by linear demixing from multiple channels to a single channel, using methods such as delay-and-sum beamforming or geometric sound source separation. Each of these streams contains the target speech with little distortion, plus some interference and noise in the background. The postfiltering step aims to reduce the gain in frequency bands where noise or interfering sources are dominant. This improves the SNR, but also introduces some distortion.
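To make the two stages concrete, here is a minimal Python sketch. It is not ODAS code: `delay_and_sum` assumes integer sample delays that are already known, and `wiener_style_gains` is a generic Wiener-style gain rule standing in for the post-filter, whose actual gain computation in ODAS may differ. All names and sample values are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer delay (in samples) and average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   delays[m] is how many samples later the target reaches mic m.
    """
    num_mics = len(channels)
    length = len(channels[0])
    output = []
    for n in range(length):
        total = 0.0
        for mic, delay in zip(channels, delays):
            idx = n + delay  # read ahead to undo the propagation delay
            if 0 <= idx < length:
                total += mic[idx]
        output.append(total / num_mics)
    return output


def wiener_style_gains(target_power, noise_power):
    """Per-band gain in [0, 1]: near 1 where the target dominates, near 0 where noise does."""
    return [t / (t + n) if (t + n) > 0 else 0.0
            for t, n in zip(target_power, noise_power)]


# Hypothetical two-microphone example: the same pulse reaches mic 1 two samples late.
mic0 = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
mic1 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
print(delay_and_sum([mic0, mic1], [0, 2]))  # pulse adds coherently at index 1
print(delay_and_sum([mic0, mic1], [0, 0]))  # wrong delays: pulse is smeared and halved

# Band 0 is dominated by the target, band 1 by noise.
print(wiener_style_gains([9.0, 1.0], [1.0, 9.0]))
```

The beamformer is linear, so it leaves the target largely undistorted but cannot remove everything else. The gain stage attenuates noisy bands more aggressively, which is where the extra distortion comes from.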

From my understanding, you are using the linear array on a PS3 Eye. Is this correct? If so, tracking should be performed on a 2D arc, and not on a 3D sphere. I will have to code this, and we could release code to support linear arrays with ODAS. Would this be convenient for you?
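The reason an arc is enough can be checked numerically. The sketch below is an illustration under stated assumptions, not ODAS code: it uses a far-field model, a hypothetical 4-microphone geometry along the x axis (not the measured PS3 Eye layout), and an approximate speed of sound. Two directions that make the same angle with the array axis produce identical time differences of arrival, so a linear array cannot tell them apart.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def unit_direction(azimuth_deg, elevation_deg):
    """Unit vector pointing from the array toward the source."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def far_field_tdoa(mic_a, mic_b, direction):
    """Arrival time at mic_a minus arrival time at mic_b, in seconds (far-field model)."""
    baseline = [b - a for a, b in zip(mic_a, mic_b)]
    return sum(d * u for d, u in zip(baseline, direction)) / SPEED_OF_SOUND


# Hypothetical linear array along the x axis, positions in metres.
mics = [(-0.03, 0.0, 0.0), (-0.01, 0.0, 0.0), (0.01, 0.0, 0.0), (0.03, 0.0, 0.0)]


def tdoas(direction):
    """TDOAs of every microphone relative to the first one."""
    return [far_field_tdoa(mics[0], m, direction) for m in mics[1:]]


in_plane = tdoas(unit_direction(60.0, 0.0))  # 60 degrees to the side, level with the array
elevated = tdoas(unit_direction(0.0, 60.0))  # straight along the axis, raised by 60 degrees
print(in_plane)
print(elevated)  # same values: both directions are 60 degrees from the array axis
```

Since only the angle to the array axis is observable, searching a half-circle of directions covers every distinguishable case.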

The parameters hopSize and frameSize stand for the distance in samples between successive frames, and the frame size in samples. For instance, if you set hopSize = 100 and frameSize = 256, frame 0 contains samples [0,255], frame 1 contains samples [100,355], frame 2 contains samples [200, 455], and so on.
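The same framing can be written as a short Python sketch. The helper names are hypothetical and this is not how ODAS implements it; it only reproduces the index arithmetic described above.

```python
def frame_ranges(hop_size, frame_size, num_frames):
    """Return inclusive (first, last) sample indices for each frame."""
    return [(i * hop_size, i * hop_size + frame_size - 1) for i in range(num_frames)]


def overlap_samples(hop_size, frame_size):
    """Samples shared by two consecutive frames (0 if the hop is at least one frame long)."""
    return max(frame_size - hop_size, 0)


print(frame_ranges(100, 256, 3))  # [(0, 255), (100, 355), (200, 455)]
print(overlap_samples(128, 256))  # 128
```

With the wiki values hopSize = 128 and frameSize = 256, consecutive frames share 128 samples, which is a 50% overlap.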

Does this answer your questions?

Thank you,

skeggse · 2018-03-20T00:39:02Z

Your description of the separated streams and the postfiltering helps a lot, but doesn't entirely explain the behavior I'm seeing. The postfiltered streams appear to contain substantially more noise and distortion than the separated streams. I suppose this could be because it's a linear array?

> I will have to code this, and we could release code to support linear array with ODAS. Would this be convenient for you?

That would be remarkably convenient - let me know if I can help in any way. I'll open an issue for this.

> The parameters hopSize and frameSize stand for the distance in samples between successive frames, and the frame size in samples. For instance, if you set hopSize = 100 and frameSize = 256, frame 0 contains samples [0,255], frame 1 contains samples [100,355], frame 2 contains samples [200, 455], and so on.

Ah, so neither of these refers to a size in bytes; both are counts of samples, and each frame (normally) overlaps with one or more previous frames.

Super helpful, thanks!

FrancoisGrondin · 2018-03-20T01:18:36Z

> Your description of the separated streams and the postfiltering helps a lot, but doesn't entirely explain the behavior I'm seeing. The postfiltered streams appear to contain substantially more noise and distortion than the separated streams. I suppose this could be because it's a linear array?

Yes, probably, but it is hard to tell without the data. Can you provide me with the recordings in raw format and your cfg file?

skeggse · 2018-03-20T19:23:46Z

Audio samples, config file

If there are specific source configurations or types of noise I could add that would be more helpful, let me know.

FrancoisGrondin · 2018-03-22T18:26:15Z

Ok, I'll try to have a look asap, and get back to you with answers.

FrancoisGrondin · 2018-03-24T23:54:00Z

Please have a look at issue #18 and see whether the new code for handling linear arrays also solves this issue.

skeggse mentioned this issue Mar 20, 2018

Support linear microphone arrays #18

Closed

FrancoisGrondin added the question label Mar 20, 2018

FrancoisGrondin self-assigned this Mar 22, 2018

FrancoisGrondin closed this as completed Mar 29, 2018

This issue was closed.
