AnalyzerNode API #253
Comments
The reason is that I think the compiler will not allow you to ship the reference to another thread. All in all I'm not sure if the current system is the best way to do it. |
Yup I see, I just had a look at the Chromium source code and all analysis seems to be performed in the control thread (which really makes sense), see the Maybe we could kind of reverse the logic by doing something like this?
This way I guess we could avoid both the I'll try to have a shot at this and see where it goes :) edit: ... even if I'm realizing maybe this is not doable without memory allocation or the unsafe trick you used |
Your plan sounds alright, let's try to work it out later |
I managed to make a small prototype of a kind of lock-free ring buffer that I think is thread safe. This is quite low-level and ultimately relies on
use std::ptr;
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};

const RING_BUFFER_SIZE: usize = 65536; // MAX_FFT_SIZE * 2
const QUANTUM_SIZE: usize = 128;

struct Analyser {
    // keeping this around seems to prevent the memory from being released while we use the ptr
    buffer: Arc<[f32; RING_BUFFER_SIZE]>,
    buffer_ptr: *mut f32,
    index: AtomicUsize,
}

// @todo (?) - impl drop to drop the pointer manually
impl Analyser {
    pub fn new() -> Self {
        let mut buffer = [0.; RING_BUFFER_SIZE];
        let buffer_ptr = buffer.as_mut_ptr();
        Self {
            buffer: Arc::new(buffer),
            buffer_ptr: buffer_ptr,
            index: AtomicUsize::new(0),
        }
    }

    // this runs in the audio thread
    pub fn add_input(&self, src: &[f32]) {
        let mut index = self.index.load(Ordering::SeqCst);
        let len = src.len();
        // push src data into the ring buffer
        if index + len > RING_BUFFER_SIZE {
            // in our test conditions we can't be there yet
        } else {
            // we have enough room to copy src in one shot
            unsafe {
                let src_ptr = src.as_ptr();
                let dst_ptr = self.buffer_ptr.add(index);
                ptr::copy_nonoverlapping(src_ptr, dst_ptr, len);
            }
        }
        index += len;
        if index >= RING_BUFFER_SIZE {
            index -= RING_BUFFER_SIZE;
        }
        self.index.store(index, Ordering::SeqCst);
    }
    // if we read only below index in control thread we are sure the memory is clean
}

fn main() {
    let analyser = Analyser::new();
    for _ in 0..2 {
        for i in 0..(RING_BUFFER_SIZE / QUANTUM_SIZE) {
            let data = [i as f32; QUANTUM_SIZE];
            analyser.add_input(&data);
            println!("{:?}", analyser.index.load(Ordering::SeqCst));
        }
    }
}
What do you think, should we try to continue this way, or do you see something wrong that I didn't catch? |
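For reference, the wrap-around branch left empty in the prototype could look roughly like the sketch below. It works on a plain `&mut [f32]` in a single thread, so it only illustrates the index arithmetic, not the cross-thread sharing discussed in this issue. The function name `write_wrapping` is hypothetical.

```rust
/// Copy `src` into `ring` starting at `index`, wrapping around at the end.
/// Returns the next write index. Assumes `src.len() <= ring.len()`.
/// (Hypothetical helper, for illustration only.)
fn write_wrapping(ring: &mut [f32], index: usize, src: &[f32]) -> usize {
    let cap = ring.len();
    assert!(index < cap && src.len() <= cap);
    // number of samples that fit before the end of the ring
    let first = src.len().min(cap - index);
    ring[index..index + first].copy_from_slice(&src[..first]);
    // the remaining samples (possibly none) restart at the beginning
    let rest = src.len() - first;
    ring[..rest].copy_from_slice(&src[first..]);
    (index + src.len()) % cap
}

fn main() {
    let mut ring = [0.0_f32; 8];
    // 2 samples fit at the end, the other 2 wrap to the start
    let next = write_wrapping(&mut ring, 6, &[1.0, 2.0, 3.0, 4.0]);
    println!("{:?} next = {}", ring, next);
}
```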
Hum, actually it doesn't seem to work well: the copied values are garbage. That's weird, it was working well when doing exactly the same thing without the |
I think I may have found the problem (inspired by https://github.com/utaal/spsc-bip-buffer/blob/master/src/lib.rs#L89), using a
pub fn new() -> Self {
    // inspired by https://github.com/utaal/spsc-bip-buffer/blob/master/src/lib.rs#L89
    // allocated on the heap (via Box), but done in the control thread
    let mut buffer = Box::new([0.; RING_BUFFER_SIZE]);
    let buffer_ptr = buffer.as_mut_ptr();
    Self {
        buffer,
        buffer_ptr,
        index: AtomicUsize::new(0),
    }
}
Quite a funny thing |
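A likely explanation for the difference between the two versions (an assumption, not something confirmed in this thread): in the first `new()`, the pointer is taken from a stack array, and `Arc::new(buffer)` then copies that array into a fresh heap allocation, so the pointer no longer refers to the data the struct owns. With `Box`, the pointer refers to the heap allocation itself, which stays put when the `Box` handle is moved. A small sketch, where `Holder` and the two helper functions are hypothetical names:

```rust
use std::sync::Arc;

// Hypothetical miniature of the Box-based struct above.
struct Holder {
    buffer: Box<[f32; 4]>,
    buffer_ptr: *const f32,
}

fn make_boxed() -> Holder {
    let buffer = Box::new([0.0_f32; 4]);
    // pointer into the heap allocation owned by the Box
    let buffer_ptr = buffer.as_ptr();
    Holder { buffer, buffer_ptr }
}

/// Does a pointer taken from a stack array still point at the data once
/// the array has been handed to `Arc::new`?
fn stack_ptr_survives_arc() -> bool {
    let stack = [0.0_f32; 4];
    let stack_ptr = stack.as_ptr();
    let shared = Arc::new(stack); // copies the array to the heap
    shared.as_ptr() == stack_ptr
}

/// Does a pointer taken from a Box still point at the data after the
/// owning struct has been returned and moved?
fn boxed_ptr_survives_move() -> bool {
    let holder = make_boxed();
    let moved = holder; // moves the Box handle, not the heap allocation
    moved.buffer.as_ptr() == moved.buffer_ptr
}

fn main() {
    println!("stack pointer still valid: {}", stack_ptr_survives_arc());
    println!("boxed pointer still valid: {}", boxed_ptr_survives_move());
}
```

This only explains why the pointer stays valid. It says nothing about whether reading and writing through it from two threads is sound.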
Ok, ended up with that: https://gist.github.com/b-ma/a0909191089037b9cbebc2f7bd1c8117, which I think should work quite well in our case. From the unit tests I have made, I really don't see what could go wrong, as the logic behind it is ultimately quite simple, but maybe I'm missing something. I did it in a dummy lib project to really focus on the problem, but from this point I really think adapting the Analyser should be quite straightforward |
Hey @b-ma I am very sorry to ruin your party. But reading and writing to the same memory location concurrently is undefined behaviour.. :( This is what
I share your gut feeling that writing to a static location of But even then, I am not entirely convinced your example will work. There is no reliable way to read the full buffer without risking an intermediate write, and this will result in a garbage FFT. You could maybe check if there are any crates that attempt to solve this |
Please note however, I think you are diving in a very cool domain here. We should explore further. A safe implementation I can imagine is this:
|
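As a rough illustration of what a data-race-free variant could look like (a sketch under assumptions, not necessarily the design described in this thread or the one that ended up in the crate): each sample is stored as the bit pattern of an `f32` inside an `AtomicU32`, so concurrent reads and writes are well-defined. A reader may still see a mix of older and newer samples if a write happens in the middle of a read, but every value read is a valid sample and there is no undefined behaviour. `AtomicRing`, `RING_SIZE` and `copy_latest` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

const RING_SIZE: usize = 1024; // hypothetical size, for illustration

struct AtomicRing {
    samples: Vec<AtomicU32>, // f32 bit patterns
    write_index: AtomicUsize,
}

impl AtomicRing {
    fn new() -> Self {
        Self {
            // 0 is the bit pattern of 0.0_f32
            samples: (0..RING_SIZE).map(|_| AtomicU32::new(0)).collect(),
            write_index: AtomicUsize::new(0),
        }
    }

    // would run in the audio thread: no locks, no allocation
    fn add_input(&self, src: &[f32]) {
        let mut index = self.write_index.load(Ordering::Relaxed);
        for &sample in src {
            self.samples[index].store(sample.to_bits(), Ordering::Relaxed);
            index = (index + 1) % RING_SIZE;
        }
        // publish the new position after the samples are stored
        self.write_index.store(index, Ordering::Release);
    }

    // would run in the control thread: copy the latest dst.len() samples
    fn copy_latest(&self, dst: &mut [f32]) {
        assert!(dst.len() <= RING_SIZE);
        let end = self.write_index.load(Ordering::Acquire);
        let start = (end + RING_SIZE - dst.len()) % RING_SIZE;
        for (i, out) in dst.iter_mut().enumerate() {
            let bits = self.samples[(start + i) % RING_SIZE].load(Ordering::Relaxed);
            *out = f32::from_bits(bits);
        }
    }
}

fn main() {
    let ring = Arc::new(AtomicRing::new());
    let writer = Arc::clone(&ring);
    let handle = thread::spawn(move || {
        for i in 0..8 {
            writer.add_input(&[i as f32; 128]);
        }
    });
    handle.join().unwrap();
    let mut out = [0.0_f32; 4];
    ring.copy_latest(&mut out);
    println!("{:?}", out);
}
```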
Huhu, no problem, I can understand your concerns. But (yes, there is a but) I'm still convinced it works (or at least it can / should, whatever idiot compilers say :) ) because there is no possible way you are reading the memory location that you are currently writing, even if both processes do it concurrently:
So, from a strictly logical point of view, I really don't see where there could be any problem with corrupted data (except if For information, I took inspiration from these posts and code:
(what is Just seen your new post, so I continue: I understand and agree with your concerns that I'm playing with weird stuff that I don't fully understand here, and that more hardcore low-level expertise would be welcome :) On the algorithm you describe, almost everything is there (except obviously the first point). The only other small differences I can see are:
In any case, I'm perfectly ok to continue on the "safe" (I will later ask colleagues with more low-level (C, C++) experience what they think about the |
Hey, I asked a colleague (who has already done this kind of stuff in C/C++) about the strategy I proposed for the lock-free ring buffer, and there are a few things to add to the discussion:
Maybe we could ask P. Adenot for his insight too
Thanks for sharing your new insights. Interesting stuff. Let me be very straight: the 'safe' version with What the safe version does guard against is undefined behaviour, which is a thing we should avoid at all costs. Also, you cannot statistically make undefined behaviour go away. Unlikely undef is still undef. Also, Rust targets 16-bit architectures, so we should probably not make any assumptions about 'probably safe'. Let's measure performance with the |
ok, you get the point: 1. benches are important to know what we are talking about and 2. reading this is important to know what we are talking about :) |
The safe version is merged now. Also I added a small render thread CI benchmark for the AnalyserNode which will aid further measurements. Some more reading for our interest:
Which is what I used for my standpoint "don't go into the UB territory" :) |
Nice! |
Hey,
Trying to wrap the Analyzer, I just ran into the fact that get_float_frequency_data and get_float_time_domain_data both expect a Vec, while it seems to me that everywhere else the spec declares a Float32Array we used a &mut [f32], cf. AudioBuffer::copy_from_channel for example.
Is there any reason for that? I don't really see why we couldn't use the same API here
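To make the API question concrete, here is a minimal sketch of the slice-based shape. `FakeAnalyser` is a hypothetical stand-in, not the crate's actual type, and the method body is only a placeholder copy. The point is that a `&mut [f32]` parameter lets the caller reuse one buffer across calls, and a `Vec<f32>` can still be passed thanks to deref coercion.

```rust
// Hypothetical stand-in for the analyser; only the API shape matters here.
struct FakeAnalyser {
    time_domain: Vec<f32>,
}

impl FakeAnalyser {
    /// Slice-based variant: fills the caller's buffer, no allocation.
    /// Copies as many samples as fit in `dst`.
    fn get_float_time_domain_data(&self, dst: &mut [f32]) {
        let n = dst.len().min(self.time_domain.len());
        dst[..n].copy_from_slice(&self.time_domain[..n]);
    }
}

fn main() {
    let analyser = FakeAnalyser {
        time_domain: vec![0.1, 0.2, 0.3],
    };

    // a fixed-size array works
    let mut array_buf = [0.0_f32; 2];
    analyser.get_float_time_domain_data(&mut array_buf);

    // and so does a Vec, via deref coercion to &mut [f32]
    let mut vec_buf = vec![0.0_f32; 4];
    analyser.get_float_time_domain_data(&mut vec_buf);

    println!("{:?} {:?}", array_buf, vec_buf);
}
```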