Immutable Event implementation #3198

lifers · 2024-08-12T12:48:38Z

Implement the Event struct with no need of mutating methods using RwLock. I'd think it's safe from deadlock because:

RwLock can block when reader and writer threads are contesting access
When swap lock is acquired, we either have:
- an exclusive write access from one of add, remove, or clear
- a shared read access from call and one of add, remove, or clear
Otherwise,
- when a change lock is acquired, we have a single read access from one of add, remove, or clear
- when no lock is acquired, there is no access
Therefore delegates shouldn't deadlock

Please let me know if there is other considerations for this feature. Thank you!

No deadlock reasoning: - RwLock can block when reader and writer threads are contesting access - When swap lock is acquired, we either have: - an exclusive write access from one of add, remove, or clear - a shared read access from call and one of add, remove, or clear - Otherwise, - when a change lock is acquired, we have a single read access from one of add, remove, or clear - when no lock is acquired, there is no access - Therefore delegates shouldn't deadlock

lifers · 2024-08-12T12:51:51Z

@microsoft-github-policy-service agree

kennykerr · 2024-08-12T13:51:51Z

Additional synchronization should not be required above the swap and change locks. For reference, I wrote the Rust implementation based on the implementation I wrote for C++/WinRT here:

https://github.com/microsoft/cppwinrt/blob/master/strings/base_events.h

That in turn is loosely based on the WRL implementation.

kennykerr · 2024-08-12T14:31:28Z

Basically all we need to do here is make add, remove, and clear take a &self. The only tricky bit is convincing Rust that Array::swap is thread safe.

Assuming we have an exclusive access to the array

Assuming we have exclusive access to the array

lifers · 2024-08-13T01:23:50Z

Since we have exclusive access to the delegate array when we hold the swap lock, would you rather have the swap done like this? (e.g. lines 104-105, 160-167)

kennykerr · 2024-08-13T12:48:09Z

Looks good - is it necessary to use a pointer rather than a mutable reference?

use core::cell::UnsafeCell.

lifers · 2024-08-13T16:55:37Z

I figured out a way to use a mutable reference using UnsafeCell.

Once the class design is complete, I guess we should impl Send and Sync right 🤔

kennykerr · 2024-08-13T17:07:11Z

Yes, it can be Send and Sync.

kennykerr · 2024-08-13T19:08:27Z

Would appreciate some more eyes on this. The change here is just to make the Event type - part of the supporting type system for implementing WinRT events - to not require mutable references to self since the implementation is otherwise entirely thread safe and is meant to be used in the context of a multi-threaded COM object that can receive concurrent requests for adding/removing/calling event handlers.

@ChrisDenton @dpaoliello @sivadeilra

dpaoliello · 2024-08-13T21:19:27Z

Seems reasonable to me. @sivadeilra thoughts?

sivadeilra · 2024-08-13T22:33:28Z

Reading.

crates/libs/core/src/event.rs

sivadeilra

I don't see how this code is safe (sound). It accesses the delegates list while acquiring different locks in different pub methods.

Can you explain why this is safe? If delegates is associated with exactly one lock, then why is it not contained within one of the Mutex instances? Since this code acquires different locks in different paths, I don't understand how it is sound.

lifers · 2024-08-13T23:03:52Z

From my understanding:

We don't want concurrent add/remove/clear, so we have the self.change Mutex to prevent it from happening.
When add/remove/clear is reading delegates and hasn't mutated anything, we don't want to block call since it only needs an immutable reference.
However, when add/remove/clear is writing to delegates, we do need an exclusive mutable reference so we have another lock self.swap for it.
The only time call can't read delegates is when delegates are being written, or in other words self.swap is being locked. So we simply make call acquire self.swap in order to read it.

sivadeilra

Ok, I re-read this with Chris Denton's comments in mind.

I hate to say it, but I think this code makes the wrong trade-off of safety vs. performance. First, this code may currently be sound, but it is dangerously close to the edge. And there are safe abstractions that already provide the semantics that are wanted.

From my reading of this code, RwLock would provide exactly the same level of synchronization, and would do so safely. The only scenario where RwLock would have different behavior would be that writers would briefly block reads (call paths), but I consider that acceptable for this design because modifying the list should be extremely infrequent, compared to reading the list.

So I think this PR needlessly uses unsafe code; there is no advantage here to using unsafe code. Even worse, this PR mixes the implementation of a R/W lock with a particular usage of that R/W lock, so it mixes two things together. That makes verification harder and makes it more likely that a future PR that looks innocent will actually break the safety invariants. And there's just no advantage over using RwLock.

I'm also troubled by the performance implication of cloning the delegate list every time call runs. That's a lot of allocator calls.

I think this would be improved by replacing all of this with RwLock<Arc<Array<T>>>. The writer paths (add, remove, etc.) would acquire it in exclusive mode and do the obvious thing. The call path would acquire the RwLock in shared mode, call Arc::clone (which is a single interlocked increment), then drop the RwLock guard. This would eliminate all of the allocator calls in call.

This implementation would be 100% safe code, and obviously correct from a borrow-checker and concurrency point of view.

kennykerr · 2024-08-14T00:47:19Z

Thanks for the feedback. As I mentioned in #3198 (comment), this implementation is copied from C++/WinRT which is based on WRL. The choice around performance tradeoffs was done before my time but it was stressed that C++/WinRT must provide the same performance tradeoffs as WRL as they were calculated at some point to be preferable for the OS. Having said that, I don't mind making a different set of performance tradeoffs for Rust if it means the code is simpler and thus easier to reason about. I will say that this implementation (at least the C++ implementations) has stood the test of time and very heavy usage.

However, there are some constraints that must be preserved. I suspect RwLock<Arc<Array<T>>> could preserve them but I thought I'd just be explicit about that here so that we're on the same page. That will also help folks understand the existing implementation. The constraints are essentially:

No locks can be held while invoking delegates (event handlers). This is to avoid COM reentrancy issues and deadlocks.
No locks can be held indefinitely that would prevent add or remove calls from acquiring and executing within a lock. This is to avoid DoS for unrelated code.

If a simpler implementation is possible that preserves these constraints, then I'm not opposed to it. 😊

sivadeilra · 2024-08-14T01:01:45Z

Both of those constraints would be met by using RwLock<Arc<Array<T>>>. I understand the need for them.

I think a simpler, provably-safe implementation is the best place to start. If there is quantitative evidence that this implementation can't meet the same needs as the equivalent C++, then I'm open to adjusting that, but not without some evidence.

If it would help, I can sketch out the implementation.

kennykerr · 2024-08-15T14:31:45Z

Hey @lifers I realize this is probably more than you bargained for. 😊 Would you like someone else to handle this?

lifers · 2024-08-15T15:11:07Z

Hi, this has gotten more interesting and I'd like to figure this out unless the maintainers need to finish this ASAP with their background knowledge 😄. It takes a while for me to figure out what else is happening inside the initial code. From what I've seen,

Do we need to keep the fallible allocation that returns E_OUTOFMEMORY instead of Rust's default abort?
Using the structs from Rust will simplify a lot of things but might increase the memory footprint (e.g. using std::sync::Arc instead of imp::RefCount for reference counting, using RwLock to guard the Array). Is this something of a concern?
Is there a limitation of where structs can come from? Can we use things freely from std::?

kennykerr · 2024-08-15T15:48:03Z

A few thoughts.

Consistent error handling via Result (and HRESULT) is important.
The downside to using RwLock<Arc<Array<T>>> is that it would result in two allocations. The existing implementations handle this by storing the array and the reference count in a single allocation. This is also how the Rust event's internal Array works.
Some of the additional reasoning about the existing implementations is persuaded by the understanding that events will almost always only ever have zero or one event handlers so if you're going to optimize that's what you should optimize for. Thus, needing two allocations for storing a single event handler seems wasteful.

I wonder if we can use RwLock<Array<T>> instead of RwLock<Arc<Array<T>>> given that Array already provides a thread-safe reference counting.

sivadeilra · 2024-08-15T16:42:50Z

So here's a sketch of what I propose:

pub struct Event<T> {
    delegates: RwLock<Arc<[T]>>,
}

impl<T> Event<T> {
    pub fn add(&self, delegate: T) {
        let mut guard = self.delegates.write().unwrap();
        let mut new_list = Vec::with_capacity(guard.len() + 1);
        for old_delegate in guard.iter() {
            new_list.push(old_delegate.clone());
        }
        new_list.push(delegate);
        let new_list_arc: Arc<[T]> = new_list.into(); // <-- converts Vec<T> into Arc<[T]>
        *guard = new_list_arc;
    }

    pub fn clear(&self) {
        let mut guard = self.delegates.write().unwrap();
        guard.clear();
    }

    pub fn remove(&self, delegate: &T) {
        let mut guard = self.delegates.write().unwrap();
        if let Some(i) = guard.iter().find(|old_delegate| old_delegate == delegate) {
            let mut new_list = Vec::with_capacity(guard.len() - 1);
            new_list.extend(guard[..i].iter().cloned());
            new_list.extend(guard[i + 1..].iter().cloned());
            let new_list_arc: Arc<[T]> = new_list.into(); // <-- converts Vec<T> into Arc<[T]>
            *guard = new_list_arc;
        } else {
            // We never found it, so don't modify the list.
        }
    }

    pub fn call(&self) {
        let delegates: Arc<[T]> = {
            let guard = self.delegates.read().unwrap();
            guard.clone()
            // <-- lock is released here
        };
        for delegate in delegates.iter() {
            // ...
        }
    }

Obviously this is just a sketch, but the idea should be clear. The paths that modify the list use RwLock::write. The path that reads the list (call) uses RwLock::read, but only for a very brief time so that they can increase the refcount on the list.

lifers · 2024-08-15T17:25:00Z

If the fallible allocation is still needed, we could use a simplified Array just to allocate like:

/// A thread-safe reference-counted array of delegates.
struct Array<T: Interface> {
    buffer: *mut Delegate<T>,
    len: usize,
}

impl<T: Interface> Default for Array<T> {
    fn default() -> Self {
        Self::with_capacity(0).unwrap()
    }
}

impl<T: Interface> Array<T> {
    /// Creates a new, empty `Array<T>` with the specified capacity.
    fn with_capacity(capacity: usize) -> Result<Self> {
        if capacity != 0 {
            let alloc_size = capacity * size_of::<Delegate<T>>();
            let header = heap_alloc(alloc_size)? as *mut Delegate<T>;

            Ok(Self {
                buffer: header,
                len: 0,
            })
        } else {
            Ok(Self {
                buffer: null_mut(),
                len: 0,
            })
        }
    }

which will be used as

pub struct Event<T: Interface> {
    delegates: RwLock<Arc<Array<T>>>,
}

impl<T: Interface> Event<T> {
    pub fn add(&self, delegate: T) -> Result<()> {
        let mut guard = self.delegates.write().unwrap();
        let mut new_list = Array::with_capacity(guard.len() + 1)?;
        for old_delegate in guard.as_slice().iter() {
            new_list.push(old_delegate.clone());
        }
        new_list.push(Delegate::new(&delegate)?);
        // let new_list_arc: Arc<[Delegate<T>]> = new_list.into(); // <-- converts Vec<T> into Arc<[T]>
        *guard = new_list.into();
        Ok(())
    }

    pub fn clear(&self) -> Result<()> {
        let mut guard = self.delegates.write().unwrap();
        *guard = Array::with_capacity(0)?.into();
        Ok(())
    }
...
    pub fn call(&self) {
          let delegates = {
              let guard = self.delegates.read().unwrap();
              guard.clone()
              // <-- lock is released here
          };
          for delegate in delegates.as_slice().iter() {
              // ...
          }
      }

kennykerr · 2024-08-15T17:47:01Z

Don't worry about allocation failure.

sivadeilra · 2024-08-15T18:01:27Z

Honestly, I think providing the "fallible" allocation pathway is worse. Just stick with the standard Rust approach. If you are implementing a memory allocation container, the appropriate response to OOM is to call std::alloc::handle_alloc_error. This initiates the panic / stack unwind.

The `Option` that wraps `Arc<[Delegate<T>]>` will let us assign `None` instead of creating `Arc::new([])` with no space penalty.

crates/libs/core/src/event.rs

sivadeilra · 2024-08-15T22:26:36Z

Looks a lot simpler -- thank you!

The Clippy suggestion about a Default impl is a good one.

crates/libs/core/src/event.rs

Now we only need to allocate once for Arc

lifers · 2024-08-16T14:31:05Z

By the way, now that Event<T> can be used concurrently, should we have tests for that?

kennykerr · 2024-08-16T14:55:56Z

Concurrency tests can be challenging - fortunately the code is now a lot simpler so perhaps just a basic smoke test confirming that it is in fact Send and Sync may be helpful e.g. can be used with thread::spawn or similar.

crates/libs/core/src/event.rs

lifers · 2024-08-16T15:07:16Z

Just an idea for a new test...

```Rust #[test] fn across_threads() -> Result<()> { let event = Arc::new(Event::>::new());

  let check = Arc::new(AtomicI32::new(0));
  let other = Arc::new(AtomicI32::new(0));

  // Register event handler on a different thread.
  let event1 = event.clone();
  let check1 = check.clone();
  let _ = std::thread::spawn(move || {
      event1.add(&EventHandler::<i32>::new(move |_, args| {
          check1.fetch_add(*args, Ordering::Relaxed);
          Ok(())
      })).unwrap();
  }).join().unwrap();

  // Raise event on the main thread.
  assert_eq!(check.load(Ordering::Relaxed), 0);
  event.call(|delegate| delegate.Invoke(None, 1));
  assert_eq!(check.load(Ordering::Relaxed), 1);

  let event2 = event.clone();
  let other2 = other.clone();
  let check2 = check.clone();
  std::thread::scope(move |s| {
      // Call the `check` event handler on a another thread.
      let event3 = event2.clone();
      s.spawn(move || {
          assert_eq!(check2.load(Ordering::Relaxed), 1);
          event3.call(|delegate| delegate.Invoke(None, 2));
          assert_eq!(check2.load(Ordering::Relaxed), 3);
      });

      // Register and call `other` event on another thread.
      let event3 = event2.clone();
      s.spawn(move || {
          let other3 = other2.clone();

          event3.add(&EventHandler::<i32>::new(move |_, args| {
              other3.fetch_add(*args, Ordering::Relaxed);
              Ok(())
          })).unwrap();

          assert_eq!(other2.load(Ordering::Relaxed), 0);
          event.call(|delegate| delegate.Invoke(None, 4));
          assert_eq!(other2.load(Ordering::Relaxed), 4);
      });
  });

  // Check final value after the thread scope ends.
  assert_eq!(check.load(Ordering::Relaxed), 7);
  assert_eq!(other.load(Ordering::Relaxed), 4);

  Ok(())

}

</details>

sivadeilra · 2024-08-16T15:27:27Z

I'm not opposed to that test, but from experience in testing concurrent algorithms, I don't think it actually has much value. I know, more testing is always supposed to be better, but when it comes to concurrency, tests can often be flakey, or even if they aren't flakey, they don't actually exercise meaningful properties and so they just give a false sense of security.

If you think this test is valuable, and it runs in a very short period of time, I'm fine with it. However, don't drop a thread joiner; that's nearly always a code smell. If you think you've already synchronized with the thread, then retain the thread joiner and join on it at the end of the test function. It should be a no-op, but it gives me higher confidence in the correctness of the test.

kennykerr · 2024-08-16T15:31:26Z

Yep, I'd not worry about testing the concurrency itself and just test that the compiler can actually move the references around - the thing that wouldn't work if Send and Sync weren't implemented - in the simplest way possible.

Should fail to compile if not

crates/libs/core/src/event.rs

Co-authored-by: Kenny Kerr <[email protected]>

kennykerr

Appreciate all the work on this - looks great!

lifers · 2024-08-16T17:17:41Z

Glad to help! 😁 Thanks everyone for following through.

lifers added 3 commits August 11, 2024 22:28

update tests to use non-mutable event

09a4de9

revert accidental change on Array::swap

0337f17

lifers added 4 commits August 12, 2024 21:17

use cast to *mut Array<T>

33d8896

Assuming we have an exclusive access to the array

use cast to *mut Array<T>

db90add

Assuming we have an exclusive access to the array

use cast to *mut Array<T>

48024c2

Assuming we have exclusive access to the array

Merge branch 'master' of https://github.com/lifers/windows-rs

49509e8

take &mut self to swap array

253a005

use core::cell::UnsafeCell.

lifers added 2 commits August 13, 2024 13:08

clippy lint disallowed mut_from_ref

b0df0b2

impl Send and Sync for Event

fdaeb15

dpaoliello requested a review from sivadeilra August 13, 2024 21:18

sivadeilra reviewed Aug 13, 2024

View reviewed changes

crates/libs/core/src/event.rs Outdated Show resolved Hide resolved

sivadeilra requested changes Aug 13, 2024

View reviewed changes

Replace Array<T> with Rust types

9e4a471

The `Option` that wraps `Arc<[Delegate<T>]>` will let us assign `None` instead of creating `Arc::new([])` with no space penalty.

kennykerr reviewed Aug 15, 2024

View reviewed changes

crates/libs/core/src/event.rs Outdated Show resolved Hide resolved

kennykerr reviewed Aug 16, 2024

View reviewed changes

crates/libs/core/src/event.rs Show resolved Hide resolved

kennykerr requested review from sivadeilra and removed request for sivadeilra August 16, 2024 13:52

lifers added 2 commits August 16, 2024 10:07

construct Arc directly from iterator

9449391

Now we only need to allocate once for Arc

Add Default, Send, and Sync trait

d987a51

kennykerr reviewed Aug 16, 2024

View reviewed changes

crates/libs/core/src/event.rs Show resolved Hide resolved

kennykerr reviewed Aug 16, 2024

View reviewed changes

crates/libs/core/src/event.rs Outdated Show resolved Hide resolved

drop old delegates after lock is released

520b595

lifers added 3 commits August 16, 2024 11:40

Prefer more explicit drop order

dff4499

Test for Send and Sync

208e52f

Should fail to compile if not

formatting

bf72954

kennykerr reviewed Aug 16, 2024

View reviewed changes

crates/libs/core/src/event.rs Outdated Show resolved Hide resolved

Take the guard as late as possible

b0bc9b1

Co-authored-by: Kenny Kerr <[email protected]>

kennykerr approved these changes Aug 16, 2024

View reviewed changes

sivadeilra approved these changes Aug 16, 2024

View reviewed changes

kennykerr merged commit f19edde into microsoft:master Aug 16, 2024
79 checks passed

kennykerr mentioned this pull request Aug 19, 2024

Consistent allocation failure handling #3209

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Immutable Event implementation #3198

Immutable Event implementation #3198

lifers commented Aug 12, 2024

lifers commented Aug 12, 2024

kennykerr commented Aug 12, 2024

kennykerr commented Aug 12, 2024

lifers commented Aug 13, 2024 •

edited

Loading

kennykerr commented Aug 13, 2024

lifers commented Aug 13, 2024

kennykerr commented Aug 13, 2024

kennykerr commented Aug 13, 2024

dpaoliello commented Aug 13, 2024

sivadeilra commented Aug 13, 2024

sivadeilra left a comment

lifers commented Aug 13, 2024

sivadeilra left a comment

kennykerr commented Aug 14, 2024

sivadeilra commented Aug 14, 2024 •

edited

Loading

kennykerr commented Aug 15, 2024

lifers commented Aug 15, 2024 •

edited

Loading

kennykerr commented Aug 15, 2024

sivadeilra commented Aug 15, 2024 •

edited

Loading

lifers commented Aug 15, 2024

kennykerr commented Aug 15, 2024

sivadeilra commented Aug 15, 2024

sivadeilra commented Aug 15, 2024

lifers commented Aug 16, 2024

kennykerr commented Aug 16, 2024

lifers commented Aug 16, 2024

sivadeilra commented Aug 16, 2024

kennykerr commented Aug 16, 2024

kennykerr left a comment

lifers commented Aug 16, 2024

Immutable Event implementation #3198

Immutable Event implementation #3198

Conversation

lifers commented Aug 12, 2024

lifers commented Aug 12, 2024

kennykerr commented Aug 12, 2024

kennykerr commented Aug 12, 2024

lifers commented Aug 13, 2024 • edited Loading

kennykerr commented Aug 13, 2024

lifers commented Aug 13, 2024

kennykerr commented Aug 13, 2024

kennykerr commented Aug 13, 2024

dpaoliello commented Aug 13, 2024

sivadeilra commented Aug 13, 2024

sivadeilra left a comment

Choose a reason for hiding this comment

lifers commented Aug 13, 2024

sivadeilra left a comment

Choose a reason for hiding this comment

kennykerr commented Aug 14, 2024

sivadeilra commented Aug 14, 2024 • edited Loading

kennykerr commented Aug 15, 2024

lifers commented Aug 15, 2024 • edited Loading

kennykerr commented Aug 15, 2024

sivadeilra commented Aug 15, 2024 • edited Loading

lifers commented Aug 15, 2024

kennykerr commented Aug 15, 2024

sivadeilra commented Aug 15, 2024

sivadeilra commented Aug 15, 2024

lifers commented Aug 16, 2024

kennykerr commented Aug 16, 2024

lifers commented Aug 16, 2024

sivadeilra commented Aug 16, 2024

kennykerr commented Aug 16, 2024

kennykerr left a comment

Choose a reason for hiding this comment

lifers commented Aug 16, 2024

lifers commented Aug 13, 2024 •

edited

Loading

sivadeilra commented Aug 14, 2024 •

edited

Loading

lifers commented Aug 15, 2024 •

edited

Loading

sivadeilra commented Aug 15, 2024 •

edited

Loading