Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sdk): Dropping an UpdatesSubscriber release its reader token for the garbage collector #4102

Conversation

Hywan
Copy link
Member

@Hywan Hywan commented Oct 9, 2024

The event cache stores its events in a linked chunk. The linked chunk
supports updates (ObservableUpdates) via LinkedChunk::updates().
This ObservableUpdates receives all updates that are happening inside
the LinkedChunk. An ObservableUpdates wraps UpdatesInner, which
is the real logic to handle multiple update readers. Each reader has a
unique ReaderToken. UpdatesInner has a garbage collector that drops
all updates that are read by all readers. And here comes the problem.

A category of readers are UpdatesSubscriber, returned by
ObservableUpdates::subscribe(). When an UpdatesSubscriber is
dropped, its reader token was still alive, thus preventing the garbage
collector to clear all its pending updates: they were kept in memory
for the eternity.

This patch implements Drop for UpdatesSubscriber to correctly remove
its ReaderToken from UpdatesInner. This patch also adds a test that
runs multiple subscribers, and when one is dropped, its pending updates
are collected by the garbage collector.


@Hywan Hywan requested a review from a team as a code owner October 9, 2024 15:47
@Hywan Hywan requested review from andybalaam and removed request for a team October 9, 2024 15:47
Copy link

codecov bot commented Oct 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.59%. Comparing base (95ae5d1) to head (02e0e9c).
Report is 81 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4102      +/-   ##
==========================================
- Coverage   84.70%   84.59%   -0.11%     
==========================================
  Files         269      269              
  Lines       28766    28759       -7     
==========================================
- Hits        24366    24329      -37     
- Misses       4400     4430      +30     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@bnjbvr bnjbvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks. Need a few clarifications in the test IMO, but good to merge after comments have been addressed 👍

linked_chunk.push_items_back(['b']);
linked_chunk.push_items_back(['c']);

// The waker must have been called only once for the two updates.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand why, but can you elaborate in the comment? (It's because we only add a waker whenever we explicitly start polling, which we did once before for each stream.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A waker is consumed after it's been called. A waker is registered when a Future or a Stream is polled. I'm updating the comment.

Comment on lines +747 to +754
// For the sake of this test, we also need to advance the main reader token.
let _ = linked_chunk.updates().unwrap().take();
let _ = linked_chunk.updates().unwrap().take();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh? Why do we need to do it twice?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's something I want to change. The garbage collector is run before take_with_token is called (which is called by take). I think it's better to run the garbage collector after take_with_token is called: “I read new items, I clean read items” vs. “I clean previously read items, I read new ones”. Right now, we are in the second pattern, but I believe the first is preferable. I'll address that in the next pull request.

@bnjbvr bnjbvr removed the request for review from andybalaam October 10, 2024 10:41
…r the GC.

The event cache stores its events in a linked chunk. The linked chunk
supports updates (`ObservableUpdates`) via `LinkedChunk::updates()`.
This `ObservableUpdates` receives all updates that are happening inside
the `LinkedChunk`. An `ObservableUpdates` wraps `UpdatesInner`, which
is the real logic to handle multiple update readers. Each reader has a
unique `ReaderToken`. `UpdatesInner` has a garbage collector that drops
all updates that are read by all readers. And here comes the problem.

A category of readers are `UpdatesSubscriber`, returned by
`ObservableUpdates::subscribe()`. When an `UpdatesSubscriber` is
dropped, its reader token was still alive, thus preventing the garbage
collector to clear all its pending updates: they were kept in memory
for the eternity.

This patch implements `Drop` for `UpdatesSubscriber` to correctly remove
its `ReaderToken` from `UpdatesInner`. This patch also adds a test that
runs multiple subscribers, and when one is dropped, its pending updates
are collected by the garbage collector.
@Hywan Hywan force-pushed the feat-sdk-event-cache-linked-chunk-remove-reader-token branch from d2d5240 to 02e0e9c Compare October 21, 2024 08:35
@Hywan Hywan enabled auto-merge (rebase) October 21, 2024 08:39
@Hywan Hywan merged commit a7f6997 into matrix-org:main Oct 21, 2024
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants