Move persist into async part of the sweeper #3819

Open · joostjager wants to merge 1 commit into main from the sweeper-async-persist branch

Conversation

@joostjager (Contributor) commented Jun 2, 2025

Prepares for making the kv store async in #3778. Otherwise it might be necessary to use block_on in the sweeper. For block_on, a runtime would be needed.

@ldk-reviews-bot commented Jun 2, 2025

👋 Thanks for assigning @valentinewallace as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@joostjager requested a review from tnull June 2, 2025 13:53
@joostjager marked this pull request as ready for review June 4, 2025 09:50
@tnull (Contributor) left a comment

Took a first look and left some comments.

Besides, I still think if we go this way we should just also switch to use a Notifier to wake the background processor to trigger persistence.

let result = {
self.regenerate_and_broadcast_spend_if_necessary_internal().await?;

// If there is still dirty state, we need to persist it.
Contributor

This is a weird pattern. Why not move persistence out of regenerate_and_broadcast_spend_if_necessary_internal and just set the dirty flag there?
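
As a rough illustration of that split (hypothetical, simplified stand-ins for the real OutputSweeper types, not the actual code): the internal helper only mutates in-memory state and marks it dirty, while its caller persists in a single place.

```rust
use std::io;
use std::sync::Mutex;

// Hypothetical, simplified stand-ins for the real sweeper types.
struct SweeperState {
	dirty: bool,
	// ... outputs, best_block, etc. elided
}

struct Sweeper {
	state: Mutex<SweeperState>,
}

impl Sweeper {
	// The internal helper only mutates in-memory state and marks it dirty.
	fn regenerate_and_broadcast_spend_if_necessary_internal(state: &mut SweeperState) {
		// ... regenerate and broadcast the spending tx ...
		state.dirty = true;
	}

	// The caller decides when (and how) to persist, in a single place.
	fn regenerate_and_broadcast_spend_if_necessary(&self) -> Result<(), io::Error> {
		let mut state = self.state.lock().unwrap();
		Self::regenerate_and_broadcast_spend_if_necessary_internal(&mut state);
		if state.dirty {
			self.persist_state(&state)?;
			state.dirty = false;
		}
		Ok(())
	}

	fn persist_state(&self, _state: &SweeperState) -> Result<(), io::Error> {
		// ... write the serialized state to the KVStore ...
		Ok(())
	}
}
```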

Contributor Author

I looked at that, but I think we have to persist before we broadcast? Or is that not necessary?

Contributor

I looked at that, but I think we have to persist before we broadcast? Or is that not necessary?

Hmm, not sure if necessary, but yes, it's probably cleaner to persist that we broadcasted before we attempt it.

However, I think you can avoid the entire 'if it's still dirty'-pattern if you'd trigger the repersistence via a Notifier rather than through the call to regenerate_and_broadcast_if_necessary, as discussed below.

Contributor Author

Discussed offline. Probably still need a dirty flag to prevent unnecessary persists when only sweeps need to be checked.

Contributor

Discussed offline. Probably still need a dirty flag to prevent unnecessary persists when only sweeps need to be checked.

Well, this was never the question; the question was whether we need to run the 'if it's still dirty'-pattern after we may have just persisted. And to avoid that, we should just switch to using the notifier, as we intend to do that anyway.
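
As a rough illustration of the notifier route (a sketch using tokio::sync::Notify as a stand-in for LDK's own notifier; the types and names are illustrative, not the PR's code): the chain notification handlers only mutate state, mark it dirty, and wake a background task, which performs the actual persist.

```rust
use std::sync::{Arc, Mutex};

use tokio::sync::Notify;

struct SweeperState {
	dirty: bool,
	// ... outputs, best_block, etc. elided
}

struct Sweeper {
	state: Mutex<SweeperState>,
	persist_notifier: Notify,
}

impl Sweeper {
	// Chain notification handler: only mutate in-memory state and wake the persister.
	fn best_block_updated(&self) {
		let mut state = self.state.lock().unwrap();
		// ... update best_block, prune swept outputs, etc. ...
		state.dirty = true;
		self.persist_notifier.notify_one();
	}

	// Background task: persist whenever woken while the state is dirty.
	async fn persist_loop(self: Arc<Self>) {
		loop {
			self.persist_notifier.notified().await;
			let is_dirty = self.state.lock().unwrap().dirty;
			if is_dirty {
				// ... serialize under the lock, write via the (async) KVStore,
				// then clear the dirty flag ...
			}
		}
	}
}
```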

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@joostjager (Contributor Author)

Besides, I still think if we go this way we should just also switch to use a Notifier to wake the background processor to trigger persistence.

You mean as part of this PR? I agree that that would be nicer than a timer, but it seems orthogonal to what we are doing here?

@tnull (Contributor) commented Jun 4, 2025

You mean as part of this PR? I agree that that would be nicer than a timer, but it seems orthogonal to what we are doing here?

Yes, I presume it would just be another (~20 LoC?) commit. I don't consider it orthogonal to changing the persistence scheme of the OutputSweeper; it's very much in line with / related to the effort in this PR.

@joostjager (Contributor Author)

It is of course related, but it is not necessary to do it in this PR? For unblocking the async kv store, what's in this PR is all I need.

@tnull (Contributor) commented Jun 4, 2025

It is of course related, but it is not necessary to do it in this PR? For unblocking the async kv store, what's in this PR is all I need.

See #3819 (comment): I think you can avoid that 'double-check' pattern if you have repersistence triggered via a notifier.

@joostjager force-pushed the sweeper-async-persist branch 4 times, most recently from 7cfec6a to 6138980 (June 9, 2025 14:44)
@joostjager requested a review from tnull June 9, 2025 14:44
@joostjager (Contributor Author) commented Jun 9, 2025

@tnull @TheBlueMatt and I have also been looking ahead to the follow-up to this, where the kv store is made async. We need to ensure that await doesn't happen inside the sweeper state lock.

One way of dealing with that is to obtain the future inside the lock and then await it outside of it, and to document on the trait that the call order needs to be preserved in the kv store implementation.
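
A sketch of that pattern (the async KVStore trait shape shown here is hypothetical, not the actual interface): the write future is created while the lock is held, so the store sees writes in the order the state changed, but it is awaited only after the lock has been released.

```rust
use std::future::Future;
use std::io;
use std::pin::Pin;
use std::sync::{Arc, Mutex};

// Hypothetical async KVStore shape, for illustration only; the real trait will differ.
trait AsyncKVStore {
	fn write(
		&self, key: String, value: Vec<u8>,
	) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + Send>>;
}

struct SweeperState {
	dirty: bool,
	// ... outputs, best_block, etc. elided
}

struct Sweeper<K: AsyncKVStore> {
	state: Mutex<SweeperState>,
	kv_store: Arc<K>,
}

impl<K: AsyncKVStore> Sweeper<K> {
	async fn persist_if_dirty(&self) -> Result<(), io::Error> {
		// Build the write future while holding the lock, so the kv store sees
		// writes in the same order in which the state changed...
		let fut = {
			let mut state = self.state.lock().unwrap();
			if !state.dirty {
				return Ok(());
			}
			let serialized = Vec::new(); // ... serialize the state here ...
			// Clearing the flag before the await vs. re-acquiring the lock
			// afterwards is exactly the subtlety discussed further below.
			state.dirty = false;
			self.kv_store.write("sweeper_state".to_string(), serialized)
		};
		// ...but await it only after the lock has been dropped, so no await
		// happens while the sweeper state mutex is held.
		fut.await
	}
}
```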

@joostjager force-pushed the sweeper-async-persist branch from 6138980 to f71c795 (June 9, 2025 16:44)
codecov bot commented Jun 9, 2025

Codecov Report

Attention: Patch coverage is 62.50000% with 24 lines in your changes missing coverage. Please review.

Project coverage is 89.88%. Comparing base (0848e7a) to head (f71c795).

Files with missing lines      Patch %   Lines
lightning/src/util/sweep.rs   62.50%    20 Missing and 4 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3819   +/-   ##
=======================================
  Coverage   89.88%   89.88%           
=======================================
  Files         160      160           
  Lines      129654   129668   +14     
  Branches   129654   129668   +14     
=======================================
+ Hits       116534   116547   +13     
- Misses      10425    10428    +3     
+ Partials     2695     2693    -2     

☔ View full report in Codecov by Sentry.

@@ -616,6 +633,9 @@ where
 				);
 				e
 			})
+			.map(|_| {
+				sweeper_state.dirty = false;
Contributor Author

This is again problematic when converting to async. If we don't hold the lock across the write await, we can't just update that dirty flag.

Contributor

Right, see above: we'd need to keep persist_state taking a non-mutable reference and re-acquire the Mutex to unset the flag. However, that of course might be just as race-y as setting/unsetting the AtomicBool.
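
Roughly, that variant could look like the following sketch (simplified, hypothetical types); the race is that a state change made while the write is in flight gets its dirty flag cleared without ever having been persisted.

```rust
use std::io;
use std::sync::Mutex;

struct SweeperState {
	dirty: bool,
	// ... remaining fields elided
}

struct Sweeper {
	state: Mutex<SweeperState>,
}

impl Sweeper {
	// Stand-in for an async KVStore write.
	async fn write_to_kv_store(&self, _serialized: Vec<u8>) -> Result<(), io::Error> {
		Ok(())
	}

	async fn persist_and_clear(&self) -> Result<(), io::Error> {
		// Take an immutable snapshot under the lock, then drop the lock.
		let serialized = {
			let state = self.state.lock().unwrap();
			if !state.dirty {
				return Ok(());
			}
			Vec::new() // ... serialize the state here ...
		};
		self.write_to_kv_store(serialized).await?;
		// Racy: if another task set `dirty` again while the write was in
		// flight, clearing it here drops that update until the next persist
		// trigger, much like an AtomicBool-based flag would.
		self.state.lock().unwrap().dirty = false;
		Ok(())
	}
}
```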

Contributor Author

Indeed, no more atomic update. But regardless of an in-state or independent dirty flag, this doesn't seem like a great direction either way.

Contributor Author

But also it may be unavoidable?

@joostjager removed the request for review from tnull June 10, 2025 11:29
 	}
 
-	fn persist_state(&self, sweeper_state: &SweeperState) -> Result<(), io::Error> {
+	/// Flushes the current state to the persistence layer and marks the state as clean.
+	fn flush_state(&self, sweeper_state: &mut SweeperState) -> Result<(), io::Error> {
Contributor

Not sure I'm onboard with this name change, as I have no intuition what 'flush_state' would mean. Plus, I still think this shouldn't take &mut SweeperState, but rather &PersistentState now.

Contributor Author

Made the change. Name back to persist_state and taking an immutable ref. It does lead, as mentioned before, to code duplication. The dirty flag needs to be reset at multiple locations.

@joostjager (Contributor Author)

I pushed a commit to verify whether this PR will actually work with an async kv store, and not holding the mutex across awaits. Also not that easy.

@tnull (Contributor) commented Jun 11, 2025

I pushed a commit to verify whether this PR will actually work with an async kv store, and not holding the mutex across awaits. Also not that easy.

Okay, but I think it would be preferable to make the Async-KVStore changes in the Async-KVStore PR. Adding them here just makes following the changes harder.

@joostjager (Contributor Author)

Just pushed the commit to discuss whether this is the direction we want to go in. Should have made that clearer.

@joostjager force-pushed the sweeper-async-persist branch 2 times, most recently from 67d9f8f to eec1c6b (June 11, 2025 14:55)
@joostjager (Contributor Author)

To avoid confusion, I've parked the gist of the follow-up here: joostjager/rust-lightning@sweeper-async-persist...joostjager:rust-lightning:sweeper-async-kvstore

@joostjager force-pushed the sweeper-async-persist branch 2 times, most recently from 36ab2ee to 663254e (June 12, 2025 07:09)
@joostjager requested a review from tnull June 12, 2025 07:11
@joostjager force-pushed the sweeper-async-persist branch from 663254e to 84ce2f2 (June 13, 2025 08:23)
@tnull (Contributor) left a comment

I think the current approach is fine, but if we'd want to make setting/unsetting the dirty flag safer (i.e., ensure that we don't forget something going forward), we could consider using an RAII pattern or similar to isolate the modifications and re-persists of the state further.
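
One possible shape for such a guard (purely illustrative, not part of the PR): mutating accesses go through an RAII wrapper that marks the state dirty when it is dropped, so call sites cannot forget to set the flag.

```rust
use std::ops::{Deref, DerefMut};
use std::sync::{Mutex, MutexGuard};

struct SweeperState {
	dirty: bool,
	// ... outputs, best_block, etc. elided
}

// Guard that derefs to the state and marks it dirty on drop, so every
// mutation made through it is guaranteed to be flagged for persistence.
struct DirtyGuard<'a>(MutexGuard<'a, SweeperState>);

impl<'a> Deref for DirtyGuard<'a> {
	type Target = SweeperState;
	fn deref(&self) -> &SweeperState {
		&self.0
	}
}

impl<'a> DerefMut for DirtyGuard<'a> {
	fn deref_mut(&mut self) -> &mut SweeperState {
		&mut self.0
	}
}

impl<'a> Drop for DirtyGuard<'a> {
	fn drop(&mut self) {
		self.0.dirty = true;
	}
}

struct Sweeper {
	state: Mutex<SweeperState>,
}

impl Sweeper {
	// All mutating call sites go through this accessor.
	fn state_mut(&self) -> DirtyGuard<'_> {
		DirtyGuard(self.state.lock().unwrap())
	}
}
```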

@@ -382,7 +382,8 @@ where
 		output_spender: O, change_destination_source: D, kv_store: K, logger: L,
 	) -> Self {
 		let outputs = Vec::new();
-		let sweeper_state = Mutex::new(SweeperState { outputs, best_block });
+		let sweeper_state =
+			Mutex::new(SweeperState { persistent: PersistentSweeperState { outputs, best_block } });
Contributor

nit: Persistent is a bit clunky terminology, IMO, but nbd.

Contributor Author

Discussed offline, couldn't come up with something that is clearly better. Leaving as is.

Comment on lines 385 to 388
let sweeper_state = Mutex::new(SweeperState {
persistent: PersistentSweeperState { outputs, best_block },
dirty: false,
});
Contributor

IMO adding the wrapper struct causes us to litter the code with .persistent in a lot of places where it's not relevant what's written to disk and what's not, and I'm not sure of the concrete benefit beyond the principle. Not worth holding up the PR, though.

Contributor

Oookay, given that everybody else but me doesn't care or rather prefers to merge the state objects, let's go this way. 🤷

@joostjager Mind dropping the persistent field and merging everything into SweeperState afterall?

Contributor Author

Ok. Reverted the sub-state.

@joostjager requested a review from tnull June 13, 2025 14:39
@ldk-reviews-bot

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

To prepare for an async kv store trait that must be awaited, this commit
moves the kv store calls from the chain notification handlers to the
background process. It uses a dirty flag to communicate that there is
something to persist. The block height is part of the persisted data. If
that data does not make it to disk, the chain notifications are replayed
after restart.
@ldk-reviews-bot

🔔 2nd Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@tnull (Contributor) left a comment

LGTM, mod one question

Would also be great to hear whether we still intend to go the notifier route in a follow-up at least.

@@ -531,7 +542,8 @@ where
 			.collect();
 
 		if respend_descriptors.is_empty() {
-			// It could be that a tx confirmed and there is now nothing to sweep anymore.
+			// It could be that a tx confirmed and there is now nothing to sweep anymore. If there is dirty state,
Contributor

If we now persist in the if !has_respends, why aren't we persisting here, if dirty is set? Seems a bit weird to take a different approach in each case?
