Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flag to disable broadcasting when it's dangerous due to information loss #1593

Closed
wants to merge 10 commits into from

Conversation

QPotato
Copy link

@QPotato QPotato commented Jul 4, 2022

Work in progress PR. The functionality should be ready, but I'm still figuring out how to write tests for this. I'm opening the PR to run CI and use it as reference to ask questions about the tests on the issue.

This PR adds a new field to ChannelMonitor to track when a channel was force closed with should_broadcast: false. We then use this flag to disabled automatic braodcasting of the latest holder commitment transaction in cases where doing so could be dangerous due to missing information. In particular, adding this solves the case where the transaction would have been broadcast on deserialization of the monitor.

I also fixed some minor typos I found on docstrings while reading through the code.

Closes #1563

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good. I do wonder why CI didn't run...sometimes GH Actions is just broken, I guess - maybe try git commit --amend && git push -f to push a new commit and see if it works. As for testing, you should be able to write a functional test to call force_close_without_broadcasting_txn and then reload the channelmanager and see if the ChannelMonitor automatically rebroadcasts (see test_manager_serialize_deserialize_inconsistent_monitor for a reload test that hits the rebroadcast case).

@@ -1934,6 +1966,8 @@ impl<Signer: Sign> ChannelMonitorImpl<Signer> {
self.pending_monitor_events.push(MonitorEvent::CommitmentTxConfirmed(self.funding_info.0));
}



Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spurious newlines here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 84d8de6

}
}

pub(crate) fn force_broadcast_latest_holder_commitment_txn_unsafe<B: Deref, L: Deref>(&mut self, broadcaster: &B, logger: &L)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be pub(crate)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, I mostly followed the signature for the previous function which was pub(crate). But since both are supposed to be used through their respective wrappers, I guess we can just remove it? I did so in c262be2

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, to "drop" pub(crate) I meant make it module-visible (ie no visibility specifier present at all), not pub :)

@QPotato QPotato changed the title [WIP] Add flag to disable broadcasting when it's dangerous due to information loss Add flag to disable broadcasting when it's dangerous due to information loss Jul 11, 2022
Comment on lines 4822 to 4825
assert!(nodes[0].chain_monitor.watch_channel(deserialized_chanmon.get_funding_txo().0, deserialized_chanmon).is_ok());
check_added_monitors!(nodes[0], 1);
nodes[0].node.get_and_clear_pending_msg_events();
nodes[0].node.get_and_clear_pending_events();
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I don't know why I need this in my test. I only know that without these lines my test would panic while trying to drop() the monitor or the node.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so we use that to ensure that tests are always complete in checking for all the events that are generated during the course of the test - in general we should strive to that rather than simply clearing events. In this case it looks like you're missing a check_closed_event in the test body when the channel closes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, that worked without needing to clear the messages. Fixed in bec767b.

@@ -948,6 +952,7 @@ impl<Signer: Sign> Writeable for ChannelMonitorImpl<Signer> {

self.lockdown_from_offchain.write(writer)?;
self.holder_tx_signed.write(writer)?;
self.allow_automated_broadcast.write(writer)?;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I thought I didn't need to write and read the new field. My first understanding was that monitor updates are replayed on deserialization. After some hours trying to make the test work and reading through the code, I realized that probably not the case and we actually serialize and deserialize the whole state so I added it here too and it seems to work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to write it, but we cannot add new fields to the "normal" write sequence, they have to go in (odd) TLVs so that old nodes can tell what they are and ignore them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I changed this in bec767b.

@QPotato QPotato requested a review from TheBlueMatt July 11, 2022 00:24
@QPotato
Copy link
Author

QPotato commented Jul 11, 2022

Huh, no idea why CI is not running. I might be doing something wrong?

}
self.pending_monitor_events.push(MonitorEvent::CommitmentTxConfirmed(self.funding_info.0));
} else {
log_error!(logger, "You have a toxic holder commitment transaction avaible in channel monitor, read comment in ChannelMonitor::get_latest_holder_commitment_txn to be informed of manual action to take");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, lets remove reference to "You have a toxic holder commitment transaction available" - since #1564 that's no longer actually true, if there's some toxic holder commitment transaction we panic instead of updating any state. Rather, the bool is now generally a result of the user deliberately calling "force_close_without_broadcasting". Its not really your bug but would be good to fix it up here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I downgraded this to an info log in cc2e3c5.
I also removed the other instance of this message in update_monitor on de5266c simplifying the logic there a bit and keeping the info log in the case the case were we have both !*should_broadcast && self.holder_tx_signed. Though let me know if I didn't get this right and need to further change it.

self.inner.lock().unwrap().maybe_broadcast_latest_holder_commitment_txn(broadcaster, logger)
}

pub(crate) fn force_broadcast_latest_holder_commitment_txn_unsafe<B: Deref, L: Deref>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My compiler says this is never used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. We talked before about adding this method so that the users still has an option to broadcast if they think they know what they are doing, but I think in that case we want it to be plain pub, right? Changed this in 9084457

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think you also suggested this in another comment above. I rediscovered visibility rules, lol.

Comment on lines 4822 to 4825
assert!(nodes[0].chain_monitor.watch_channel(deserialized_chanmon.get_funding_txo().0, deserialized_chanmon).is_ok());
check_added_monitors!(nodes[0], 1);
nodes[0].node.get_and_clear_pending_msg_events();
nodes[0].node.get_and_clear_pending_events();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so we use that to ensure that tests are always complete in checking for all the events that are generated during the course of the test - in general we should strive to that rather than simply clearing events. In this case it looks like you're missing a check_closed_event in the test body when the channel closes.

@wpaulino wpaulino self-requested a review July 11, 2022 22:12
@QPotato QPotato requested a review from TheBlueMatt July 20, 2022 04:01
@TheBlueMatt
Copy link
Collaborator

I think the reason CI may not be running is you need to rebase this onto rust-lightning git, ie remove the merge commits. If you're not sure how to do that, Bitcoin Core's contributor guide has some instructions.

@QPotato
Copy link
Author

QPotato commented Jul 25, 2022

Still trying to figure out the rebase thing. It's kinda different to what I'm used to do with git. I think I messed up the branch a bit when trying it. Will give it another shot tomorrow. Thanks.

@TheBlueMatt
Copy link
Collaborator

No, the branch looks good, for the life of me I have no idea why Github isn't even trying to run CI...

self.inner.lock().unwrap().maybe_broadcast_latest_holder_commitment_txn(broadcaster, logger)
}

// Broadcasts the latest commitment transaction, even if we can't ensure it's safe to do so
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation needs to start with three "/"s to count. Also, we should add a bunch more detail here - why would a user want to use this, in what context is it safe, when is it not safe, etc.

@@ -1938,7 +1959,25 @@ impl<Signer: Sign> ChannelMonitorImpl<Signer> {
}
}

pub(crate) fn broadcast_latest_holder_commitment_txn<B: Deref, L: Deref>(&mut self, broadcaster: &B, logger: &L)
// Broadcasts the latest commitment transaction only if it's safe to do so.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, needs three "/"s, needs to describe when a user wants to use this, not just what it does.

@@ -726,6 +726,10 @@ pub(crate) struct ChannelMonitorImpl<Signer: Sign> {
counterparty_node_id: Option<PublicKey>,

secp_ctx: Secp256k1<secp256k1::All>, //TODO: dedup this a bit...

// Used to track that the channel was updated with ChannelForceClosed {should_broadcast: false}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter so much for internal fields, but good to make this a doc comment with three "/"s rather than two.

@TheBlueMatt
Copy link
Collaborator

Ah, somehow github has flagged you as a spammer? Can you reach out to github via the link at https://github.com/QPotato/rust-lightning/actions ?

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, it looks like I missed that you pushed a commit here, sadly github doesn't generate notifications for simple pushes, so I don't always see it. I left one further comment here, but it looks like a few of the comments from my previous review weren't yet addressed. Do you still have a desire to land this PR?

@@ -3417,6 +3453,8 @@ impl<'a, Signer: Sign, K: KeysInterface<Signer = Signer>> ReadableArgs<&'a K>
counterparty_node_id,

secp_ctx,

allow_automated_broadcast: allow_automated_broadcast.unwrap(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be unwrap_or(true) as old data won't have the field in it, so unwrap could panic.

@QPotato
Copy link
Author

QPotato commented Aug 22, 2022

Sorry for the absence, I've had very few free time lately. I'll try to push a new commit this week but feel free to take over if ai'm delaying a release or something. Thanks for all the review so far.

@TheBlueMatt
Copy link
Collaborator

No particular rush currently as long as you're going to pick it back up. If it becomes higher priority I'll take it and open a new PR, but will comment here before I do.

- Add a functional test
- Fix some typos I found while reading the code
- Use `check_closed_event` instead of clearing pending events
- Write new property and read it from TLVs
…hat it's exposed to library users and not an unreferenced symbol.
…e info log in the case the txn is already signed.
@QPotato
Copy link
Author

QPotato commented Sep 5, 2022

Rebased to latest master to try to make CI pass. I now understand why we use rebase instead of merge.
Still need to expand docs a bit, as requested. I'm doing some reading so that I can come up with something useful.

@@ -2148,7 +2169,25 @@ impl<Signer: Sign> ChannelMonitorImpl<Signer> {
}
}

pub(crate) fn broadcast_latest_holder_commitment_txn<B: Deref, L: Deref>(&mut self, broadcaster: &B, logger: &L)
/// Broadcasts the latest commitment transaction only if it's safe to do so.
pub fn maybe_broadcast_latest_holder_commitment_txn<B: Deref, L: Deref>(&mut self, broadcaster: &B, logger: &L)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire type is private, so there's no reason to make these pub. If anything its a bit confusing because it looks like users/other things can call this (and the next function) but they cant.

// If we generated a MonitorEvent::CommitmentTxConfirmed, the ChannelManager
// will still give us a ChannelForceClosed event with !should_broadcast, but we
// shouldn't print the scary warning above.
self.allow_automated_broadcast = *should_broadcast;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If its already set to false we should probably leave it at false and not update it. I dont know that it actually matters, but it seems safer.

}

/// Broadcasts the latest commitment transaction, even if we can't ensure it's safe to do so
/// due to missing information.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just list the reasons why we force-closed without broadcast here, which I think are just -

  • channel monitor update(s) failed to be applied, likely because of a persistence error
  • user called the ChannelManager function to force-close without broadcast.

@TheBlueMatt
Copy link
Collaborator

Looks like this needs rebase again.

@wpaulino wpaulino removed their request for review July 28, 2023 18:45
@TheBlueMatt
Copy link
Collaborator

Closing as abandoned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Don't Always Broadcast latest state on startup if we'd previously closed-without-broadcast
2 participants