Ensure successful message propagation in case of disconnection mid-handshake #2725

Merged (2 commits) on Feb 5, 2024

@shaavan shaavan commented Nov 10, 2023

Resolves #2096

  • This PR ensures that we don't immediately force-close OutboundV1Channel in case of disconnection mid-handshake.
  • Instead, we rebroadcast the SendOpenChannel message if the peer reconnects within time.

shaavan commented Nov 10, 2023

This PR makes the following interpretation of the issue and follows the solution accordingly:

  1. Don't allow channel creation if the peer is already disconnected.
  2. If the peer is connected when "create_channel" is called but disconnects just as the call is made, allow channel creation and resend the open_channel message when the peer reconnects.
  3. Fail the creation after a few timer ticks if the peer does not reconnect in time.

If my interpretation of the problem is wrong, do let me know and I shall be glad to correct it! :)

shaavan commented Nov 10, 2023

Also, this PR has been set to draft because the tests are incomplete and only partially test the added code.
Any help or suggestion in making the tests work is much welcome!

@TheBlueMatt TheBlueMatt left a comment

What's the status here? You have it marked draft, do you want feedback? What kind of feedback/review is this ready for?

shaavan commented Nov 14, 2023

What's the status here. You have it marked draft, do you want feedback? What kind of feedback/review is this ready for?

Hi, @TheBlueMatt
I have completed the implementation in the PR, but the problem I am facing is writing up the test for it.

Specifically, I am having trouble writing a test for the case where the peer reconnects in time and the open_channel message is sent to it, because it conflicts with how the rest of the test codebase is set up.

So I wanted to get a general Approach ACK, before I go about hacking in the test to make them work.

shaavan commented Nov 25, 2023

Updated from pr2725.01 -> pr2725.02 (diff)
Addressed @TheBlueMatt comment

Changes:

  1. Rebased on Main.
  2. Used data in Channel to reconstruct the SendOpenChannel message when rebroadcasting.
  3. Clean up the faulty test.

Note:

  1. The PR has been moved from "Draft" to "Ready for Review"
  2. The tests were becoming cumbersome and hacky, so I have temporarily cleaned them up.
  3. I am looking for a general Approach ACK on this PR before working on creating a test for this approach.

@shaavan shaavan marked this pull request as ready for review November 25, 2023 11:11
shaavan commented Nov 25, 2023

Updated from pr2725.02 -> pr2725.03 (diff)

Changes:

  • Updated the peer_disconnected function to keep the unnotified channel around in case of sudden disconnection.
  • Also updated the peer_connected code to handle this change in behavior.

Thanks for the suggestion, @wpaulino!

@TheBlueMatt TheBlueMatt left a comment

Sorry for the delay here, busy with thanksgiving travel and other stuff.

shaavan commented Dec 3, 2023

Updated from pr2725.03 -> pr2725.04 (diff)
Addressed @TheBlueMatt suggestion

Changes:

  1. Rebased on Main.
  2. Revamped the approach. Instead of sending the SendOpenChannel message later, we track if we have received an accepted channel message from them. This simplifies the approach and is in line with the current code architecture.
  3. Added test to verify this new behavior.

Thank you very much, @TheBlueMatt, for this new idea to solve this problem.

wpaulino commented Dec 4, 2023

Instead of sending the SendOpenChannel message later, we track if we have received an accepted channel message from them.

We still need to resend open_channel if they reconnect. Also, waiting for accept_channel doesn't seem enough, they can send it and immediately disconnect after, and we're left without a channel anyway.

shaavan commented Dec 5, 2023

@wpaulino

We still need to resend open_channel if they reconnect.

You are right! That was an oversight from my side. Thank you for pointing it out.

Also, waiting for accept_channel doesn't seem enough, they can send it and immediately disconnect after, and we're left without a channel anyway.

You are right. However, the goal of the PR is to ensure proper execution of the create_channel function. Since the scope of create_channel ends at properly sending the SendOpenChannel message, its correct execution can be ensured by confirming that we received the AcceptChannel message.
If, by chance, the peer disconnects after sending AcceptChannel when we are creating the funding, the Outbound Channel will simply be removed because it was an unfunded channel.
Maybe, we will also need to handle the case of proper transmission of funding-created messages for a suddenly disconnected peer, but that seems to be outside the scope of this PR, and we can handle it later.

wpaulino commented Dec 5, 2023

However, the goal of the PR is to ensure proper execution of the create_channel function. Since the scope of create_channel ends at properly sending the SendOpenChannel message, its correct execution can be ensured by confirming that we received the AcceptChannel message.

I don't think we necessarily care about that. What we really want is to end up with a funded channel (within a reasonable timeout) if a user requests one while being able to handle the counterparty disconnecting mid-handshake.

Maybe, we will also need to handle the case of proper transmission of funding-created messages for a suddenly disconnected peer, but that seems to be outside the scope of this PR, and we can handle it later.

Typically nodes forget all about channels before sending/receiving funding_signed, so we'll need to retransmit all messages starting from open_channel after a reconnection.

shaavan commented Dec 7, 2023

Thanks, @wpaulino, for the details about the message transmissions!

Seems like it's worth considering extending the PR from fixing the original issue to not failing channel creation mid-handshake due to channel disconnection.

I am tinkering with an approach, and I shall update the PR very soon!

shaavan commented Dec 7, 2023

Update:

Okay, so I have figured out an approach, but this depends on the behavior changes introduced in #2760.

We can track the list of msg_events we have sent during the handshake process, which can be used in case a peer disconnects midway.

Once the funding is signed, we graduate the channel from OutboundV1 to Channel. And so after that, we stop tracking the msg_events.

However, on main we currently graduate the channel as soon as we have created the funding:

let (chan, msg_opt) = match peer_state.channel_by_id.remove(temporary_channel_id) {
    ...
},

Since #2760 is already getting approval and will soon be merged, I shall build this new approach over the changes introduced there.

@TheBlueMatt
Collaborator

I don't think we need to explicitly track which message we're ready to send to our counterparty - if we are disconnected from a peer, then reconnect prior to funding, we have to restart from the open_channel step - the counterparty may have forgotten about the channel.
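
The rule described here can be sketched as a small decision function. This is a simplified model rather than LDK's code: `HandshakeStage`, `ReconnectAction` and `on_peer_reconnected` are hypothetical names introduced for illustration, and the only behavior assumed is what this thread states (the counterparty may forget a channel that is not yet funded, so the handshake restarts at `open_channel`).

```rust
/// Hypothetical stages of an outbound V1 channel handshake (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HandshakeStage {
    SentOpenChannel,
    ReceivedAcceptChannel,
    SentFundingCreated,
    /// `funding_signed` has been exchanged; both sides remember the channel.
    Funded,
}

/// What to do for a channel when its peer reconnects (illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReconnectAction {
    /// Start over: the peer may have forgotten the unfunded channel.
    ResendOpenChannel,
    /// Funded channels resume through the normal reestablish flow.
    Reestablish,
}

/// Returns the stage the channel is in after reconnecting and the action to take.
fn on_peer_reconnected(stage: HandshakeStage) -> (HandshakeStage, ReconnectAction) {
    match stage {
        HandshakeStage::Funded => (HandshakeStage::Funded, ReconnectAction::Reestablish),
        // Any pre-funding progress is discarded, so no per-message tracking is needed.
        HandshakeStage::SentOpenChannel
        | HandshakeStage::ReceivedAcceptChannel
        | HandshakeStage::SentFundingCreated => {
            (HandshakeStage::SentOpenChannel, ReconnectAction::ResendOpenChannel)
        }
    }
}
```

Because every pre-funding stage collapses to the same outcome, the sender only needs to know whether the channel is funded, not which handshake message was last sent.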

shaavan commented Dec 10, 2023

Updated from pr2725.04 -> pr2725.05 (diff)
Addressed @wpaulino and @TheBlueMatt comments

Updates:

  1. Rebased on main.
  2. Approach update

Logic:

-> Follow the standard handshake routine.

-> If we disconnect mid-handshake from our peer (that is, OutboundV1Channel is not resolved to a funded channel), we don't immediately close the OutboundV1Channel.

-> Instead, we track how long it has been since we disconnected from peers.

-> If we connect back within time, we rebroadcast SendOpenChannel corresponding to OutboundV1Channel to the peer.

-> If we do not connect back within N (=2) timer ticks, we force close and remove the channel.

Note:

  1. To also handle the case of further disconnection mid-handshake, the timer resets when the peer connects back.
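
The logic above can be modeled as a tiny state machine. This is a minimal sketch under stated assumptions, not the PR's implementation: `UnfundedOutbound`, `Outcome` and `DISCONNECT_TICK_LIMIT` are hypothetical names, and the sketch assumes the channel is closed on the tick at which the count reaches N.

```rust
/// Ticks a disconnected, unfunded outbound channel is kept: the "N (=2)"
/// from the description above. The constant name is hypothetical.
const DISCONNECT_TICK_LIMIT: u8 = 2;

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// Nothing to do for this channel.
    Keep,
    /// The peer came back in time: send `open_channel` again.
    ResendOpenChannel,
    /// The peer stayed away too long: close and remove the channel.
    CloseAndRemove,
}

/// Minimal stand-in for an unfunded outbound channel (not LDK's type).
#[derive(Debug)]
struct UnfundedOutbound {
    is_connected: bool,
    ticks_since_disconnect: u8,
}

impl UnfundedOutbound {
    fn new_connected() -> Self {
        Self { is_connected: true, ticks_since_disconnect: 0 }
    }

    /// Disconnection no longer closes the channel; it only starts the clock.
    fn peer_disconnected(&mut self) -> Outcome {
        self.is_connected = false;
        self.ticks_since_disconnect = 0;
        Outcome::Keep
    }

    /// Reconnection resets the clock and restarts the handshake.
    fn peer_connected(&mut self) -> Outcome {
        self.is_connected = true;
        self.ticks_since_disconnect = 0;
        Outcome::ResendOpenChannel
    }

    /// Called once per timer tick.
    fn timer_tick(&mut self) -> Outcome {
        if self.is_connected {
            return Outcome::Keep;
        }
        self.ticks_since_disconnect = self.ticks_since_disconnect.saturating_add(1);
        if self.ticks_since_disconnect >= DISCONNECT_TICK_LIMIT {
            Outcome::CloseAndRemove
        } else {
            Outcome::Keep
        }
    }
}
```

Resetting the counter in both `peer_disconnected` and `peer_connected` is what lets a second mid-handshake disconnection get a fresh window, as the note above describes.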

@shaavan shaavan changed the title Don't Discard the create_channel for a suddenly disconnected peer Ensure successfully message propagation in case of disconnection mid-handshake Dec 10, 2023
@shaavan shaavan changed the title Ensure successfully message propagation in case of disconnection mid-handshake Ensure successful message propagation in case of disconnection mid-handshake Dec 10, 2023
shaavan commented Dec 11, 2023

Updated from pr2725.05 -> pr2725.06 (diff)

Changes:

  1. Updated the ClosureReason from HolderForceClosed -> PeerDisconnected as it is more apt.
  2. Updated the test introduced in this PR accordingly.
  3. Updated other relevant tests to account for the behavior change introduced in this PR.

@TheBlueMatt TheBlueMatt left a comment

Basically LGTM. One comment. Note that the test fixes in the last commit will need to get squashed into the commit that broke tests. We require (but don't actually check in CI) that each individual commit builds and passes tests.

shaavan commented Dec 12, 2023

Updated from pr2725.06 -> pr2725.07 (diff)
Addressed @TheBlueMatt comment

Update:

  1. Squashed the updates introduced in tests with the changes that broke them so that each commit now individually passes all the tests.

codecov-commenter commented Dec 12, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (5bf58f0) 89.14% compared to head (0fd3d31) 89.14%.

❗ Current head 0fd3d31 differs from pull request most recent head a6a6b48. Consider uploading reports for the commit a6a6b48 to get more accurate results

Files                                  Patch %   Lines
lightning/src/ln/functional_tests.rs   97.95%    2 Missing ⚠️
lightning/src/ln/channelmanager.rs     94.73%    1 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2725      +/-   ##
==========================================
- Coverage   89.14%   89.14%   -0.01%     
==========================================
  Files         116      116              
  Lines       93205    93186      -19     
  Branches    93205    93186      -19     
==========================================
- Hits        83089    83066      -23     
- Misses       7583     7587       +4     
  Partials     2533     2533              

shaavan commented Jan 19, 2024

Updated from pr2725.10 -> pr2725.11 (diff)
Addressed @jkczyz and @coderabbitai comments

  • Update comments to fix grammatical mistakes and better express the intended behavior
  • Stop ignoring test_peer_disconnected_before_funding_broadcasted, and instead update it to account for behavior updates.
  • Expand the introduced test to check if node[1] has immediately dropped the Unfunded Inbound Channel

coderabbitai bot commented Jan 19, 2024

@shaavan, the updates you've made in response to the comments seem to be well-detailed and focused on improving the PR's clarity and functionality. It's good to see that you've expanded the tests to cover the new behavior thoroughly.

Regarding the mention of @coderabbitai, it seems there might be a mix-up as there are no previous comments from that handle in this thread. If there's anything specific you'd like me to review or if there are any other concerns you have, please let me know!

(\( ⁰⊖⁰)/)

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 4

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 59e6af3 and 15792af.
Files selected for processing (2)
  • lightning/src/ln/channelmanager.rs (3 hunks)
  • lightning/src/ln/functional_tests.rs (4 hunks)
Additional comments: 6
lightning/src/ln/functional_tests.rs (3)
  • 3727-3737: The logic for handling disconnection before funding transaction broadcast is clear. However, the test seems to be disabled. Confirm if this is intentional and if so, provide a reason or a TODO comment for future enablement.
  • 10513-10561: The new test test_channel_close_when_not_timely_accepted is well-structured and seems to cover the scenario it's designed for. However, ensure that the test is enabled and verify that it passes in the test suite.
  • 10563-10607: The test test_rebroadcast_open_channel_when_reconnect_mid_handshake appears to correctly simulate the scenario of a peer disconnecting and reconnecting mid-handshake. Verify that the test is enabled and that it passes in the test suite.
lightning/src/ln/channelmanager.rs (3)
  • 895-897: The logic here checks if any channel is in the Funded or UnfundedOutboundV1 phase. Ensure that this logic aligns with the intended behavior of the is_live function, especially considering the new UnfundedOutboundV1 state.
  • 8876-8877: The UnfundedOutboundV1 channel phase is set to always return true, which implies that these channels are considered live even if the peer is disconnected. Confirm that this behavior is consistent with the overall system logic and that it won't lead to any unexpected side effects.
  • 9028-9032: The addition of logic to push a SendOpenChannel message for UnfundedOutboundV1 channels is consistent with the PR's objective to allow rebroadcasting if the peer reconnects. Ensure that the get_open_channel function generates the correct message and that this behavior is tested.
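
The check on whether a disconnected peer's state may be dropped can be modeled roughly as follows. This is a pared-down sketch, not the code under review: the `ok_to_remove` name comes from the commit message in this thread, while `PeerState`, its fields and the payload-free `ChannelPhase` variants are simplified assumptions.

```rust
/// Simplified channel phases, mirroring the names used in the review comments.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChannelPhase {
    UnfundedOutboundV1,
    UnfundedInboundV1,
    Funded,
}

/// Hypothetical, pared-down per-peer state (the real struct holds much more).
struct PeerState {
    is_connected: bool,
    channels: Vec<ChannelPhase>,
}

impl PeerState {
    /// A peer's state may be dropped only if the peer is disconnected and it
    /// holds no channel that should survive a disconnection.
    fn ok_to_remove(&self) -> bool {
        if self.is_connected {
            return false;
        }
        !self.channels.iter().any(|phase| match phase {
            // Funded channels persist, and unfunded outbound channels now wait
            // for a possible reconnection.
            ChannelPhase::Funded | ChannelPhase::UnfundedOutboundV1 => true,
            // Unfunded inbound channels are not kept across a disconnection.
            ChannelPhase::UnfundedInboundV1 => false,
        })
    }
}
```

Treating `UnfundedOutboundV1` like `Funded` here is what keeps the peer entry alive long enough for a later `open_channel` rebroadcast.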

@jkczyz jkczyz left a comment

Looks good other than some comment re-phrasing.

shaavan commented Jan 22, 2024

Updated from pr2725.11 -> pr2725.12 (diff)
Addressed @jkczyz comments

  • Improve the comment messages.

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 59e6af3 and f9fbdca.
Files selected for processing (2)
  • lightning/src/ln/channelmanager.rs (3 hunks)
  • lightning/src/ln/functional_tests.rs (4 hunks)
Files skipped from review as they are similar to previous changes (2)
  • lightning/src/ln/channelmanager.rs
  • lightning/src/ln/functional_tests.rs

jkczyz
jkczyz previously approved these changes Jan 22, 2024
shaavan commented Jan 26, 2024

@TheBlueMatt
A gentle ping.
I think this PR is ready for a (potentially final) review :)

shaavan commented Jan 27, 2024

Updated from pr2725.12 -> pr2725.13 (diff)
Addressed @TheBlueMatt comments

  1. Repurpose the test_disconnect_in_funding_batch to check if all the channels of the batch close if one of them is closed.
  2. Use nodes[0].node.list_channels() in introduced test for cleaner code.
  3. Made the added test more precise by adding a test for correct msg-type along with the number of msgs.
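
The batch behavior the repurposed test checks (closing one channel of a funding batch closes the rest) can be illustrated with a toy bookkeeping structure. This is only a sketch: `FundingBatches`, `BatchId` and the integer `ChannelId` alias are hypothetical stand-ins, and it assumes batched channels are tracked until their shared funding is broadcast.

```rust
use std::collections::HashMap;

/// Illustrative identifiers; real code uses richer types for both.
type BatchId = u32;
type ChannelId = u32;

/// Tracks which not-yet-broadcast channels were funded together.
#[derive(Default)]
struct FundingBatches {
    batch_of: HashMap<ChannelId, BatchId>,
}

impl FundingBatches {
    fn add(&mut self, batch: BatchId, channel: ChannelId) {
        self.batch_of.insert(channel, batch);
    }

    /// Closes `channel` and every other channel sharing its batch.
    /// Returns the closed channel ids in ascending order.
    fn close(&mut self, channel: ChannelId) -> Vec<ChannelId> {
        let mut closed = match self.batch_of.get(&channel).copied() {
            // Collect every member of the same batch, including `channel` itself.
            Some(batch) => self
                .batch_of
                .iter()
                .filter(|(_, b)| **b == batch)
                .map(|(c, _)| *c)
                .collect::<Vec<_>>(),
            // A channel outside any batch closes alone.
            None => vec![channel],
        };
        for c in &closed {
            self.batch_of.remove(c);
        }
        closed.sort_unstable();
        closed
    }
}
```

A test in this style would close one member and then assert that the returned set covers the whole batch, which is the shape of the check described in item 1.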

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 51d9ee3 and 0fd3d31.
Files selected for processing (2)
  • lightning/src/ln/channelmanager.rs (3 hunks)
  • lightning/src/ln/functional_tests.rs (6 hunks)
Files skipped from review as they are similar to previous changes (1)
  • lightning/src/ln/channelmanager.rs
Additional comments: 6
lightning/src/ln/functional_tests.rs (6)
  • 3727-3739: The logic for handling peer disconnection before funding is broadcasted seems to correctly simulate the disconnection and checks for the expected closure reasons. However, it's important to ensure that the UNFUNDED_CHANNEL_AGE_LIMIT_TICKS constant is appropriately defined and used across the test to simulate the timeout accurately.

Ensure UNFUNDED_CHANNEL_AGE_LIMIT_TICKS is defined with a value that accurately represents the intended timeout duration for the test scenario.

  • 10515-10559: The test test_channel_close_when_not_timely_accepted simulates a scenario where peers disconnect mid-handshake, and the channel is not timely accepted. The test setup and the disconnection simulation are correctly implemented. However, the assertion that checks the channel's state after disconnection (line 10534) and the assertion for the channel's closure (line 10550) are critical to validate the intended behavior. It's essential to ensure that these assertions accurately reflect the expected state changes in the system under test.

Verify that the assertions accurately reflect the expected outcomes and that the test covers all relevant scenarios for the feature being tested.

  • 10561-10604: The test test_rebroadcast_open_channel_when_reconnect_mid_handshake correctly simulates a peer disconnection and reconnection mid-handshake. The test ensures that the SendOpenChannel message is rebroadcast upon reconnection (lines 10598-10603). This behavior aligns with the PR's objective to improve the robustness of the channel handshake process. However, it's crucial to verify that the rebroadcast logic is implemented as intended in the actual system code and not just within the test environment.

Confirm that the rebroadcast logic for the SendOpenChannel message upon peer reconnection is correctly implemented in the system code and not solely within the test.

  • 10762-10764: The introduction of the test test_close_in_funding_batch aims to ensure that if one channel in a batch closes, the entire batch is closed. This test is crucial for validating the robustness of batch processing in channel funding. It's important to ensure that the test setup correctly simulates the batch funding scenario and that the logic for triggering a channel close within the batch is accurately implemented.

Ensure the test accurately simulates batch funding scenarios and correctly implements the logic for closing a channel within a batch.

  • 10788-10820: The logic within test_close_in_funding_batch for force-closing a channel and verifying the closure of all channels in the batch (lines 10794-10820) is critical for ensuring the intended behavior of batch processing. The assertions and checks (lines 10797-10803, 10805-10809, and 10811-10818) are essential for validating the state of the system after a force-close operation. It's important to verify that these checks accurately reflect the expected outcomes and that the test covers all relevant scenarios for batch processing in channel funding.

Verify that the assertions and checks within the test accurately reflect the expected outcomes for batch processing in channel funding and that all relevant scenarios are covered.

  • 10820-10820: The final assertion in test_close_in_funding_batch that checks for the immediate closure of all channels in the batch upon a single channel's force-close (line 10820) is a key part of validating the intended behavior. However, it's crucial to ensure that this behavior aligns with the system's design and that the test accurately reflects the real-world scenario it intends to simulate.

Confirm that the immediate closure of all channels in a batch upon a single channel's force-close aligns with the system's design and that the test accurately simulates this scenario.

- Do not remove the channel immediately when the peer disconnects; instead,
  remove it after some time if the peer doesn't reconnect soon (handled in
  the previous commit).
- Do not mark the peer ok_to_remove if it still has some OutboundV1Channels.
- Rebroadcast SendOpenChannel for the OutboundV1Channel when the peer
  reconnects.
- Update the relevant tests to account for the behavior change.
- Repurpose test_disconnect_in_funding_batch to test that all
  channels in the batch close when one of them closes.
- The first test makes sure that the OutboundV1Channel is not
  immediately removed when peers disconnect, but is removed after N timer
  ticks.
- The second test makes sure that SendOpenChannel is rebroadcast
  for the OutboundV1Channel if the peer reconnects in time.
shaavan commented Jan 31, 2024

Updated from pr2725.13 -> pr2725.14 (diff)
Addressed @TheBlueMatt comments

  • Use list_channels in the introduced test wherever necessary.
  • Made the check for msg_events more thorough by checking for the specific expected message.
  • Clean up commit structure.
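
Checking for the specific expected message, rather than only counting pending events, might look like the helper below. It is a self-contained illustration: the two-variant `MessageSendEvent` enum with a `u8` node id is a cut-down stand-in, and `assert_single_open_channel` is a hypothetical helper, not one from the test suite.

```rust
/// Cut-down stand-in for a pending message event (not LDK's enum).
#[derive(Debug, PartialEq, Eq)]
enum MessageSendEvent {
    SendOpenChannel { node_id: u8 },
    SendAcceptChannel { node_id: u8 },
}

/// Asserts that exactly one event is pending and that it is an
/// `open_channel` addressed to `expected_peer`.
fn assert_single_open_channel(events: &[MessageSendEvent], expected_peer: u8) {
    assert_eq!(events.len(), 1, "expected exactly one pending message");
    match &events[0] {
        MessageSendEvent::SendOpenChannel { node_id } => {
            assert_eq!(*node_id, expected_peer);
        }
        other => panic!("Unexpected message: {:?}", other),
    }
}
```

Matching on the variant catches the case where the right number of messages is queued but one of them has the wrong type, which a bare length check would miss.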

Updated from pr2725.14 -> pr2725.15

Range-diff
git range-diff fbeb7ac9e1256f5d69b0cda580b56c5025ffa987..411462cce2c23d5a3faa72641107f4d2e43cf5a5 5bf58f0d33b13bd3d9f88c4f2021fbd77745274a..ddf75afd167ac4adb5824180b4125d87563fa31a
1:  411462cc ! 1:  ddf75afd Do not remove Outbound Channel immediately when peer disconnects
    @@ lightning/src/ln/functional_tests.rs: fn test_disconnect_in_funding_batch() {
     +  // Force-close the channel for which we've completed the initial monitor.
        let funding_txo_1 = OutPoint { txid: tx.txid(), index: 0 };
        let funding_txo_2 = OutPoint { txid: tx.txid(), index: 1 };
    -   let channel_id_1 = funding_txo_1.to_channel_id();
    -   let channel_id_2 = funding_txo_2.to_channel_id();
    +   let channel_id_1 = ChannelId::v1_from_funding_outpoint(funding_txo_1);
    +   let channel_id_2 = ChannelId::v1_from_funding_outpoint(funding_txo_2);
     +
     +  nodes[0].node.force_close_broadcasting_latest_txn(&channel_id_1, &nodes[1].node.get_our_node_id()).unwrap();
     +
  • Rebased over main
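
For context on the `ChannelId::v1_from_funding_outpoint` call in the range-diff: BOLT 2 derives a V1 channel ID by XORing the big-endian funding output index into the last two bytes of the funding txid. The sketch below shows that rule on raw byte arrays. `v1_channel_id` is a hypothetical standalone function, not LDK's API, and it takes the txid bytes as given without addressing byte-order conventions.

```rust
/// Derives a V1 channel id from a 32-byte funding txid and an output index by
/// XORing the big-endian index into the last two bytes.
fn v1_channel_id(funding_txid: [u8; 32], output_index: u16) -> [u8; 32] {
    let mut id = funding_txid;
    let [hi, lo] = output_index.to_be_bytes();
    id[30] ^= hi;
    id[31] ^= lo;
    id
}
```

This is why the two outputs of the same batch funding transaction in the test (index 0 and index 1) yield different channel IDs that differ only in the final byte.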

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 5bf58f0 and a6a6b48.
Files selected for processing (2)
  • lightning/src/ln/channelmanager.rs (3 hunks)
  • lightning/src/ln/functional_tests.rs (6 hunks)
Files skipped from review as they are similar to previous changes (1)
  • lightning/src/ln/channelmanager.rs
Additional comments: 7
lightning/src/ln/functional_tests.rs (7)
  • 3698-3698: The test description for test_peer_disconnected_before_funding_broadcasted is clear and sets the context well for what the test aims to achieve. However, ensure that the test implementation fully covers the scenario described, including both the disconnection and the failure to reconnect within the specified time.
  • 3727-3739: The logic to simulate peer disconnection before funding is broadcasted and to check the channel closure with the appropriate ClosureReason is implemented correctly. However, consider adding a comment explaining why UNFUNDED_CHANNEL_AGE_LIMIT_TICKS is used to simulate the passage of time and its significance in the context of this test.
  • 10517-10557: The test test_channel_close_when_not_timely_accepted correctly simulates a scenario where a peer disconnects mid-handshake and checks the state of channels and peer state after a specified time has passed. This test effectively covers the new behavior introduced in the PR. Ensure that the constants used, like UNFUNDED_CHANNEL_AGE_LIMIT_TICKS, are well-documented and their values are justified within the context of this test.
  • 10560-10598: The test test_rebroadcast_open_channel_when_reconnect_mid_handshake accurately simulates the scenario of peer disconnection and reconnection during the handshake process. It checks that the SendOpenChannel message is rebroadcast upon reconnection, aligning with the PR's objectives. This test is well-structured and covers the critical functionality introduced. Ensure that the test includes assertions for the state of both nodes after reconnection to fully validate the rebroadcast logic.
  • 10756-10756: The introduction of test_close_in_funding_batch aims to test the behavior when one of the channels in a batch closes. This is a good addition to ensure that batch processing of channel closures behaves as expected. However, the test description could be expanded to detail the expected behavior of the batch closure process for clarity.
  • 10782-10813: In test_close_in_funding_batch, the logic to force-close a channel and check the resulting state, including monitor updates and message events, is implemented correctly. This test effectively validates the behavior when a channel in a funding batch is closed. Ensure that the test also verifies the state of other channels in the batch to confirm that they are affected as expected by the batch closure process.
  • 10814-10814: The assertion that all channels in the batch should close immediately after one channel is force-closed is a critical part of test_close_in_funding_batch. This ensures that the batch processing logic is working as intended. Consider adding more detailed assertions to verify the closure reasons for each channel in the batch to ensure they align with the expected outcomes.

_ => panic!("Unexpected message."),
}

// We broadcast the commitment transaction as part of the force-close.
Collaborator

Heh, this is kinda dumb, maybe we should fix that, but it's not super critical and certainly unrelated to this PR.

Contributor Author

Super interested in understanding the issue here! And probably might give it a try if it's not a super biggie!

@TheBlueMatt TheBlueMatt merged commit 8d9d099 into lightningdevkit:main Feb 5, 2024
14 of 15 checks passed
@shaavan shaavan deleted the issue2096 branch February 6, 2024 15:02
k0k0ne pushed a commit to bitlightlabs/rust-lightning that referenced this pull request Sep 30, 2024
v0.0.123 - May 08, 2024 - "BOLT12 Dust Sweeping"

API Updates
===========

 * To reduce risk of force-closures and improve HTLC reliability the default
   dust exposure limit has been increased to
   `MaxDustHTLCExposure::FeeRateMultiplier(10_000)`. Users with existing
   channels might want to consider using
   `ChannelManager::update_channel_config` to apply the new default (lightningdevkit#3045).
 * `ChainMonitor::archive_fully_resolved_channel_monitors` is now provided to
   remove from memory `ChannelMonitor`s that have been fully resolved on-chain
   and are now not needed. It uses the new `Persist::archive_persisted_channel`
   to inform the storage layer that such a monitor should be archived (lightningdevkit#2964).
 * An `OutputSweeper` is now provided which will automatically sweep
   `SpendableOutputDescriptor`s, retrying until the sweep confirms (lightningdevkit#2825).
 * After initiating an outbound channel, a peer disconnection no longer results
   in immediate channel closure. Rather, if the peer is reconnected before the
   channel times out LDK will automatically retry opening it (lightningdevkit#2725).
 * `PaymentPurpose` now has separate variants for BOLT12 payments, which
   include fields from the `invoice_request` as well as the `OfferId` (lightningdevkit#2970).
 * `ChannelDetails` now includes a list of in-flight HTLCs (lightningdevkit#2442).
 * `Event::PaymentForwarded` now includes `skimmed_fee_msat` (lightningdevkit#2858).
 * The `hashbrown` dependency has been upgraded and the use of `ahash` as the
   no-std hash table hash function has been removed. As a consequence, LDK's
   `Hash{Map,Set}`s no longer feature several constructors when LDK is built
   with no-std; see the `util::hash_tables` module instead. On platforms that
   `getrandom` supports, setting the `possiblyrandom/getrandom` feature flag
   will ensure hash tables are resistant to HashDoS attacks, though the
   `possiblyrandom` crate should detect most common platforms (lightningdevkit#2810, lightningdevkit#2891).
 * `ChannelMonitor`-originated requests to the `ChannelSigner` can now fail and
   be retried using `ChannelMonitor::signer_unblocked` (lightningdevkit#2816).
 * `SpendableOutputDescriptor::to_psbt_input` now includes the `witness_script`
   where available as well as new proprietary data which can be used to
   re-derive some spending keys from the base key (lightningdevkit#2761, lightningdevkit#3004).
 * `OutPoint::to_channel_id` has been removed in favor of
   `ChannelId::v1_from_funding_outpoint` in preparation for v2 channels with a
   different `ChannelId` derivation scheme (lightningdevkit#2797).
 * `PeerManager::get_peer_node_ids` has been replaced with `list_peers` and
   `peer_by_node_id`, which provide more details (lightningdevkit#2905).
 * `Bolt11Invoice::get_payee_pub_key` is now provided (lightningdevkit#2909).
 * `Default[Message]Router` now take an `entropy_source` argument (lightningdevkit#2847).
 * `ClosureReason::HTLCsTimedOut` has been separated out from
   `ClosureReason::HolderForceClosed` as it is the most common case (lightningdevkit#2887).
 * `ClosureReason::CooperativeClosure` is now split into
   `{Counterparty,Locally}Initiated` variants (lightningdevkit#2863).
 * `Event::ChannelPending::channel_type` is now provided (lightningdevkit#2872).
 * `PaymentForwarded::{prev,next}_user_channel_id` are now provided (lightningdevkit#2924).
 * Channel init messages have been refactored towards V2 channels (lightningdevkit#2871).
 * `BumpTransactionEvent` now contains the channel and counterparty (lightningdevkit#2873).
 * `util::scid_utils` is now public, with some trivial utilities to examine
   short channel ids (lightningdevkit#2694).
 * `DirectedChannelInfo::{source,target}` are now public (lightningdevkit#2870).
 * Bounds in `lightning-background-processor` were simplified by using
   `AChannelManager` (lightningdevkit#2963).
 * The `Persist` impl for `KVStore` no longer requires `Sized`, allowing for
   the use of `dyn KVStore` as `Persist` (lightningdevkit#2883, lightningdevkit#2976).
 * `From<PaymentPreimage>` is now implemented for `PaymentHash` (lightningdevkit#2918).
 * `NodeId::from_slice` is now provided (lightningdevkit#2942).
 * `ChannelManager` deserialization may now fail with `DangerousValue` when
   LDK's persistence API was violated (lightningdevkit#2974).
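
The disconnection handling described in the entry for lightningdevkit#2725 can be pictured as a small state machine. The following is only a toy model of that behavior, not LDK's code: the type `PendingOutboundChannel`, the `Action` enum, and the constant `MAX_DISCONNECTED_TICKS` (and its value) are all hypothetical names invented for this sketch.

```rust
/// Hypothetical limit on how many timer ticks an unfunded outbound channel
/// may wait for its peer to return. The value is an assumption.
const MAX_DISCONNECTED_TICKS: u8 = 2;

/// What the caller should do after feeding an event into the model.
#[derive(Debug, PartialEq)]
enum Action {
    /// Nothing to do.
    None,
    /// Send the `open_channel` message to the peer again.
    ResendOpenChannel,
    /// Give up on the channel and close it.
    CloseChannel,
}

/// Toy model of an outbound channel that is still mid-handshake.
#[derive(Debug)]
struct PendingOutboundChannel {
    peer_connected: bool,
    ticks_disconnected: u8,
}

impl PendingOutboundChannel {
    /// A channel can only be created while the peer is connected.
    fn new() -> Self {
        Self { peer_connected: true, ticks_disconnected: 0 }
    }

    /// The peer went away: keep the channel and start counting ticks.
    fn peer_disconnected(&mut self) -> Action {
        self.peer_connected = false;
        self.ticks_disconnected = 0;
        Action::None
    }

    /// The peer came back in time: retry the handshake.
    fn peer_reconnected(&mut self) -> Action {
        self.peer_connected = true;
        self.ticks_disconnected = 0;
        Action::ResendOpenChannel
    }

    /// Called periodically; closes the channel once the peer has been
    /// away for `MAX_DISCONNECTED_TICKS` consecutive ticks.
    fn timer_tick(&mut self) -> Action {
        if self.peer_connected {
            return Action::None;
        }
        self.ticks_disconnected = self.ticks_disconnected.saturating_add(1);
        if self.ticks_disconnected >= MAX_DISCONNECTED_TICKS {
            Action::CloseChannel
        } else {
            Action::None
        }
    }
}
```

In this model a disconnect followed by one tick and a reconnect yields `Action::ResendOpenChannel`, while a disconnect followed by two ticks yields `Action::CloseChannel`.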

Bug Fixes
=========

 * Excess fees on counterparty commitment transactions are now included in the
   dust exposure calculation. This lines behavior up with some cases where
   transaction fees can be burnt, making them effectively dust exposure (lightningdevkit#3045).
 * A `Future` used as an `std::...::Future` could grow in size without bound if
   it was never woken. For those not using async persistence and using the async
   `lightning-background-processor`, this could cause a memory leak in the
   `ChainMonitor` (lightningdevkit#2894).
 * Inbound channel requests that fail in
   `ChannelManager::accept_inbound_channel` would previously have stalled from
   the peer's perspective as no `error` message was sent (lightningdevkit#2953).
 * Blinded path construction has been tuned to select paths more likely to
   succeed, improving BOLT12 payment reliability (lightningdevkit#2911, lightningdevkit#2912).
 * After a reorg, `lightning-transaction-sync` could have failed to follow a
   transaction that LDK needed information about (lightningdevkit#2946).
 * `RecipientOnionFields`' `custom_tlvs` are now propagated to recipients when
   paying with blinded paths (lightningdevkit#2975).
 * `Event::ChannelClosed` is now properly generated and peers are properly
   notified for all channels that fail to be funded as part of a batch channel
   open (lightningdevkit#3029).
 * In cases where user event processing is substantially delayed such that we
   complete multiple round-trips with our peers before a `PaymentSent` event is
   handled and then restart without persisting the `ChannelManager` after having
   persisted a `ChannelMonitor[Update]`, on startup we may have `Err`d trying to
   deserialize the `ChannelManager` (lightningdevkit#3021).
 * If a peer has relatively high latency, `PeerManager` may have failed to
   establish a connection (lightningdevkit#2993).
 * `ChannelUpdate` messages broadcasted for our own channel closures are now
   slightly more robust (lightningdevkit#2731).
 * Deserializing malformed BOLT11 invoices may have resulted in an integer
   overflow panic in debug builds (lightningdevkit#3032).
 * In exceedingly rare cases (no cases of this are known), LDK may have created
   an invalid serialization for a `ChannelManager` (lightningdevkit#2998).
 * Message processing latency when handling BOLT12 payments has been reduced (lightningdevkit#2881).
 * Latency in processing `Event::SpendableOutputs` may be reduced (lightningdevkit#3033).
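
The integer overflow entry above (lightningdevkit#3032) concerns debug builds because Rust's plain `+` panics on overflow there, while release builds wrap by default. As a generic illustration only, and not the actual fix, checked arithmetic turns that case into a value the parser can reject. The function `expiry_time` and its parameters are hypothetical.

```rust
/// Adds a relative expiry (in seconds) to a timestamp (in seconds).
/// Returns `None` instead of panicking if the sum does not fit in a `u64`.
fn expiry_time(timestamp_secs: u64, expiry_secs: u64) -> Option<u64> {
    timestamp_secs.checked_add(expiry_secs)
}
```

A caller parsing untrusted input can then map `None` to a parse error rather than letting the addition panic.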

Node Compatibility
==================

 * LDK's blinded paths were inconsistent with other implementations in several
   ways, which have been addressed (lightningdevkit#2856, lightningdevkit#2936, lightningdevkit#2945).
 * LDK's messaging blinded paths now support the latest features which some
   nodes may begin relying on soon (lightningdevkit#2961).
 * LDK's BOLT12 structs have been updated to support some last-minute changes to
   the spec (lightningdevkit#3017, lightningdevkit#3018).
 * CLN v24.02 requires the `gossip_queries` feature for all peers; however, LDK
   by default does not set it for those not using a `P2PGossipSync` (e.g. those
   using RGS). CLN reverted this requirement in v24.02.2, but for now LDK
   always sets the `gossip_queries` feature. LDK's change is expected to be
   reverted in a future release (lightningdevkit#2959).

Security
========
0.0.123 fixes a denial-of-service vulnerability which we believe to be reachable
from untrusted input when parsing invalid BOLT11 invoices containing non-ASCII
characters.
 * BOLT11 invoices with non-ASCII characters in the human-readable-part may
   cause an out-of-bounds read attempt leading to a panic (lightningdevkit#3054). Note that all
   BOLT11 invoices containing non-ASCII characters are invalid.
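
Since all BOLT11 invoices containing non-ASCII characters are invalid, one defensive pattern is to reject such input before doing any byte-offset indexing. The sketch below is a generic illustration of that pattern, not the change made in lightningdevkit#3054. The function `split_hrp` and the `ParseError` enum are hypothetical, and the only format detail assumed is that the last `1` in a bech32 string separates the human-readable part from the data part.

```rust
/// Errors this sketch can report. Hypothetical; not LDK's error type.
#[derive(Debug, PartialEq)]
enum ParseError {
    /// The input contained a non-ASCII character.
    NonAscii,
    /// No `1` separator was found.
    MissingSeparator,
}

/// Splits a bech32-style string into (human-readable part, data part).
fn split_hrp(invoice: &str) -> Result<(&str, &str), ParseError> {
    // Reject non-ASCII input first. After this check every byte offset
    // is a valid character boundary, so the slicing below cannot panic.
    if !invoice.is_ascii() {
        return Err(ParseError::NonAscii);
    }
    // The separator is the last `1` in the string.
    let sep = invoice.rfind('1').ok_or(ParseError::MissingSeparator)?;
    // `sep` is strictly less than the string's length, so `sep + 1` is
    // in range.
    Ok((&invoice[..sep], &invoice[sep + 1..]))
}
```

With this ordering, a string such as `"lnbć1qqq"` is rejected with `ParseError::NonAscii` before any slicing is attempted.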

In total, this release features 150 files changed, 19307 insertions, 6306
deletions in 360 commits since 0.0.121 from 17 authors, in alphabetical order:

 * Arik Sosman
 * Duncan Dean
 * Elias Rohrer
 * Evan Feenstra
 * Jeffrey Czyz
 * Keyue Bao
 * Matt Corallo
 * Orbital
 * Sergi Delgado Segura
 * Valentine Wallace
 * Willem Van Lint
 * Wilmer Paulino
 * benthecarman
 * jbesraa
 * olegkubrakov
 * optout
 * shaavan

Successfully merging this pull request may close these issues.

Don't discard create_channel if a peer goes away