Misc routing optimization #2803
CI's unhappy.
22f6f4b
to
06bcbf6
Compare
Fixed.
06bcbf6
to
bcbf56d
Compare
Rebased.
Codecov Report
All modified and coverable lines are covered by tests ✅
@@ Coverage Diff @@
## main #2803 +/- ##
==========================================
- Coverage 89.80% 89.78% -0.02%
==========================================
Files 121 121
Lines 100045 100094 +49
Branches 100045 100094 +49
==========================================
+ Hits 89845 89869 +24
- Misses 7533 7555 +22
- Partials 2667 2670 +3
Walkthrough
The project has undergone a significant update, focusing on efficiency and data integrity.
Review Status
Actionable comments generated: 40
Configuration used: CodeRabbit UI
Files selected for processing (6)
- .github/workflows/build.yml (1 hunks)
- lightning/src/ln/features.rs (1 hunks)
- lightning/src/routing/gossip.rs (33 hunks)
- lightning/src/routing/router.rs (38 hunks)
- lightning/src/routing/scoring.rs (1 hunks)
- lightning/src/util/test_utils.rs (2 hunks)
Files not summarized due to errors (1)
- lightning/src/routing/router.rs: Error: Message exceeds token limit
Additional comments: 40
.github/workflows/build.yml (4)
- 86-87: Updated binary file paths and keys for the network graph to reflect new versioning. Ensure that the new binary files are correctly placed and accessible at the specified URLs.
- 91-93: The download and hash verification steps for the network graph binary have been updated. Verify that the hash matches the expected value to ensure integrity of the downloaded file.
- 98-98: The environment variable for the expected network graph snapshot SHA sum has been updated. Confirm that this new SHA sum is correct and corresponds to the new binary file.
- 99-115: New steps have been added for caching, fetching, and verifying the scorer binary file. Ensure that the scorer binary is correctly integrated into the CI process and that the SHA sum verification step is accurate.
lightning/src/ln/features.rs (1)
- 778-794: The refactoring of the
requires_unknown_bits
method to use chunk iteration for flag comparison is a significant improvement in terms of efficiency. By processing 64 bits at a time instead of 8, the method reduces the number of iterations needed for feature flag checks, which can be beneficial for performance, especially when dealing with a large number of feature flags.lightning/src/routing/gossip.rs (34)
- 40-45: > Note: This review was outside the patches, and no patch overlapping with it was found. Original lines [1-6]
Imports and uses are modified, ensure that all the newly added imports (`AtomicUsize`, `Ordering`, etc.) are used in the code and that there are no unused imports which can lead to warnings or bloat.
- 67-67: The `NodeId` struct is introduced or modified. Ensure that the changes to this struct are consistent with the rest of the codebase, especially with respect to serialization and deserialization, as these are common areas where issues arise when modifying data structures.
- 168-169: New fields `removed_node_counters` and `next_node_counter` are added to the `NetworkGraph` struct. Verify that the logic for managing these counters is correctly implemented throughout the codebase, especially in the context of node removal and addition.
- 197-197: The `max_node_counter` field is added to the `ReadOnlyNetworkGraph`. Ensure that this field is properly maintained and represents the correct maximum value of `node_counter` across all nodes.
- 754-761: The `ChannelUpdateInfo` struct is annotated with `repr(C, align(32))`. Confirm that the alignment and representation directives are appropriate and that they do not cause any unforeseen issues on different architectures or with FFI boundaries.
- 847-858: The `ChannelInfo` struct is annotated with `repr(align(128), C)`. Similar to the previous comment, verify that the alignment and representation directives are appropriate and do not cause issues on different architectures or with FFI boundaries.
- 898-907: The `PartialEq` implementation for `ChannelInfo` is modified. Ensure that all fields that should be compared are included and that this change does not introduce any regressions in areas where `ChannelInfo` equality checks are performed.
- 1024-1025: The `node_one_counter` and `node_two_counter` fields in `ChannelInfo` are initialized with `u32::max_value()`. Confirm that this is the intended default value and that it is handled correctly in all parts of the code where `ChannelInfo` is used.
- 1036-1037: The `DirectedChannelInfo` struct now includes `source_counter` and `target_counter` fields. Verify that these fields are correctly updated and used in routing decisions.
- 1046-1051: The `new` method for `DirectedChannelInfo` is modified to set `source_counter` and `target_counter`. Ensure that the logic for determining these values is correct and that it aligns with the intended use of these counters in routing.
- 1094-1099: The `source_counter` and `target_counter` methods are added to `DirectedChannelInfo`. Verify that these methods are used consistently and correctly throughout the routing logic.
- 1290-1295: The `node_counter` field is added to `NodeInfo`. Ensure that this field is correctly managed throughout the node's lifecycle and that it is consistent with the new vector-based lookup system.
- 1359-1359: The `node_counter` field in `NodeInfo` is initialized with `u32::max_value()`. Confirm that this is the intended default value and that it is handled correctly in all parts of the code where `NodeInfo` is used.
- 1369-1370: The `write` method for `NetworkGraph` now includes a call to `test_node_counter_consistency`. Verify that this method is correctly implemented and that it does not introduce performance regressions.
- 1405-1415: The deserialization logic for `ChannelInfo` and `NodeInfo` is modified to set `node_counter`. Ensure that the deserialization process is correct and that the `node_counter` values are consistent with the serialized data.
- 1437-1438: The `NetworkGraph` constructor is modified to initialize `removed_node_counters` and `next_node_counter`. Verify that these fields are initialized to the correct values and that the constructor's logic is consistent with the rest of the codebase.
- 1484-1485: The `NetworkGraph` constructor is modified to initialize `next_node_counter` to 0 and `removed_node_counters` to an empty vector. Confirm that these initial values are correct and that they are handled properly throughout the graph's lifecycle.
- 1493-1521: The `test_node_counter_consistency` method is added to `NetworkGraph`. Verify that this method is correctly implemented and that it is called in appropriate places to ensure the consistency of `node_counter` values.
- 1680-1681: The `node_one_counter` and `node_two_counter` fields in `ChannelInfo` are initialized with `u32::max_value()`. Confirm that this is the intended default value and that it is handled correctly in all parts of the code where `ChannelInfo` is used.
- 1696-1697: The logic for adding a channel between nodes is modified. Verify that the changes are correct and that they do not introduce any regressions in channel management.
- 1711-1713: The `remove_channel_in_nodes` method is called within a match arm. Verify that the logic for removing and updating channel information is correct and that it does not introduce any inconsistencies in the network graph.
- 1723-1727: The `node_counter_id` array is introduced to manage `node_counter` values for channels. Verify that this logic is correct and that it properly updates the `node_counter` values for both nodes associated with a channel.
- 1832-1833: The `node_one_counter` and `node_two_counter` fields in `ChannelInfo` are initialized with `u32::max_value()`. Confirm that this is the intended default value and that it is handled correctly in all parts of the code where `ChannelInfo` is used.
- 1862-1862: The `remove_channel_in_nodes` method is called. Verify that the logic for removing a channel from nodes is correct and that it does not introduce any inconsistencies in the network graph.
- 1881-1881: The `remove_channel_in_nodes` method is called within a loop. Verify that the logic for removing channels and managing node counters is correct and that it does not introduce any inconsistencies in the network graph.
- 1890-1890: The logic for removing nodes and updating `removed_node_counters` is modified. Verify that the changes are correct and that they do not introduce any regressions in node management.
- 1973-1973: The `remove_channel_in_nodes` method is called within a loop. Verify that the logic for removing channels and managing node counters is correct and that it does not introduce any inconsistencies in the network graph.
- 2152-2160: The `remove_channel_in_nodes` method is modified to update `removed_node_counters`. Verify that the logic for removing channels and managing node counters is correct and that it does not introduce any inconsistencies in the network graph.
- 2218-2222: The `max_node_counter` method is added to `ReadOnlyNetworkGraph`. Verify that this method returns the correct maximum value for `node_counter` and that it is used appropriately throughout the code.
- 3519-3520: The `node_one_counter` and `node_two_counter` fields in `ChannelInfo` are set to specific values in a test. Verify that these values are appropriate for the test scenario and that the test correctly reflects the intended behavior of the code.
- 3539-3540: The `node_one_counter` and `node_two_counter` fields in `ChannelInfo` are set to specific values in a test. Verify that these values are appropriate for the test scenario and that the test correctly reflects the intended behavior of the code.
- 3595-3595: The `node_counter` field in `NodeInfo` is set to a specific value in a test. Verify that this value is appropriate for the test scenario and that the test correctly reflects the intended behavior of the code.
- 3632-3632: The `read_network_graph` benchmark function is modified. Verify that the benchmark is correctly set up and that it accurately measures the performance of reading the network graph.
- 3642-3642: The `write_network_graph` benchmark function is modified. Verify that the benchmark is correctly set up and that it accurately measures the performance of writing the network graph.
lightning/src/routing/scoring.rs (1)
- 3687-3687: The logic within `decay_100k_channel_bounds` has been fundamentally changed. Ensure that the new implementation aligns with the intended behavior and that all dependent features are tested with the updated logic.
bcbf56d
to
8d80f82
Compare
Review Status
Actionable comments generated: 28
Configuration used: CodeRabbit UI
Files selected for processing (6)
- .github/workflows/build.yml (1 hunks)
- lightning/src/ln/features.rs (1 hunks)
- lightning/src/routing/gossip.rs (33 hunks)
- lightning/src/routing/router.rs (38 hunks)
- lightning/src/routing/scoring.rs (1 hunks)
- lightning/src/util/test_utils.rs (2 hunks)
Files not summarized due to errors (1)
- lightning/src/routing/router.rs: Error: Message exceeds token limit
Files skipped from review as they are similar to previous changes (4)
- .github/workflows/build.yml
- lightning/src/ln/features.rs
- lightning/src/routing/gossip.rs
- lightning/src/util/test_utils.rs
Rebased.
8d80f82
to
c5e0599
Compare
Review Status
Actionable comments generated: 7
Configuration used: CodeRabbit UI
Files selected for processing (6)
- .github/workflows/build.yml (1 hunks)
- lightning/src/ln/features.rs (1 hunks)
- lightning/src/routing/gossip.rs (33 hunks)
- lightning/src/routing/router.rs (38 hunks)
- lightning/src/routing/scoring.rs (1 hunks)
- lightning/src/util/test_utils.rs (2 hunks)
Files not summarized due to errors (1)
- lightning/src/routing/router.rs: Error: Message exceeds token limit
Files not reviewed due to errors (1)
- (no review received)
Additional comments: 19
.github/workflows/build.yml (6)
- 86-87: The path and key for caching the routing graph snapshot have been updated to `lightning/net_graph-2023-12-10.bin` and `ldk-net_graph-v0.0.118-2023-12-10.bin`, respectively. Ensure that the new binary file is correctly generated and available at the specified URL to prevent CI failures.
- 91-93: The fetching process for the routing graph snapshot includes a SHA-256 checksum verification step. It's crucial to verify that the `EXPECTED_ROUTING_GRAPH_SNAPSHOT_SHASUM` matches the actual checksum of the file at https://bitcoin.ninja/ldk-net_graph-v0.0.118-2023-12-10.bin to ensure integrity and prevent potential security issues.
- 98-98: The `EXPECTED_ROUTING_GRAPH_SNAPSHOT_SHASUM` is updated. Double-check that this SHA-256 checksum accurately corresponds to the new routing graph snapshot file to ensure the integrity of the downloaded file.
- 103-104: The path and key for caching the scorer snapshot have been updated to `lightning/scorer-2023-12-10.bin` and `ldk-scorer-v0.0.118-2023-12-10.bin`, respectively. Confirm that the new binary file is correctly generated and accessible at the provided URL to avoid CI disruptions.
- 108-110: The fetching process for the scorer snapshot includes a SHA-256 checksum verification step. It's essential to ensure that the `EXPECTED_SCORER_SNAPSHOT_SHASUM` matches the actual checksum of the file at https://bitcoin.ninja/ldk-scorer-v0.0.118-2023-12-10.bin to maintain integrity and avert potential security risks.
- 115-115: The `EXPECTED_SCORER_SNAPSHOT_SHASUM` is updated. Verify that this SHA-256 checksum correctly matches the new scorer snapshot file to guarantee the integrity of the downloaded file.
lightning/src/routing/gossip.rs (8)
- 67-67: The `NodeId` struct is correctly annotated with `#[derive(Clone, Copy, PartialEq, Eq)]` to ensure it can be easily copied and compared.
- 168-169: The addition of `removed_node_counters` and `next_node_counter` fields to the `NetworkGraph` struct is consistent with the PR's objective to optimize routing performance by using unique counters for nodes.
- 197-197: The `max_node_counter` field in `ReadOnlyNetworkGraph` struct is a good addition for tracking the maximum node counter value, which is likely used for performance optimizations in routing.
- 753-761: The use of `#[repr(C, align(32))]` for `ChannelUpdateInfo` struct is a smart optimization to ensure that the struct's layout is predictable and aligned for cache performance. This is particularly important for structures that are frequently accessed and modified in performance-critical paths.
- 846-852: Similarly, the `ChannelInfo` struct's alignment with `#[repr(align(128), C)]` is a thoughtful optimization for cache performance. Ensuring that frequently accessed fields are likely to be on the same or adjacent cache lines can significantly impact performance in routing.
- 1024-1025: The initialization of `node_one_counter` and `node_two_counter` to `u32::max_value()` in `ChannelInfo` struct's `Readable` implementation seems to be a placeholder. It's crucial to ensure that these counters are correctly set elsewhere in the code to meaningful values, as using `u32::max_value()` directly could lead to incorrect behavior or performance issues.
Verification successful
The search results indicate that `node_one_counter` and `node_two_counter` are indeed updated from their initial `u32::max_value()` state in various parts of the code. There are assertions and explicit updates that suggest these counters are managed and set to meaningful values before being used in a way that could affect behavior or performance. This evidence supports the idea that there is a mechanism in place to ensure the counters are not left at their placeholder values throughout the code's execution.
- 1381-1382: The `test_node_counter_consistency` method in `NetworkGraph` is a good practice for ensuring the internal consistency of node counters. This kind of invariant checking is crucial for debugging and maintaining the integrity of the routing graph.
- 1537-1537: The `read_only` method in `NetworkGraph` provides a safe way to access a read-only view of the network graph. This is useful for operations that need to read network graph data without the risk of modifying it.
Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify that node counters are correctly assigned and not left at their initial max value state.
rg "node_(one|two)_counter" --type rust
Length of output: 1526
lightning/src/routing/scoring.rs (1)
- 3687-3687: Consider evaluating the performance impact of the `time_passed` method within `decay_100k_channel_bounds`. Frequent updates to channel scores could lead to performance bottlenecks, especially with a large number of channels.
Explore optimizing the decay calculation to reduce computational overhead, possibly by batching updates or adjusting the decay frequency based on network size and activity.
lightning/src/routing/router.rs (4)
- 1126-1132: > 📝 NOTE
This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [3-6]
The constants `_GRAPH_NODE_SMALL` and `_GRAPH_NODE_FIXED_SIZE` are used in static assertions to enforce a specific layout for `RouteGraphNode`, as clarified in previous discussions. Ensure that these assertions are present and effectively enforce the intended layout for performance optimization.
- 1181-1189: The addition of `source_node_counter` and `target_node_counter` fields in this context further supports the PR's goal of optimizing routing performance by using unique `u32` counters. Good consistency across different structs.
- 1205-1209: The introduction of `source_node_counter` for blinded paths aligns with the PR's optimization strategy. Ensure that the assumptions regarding the introduction point's visibility as a public node are valid and clearly documented.
- 1354-1375: The use of `#[inline(always)]` and `#[inline]` attributes is based on careful benchmarking, as previously clarified. Consider adding comments to document the benchmarking results and rationale behind these decisions to aid future maintainers.
Firstly, let me say sorry for taking this long to have a look at this, I believe I self-requested review a while back.
I did a first high-level pass and added an initial round of questions. I have to say that I'm close to a concept NACK on this one: the router logic is hard to reason about as it is and we keep discovering bugs here. It seems to me that this PR significantly increases the code complexity and introduces several new angles how things can go wrong. While this seems to work just fine for now, I fear that we'll see more breakage in the router code as a consequence in the future. If we really want to go ahead with this, it would be great if we could find a better abstraction for our newly created data structure that would offer a foolproof API, e.g., so we don't forget to insert/remove reused counters in the corresponding list.
I have yet to run the benchmarks myself to see how much speedup this PR would gain us, but from my first impression I'm not convinced it's worth the increased risks and maintenance costs. Also, it seems that a good chunk of the performance improvements might come from the last few commits alone, which are optimizations that could be applied independently from switching to node counters?
lightning/src/routing/router.rs
Outdated
/// Tries to open a network graph file, or panics with a URL to fetch it.
pub(crate) fn get_route_file() -> Result<std::fs::File, &'static str> {
Re: "This means future changes to the scorer's data may be harder to benchmark": Would an alternative be to keep both versions as separate benchmarks, so we could still benchmark updates impacting the Scorer's data model with synthetic data?
We could, but I think the random failures model we had before is borderline useless for benchmarking route performance changes. We're somewhat better off doing some kind of conversion from old to new scoring data and then benchmarking from there.
lightning/src/routing/gossip.rs
Outdated
///
/// These IDs allow the router to avoid a `HashMap` lookup by simply using this value as an
/// index in a `Vec`, skipping a big step in some of the hottest code when routing.
pub(crate) node_counter: u32,
It seems to me that storing the counter in here could be pretty dangerous if we or users were to clone the `NodeInfo`, do something with it and come back, at which point we could have dropped the entry and recreated another `NodeInfo` with the same counter.
Is this an issue? Do we want to store the `node_counter` as part of a wrapper struct holding both the `NodeInfo` and the counter? Alternatively, we could make this an `Option<u32>` and make sure that `clone()` would reset it to `None`, asserting we'd have to re-insert/lookup it as a 'fresh' info?
Hmm, I'm not convinced it's an issue. The network graph is publicly read-only - it has several internal consistency requirements (eg each channel has both side nodes in the nodes map) which imply users can't freely edit bits of it (though they're welcome to take parts of it and copy them locally to build their own graphs).
lightning/src/routing/gossip.rs
Outdated
@@ -1409,15 +1429,42 @@ impl<L: Deref> NetworkGraph<L> where L::Target: Logger {
logger,
channels: RwLock::new(IndexedMap::new()),
nodes: RwLock::new(IndexedMap::new()),
next_node_counter: AtomicUsize::new(0),
removed_node_counters: Mutex::new(Vec::new()),
Rather than adding these just side-by-side, could we create a new data structure wrapping them and exposing adequate `insert`/`remove` API methods so that we'd never forget to, e.g., call `removed_node_counters.push(..)` whenever we remove a node?
Hmm, sadly that doesn't improve things much, we end up with a bunch of places where we just replace one `removed_node_counters.push(..)` with a `node_counters.removed_node(...)`. We can't super easily use a method on the graph to mark a node removed, as we're often doing it with hash map entries already in place from a previous lookup.
Still, `test_node_counter_consistency` is pretty thorough, so if we have any obvious bugs fuzzing or tests should easily hit assertions in that.
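For readers following this thread, the kind of wrapper under discussion might look like the minimal sketch below. `NodeCounters`, `allocate`, `release`, and `max_counter` are hypothetical names, not the API that landed. The sketch is single-threaded for brevity, whereas the diff above uses an `AtomicUsize` and a `Mutex`.

```rust
/// Hypothetical wrapper that hands out dense `u32` node counters and reuses freed ones.
#[derive(Default)]
struct NodeCounters {
    next: u32,
    removed: Vec<u32>,
}

impl NodeCounters {
    /// Returns a previously released counter if one exists, otherwise a fresh one.
    fn allocate(&mut self) -> u32 {
        if let Some(counter) = self.removed.pop() {
            counter
        } else {
            let counter = self.next;
            self.next += 1;
            counter
        }
    }

    /// Marks `counter` as free once its node has been fully removed from the graph.
    fn release(&mut self, counter: u32) {
        debug_assert!(counter < self.next);
        debug_assert!(!self.removed.contains(&counter));
        self.removed.push(counter);
    }

    /// Exclusive upper bound on counters handed out, i.e. the length a per-node `Vec` needs.
    fn max_counter(&self) -> u32 {
        self.next
    }
}

fn main() {
    let mut counters = NodeCounters::default();
    let a = counters.allocate();
    let b = counters.allocate();
    counters.release(a);
    let c = counters.allocate();
    assert_eq!((a, b, c), (0, 1, 0));
    assert_eq!(counters.max_counter(), 2);
}
```

As the reply notes, a wrapper like this only renames the call that must not be forgotten; it does not by itself tie counter release to node removal.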
lightning/src/routing/gossip.rs
Outdated
@@ -865,6 +857,24 @@ pub struct ChannelInfo {
/// (which we can probably assume we are - no-std environments probably won't have a full
/// network graph in memory!).
announcement_received_time: u64,

/// The [`NodeInfo::node_counter`] of the node pointed to by [`Self::node_one`].
pub(crate) node_one_counter: u32,
Given that these counters will be reused, how can we be sure that they won't get outdated, especially for cloned `ChannelInfo`s as mentioned above regarding `NodeInfo`?
They should only get reused when a node was fully removed. The network graph already has internal consistency requirements that any channel has both the source and sink nodes in the graph as well. This just relies on that consistency requirement by adding an additional pointer. In terms of `ChannelInfo`s copied and used outside of a specific graph, indeed, they could point nowhere, but that's kinda by definition - the counters are specific to a `NetworkGraph`, they aren't global in any other sense, and each route finding operation only cares about a single graph and its contained infos.
lightning/src/routing/router.rs
Outdated
/// public node.
pub(crate) payer_node_counter: u32,
/// A unique ID which describes the first hop counterparty. It will not conflict with any
/// [`super::gossip::NodeInfo::node_counter`]s, but may be equal to one if the counterparty is
nit: `may be equal to one` is ambiguous in this context (here and below).
Not sure how this is ambiguous? It's saying that this won't step on the toes of any data in our graph, but it may be equal to some data in our graph if the node is public.
Fair, let me encapsulate the node counter logic and remove it from |
Sadly not. The last few commits reduce the pressure we put on the branch predictor, and improve things a bit on the edges, but the vast majority of the gain here is dropping the hash table lookups. A very large portion of our total routing time is spent just doing hash table lookups directly (we have like 3 or 4 of them we index into in routing - the network graph, gossip data, dist, etc), so dropping one entirely is a huge win. |
75d0c2e
to
f53cc4c
Compare
f53cc4c
to
799dc75
Compare
Okay, rebased on main. With the new struct I think it's not that messy, and now it also lets us simplify some of the blinded path stuff too, which I think is nice.
799dc75
to
fb2d61e
Compare
fb2d61e
to
db4c369
Compare
Conceptually, I think `get_route` isn't too much less readable than before now that the encapsulation has been added. I agree with the concerns about complexity generally, though the speedup seems pretty worthwhile.
IMO, the |
db4c369
to
62ffd78
Compare
62ffd78
to
5647247
Compare
Rebased.
CI is sad.
When processing the main loop during routefinding, for each node, we check whether it happens to be our peer in one of our channels. This ensures we never fail to find a route that takes a hop through a private channel of ours, to a private node, then through invoice-provided route hints to reach the ultimate payee. Because this is incredibly hot code, doing a full `HashMap` lookup to check if each node is a first-hop target ends up eating a good chunk of time during routing. Luckily, we can trivially avoid this cost. Because we're already looking up the per-node state in the `dist` map, we can store a bool in each first-hop target's state, avoiding the lookup unless we know it's going to succeed. This requires storing a dummy entry in `dist`, which feels somewhat strange, but is ultimately fine as we should never be looking at per-node state unless we've already found a path to that node, updating the fields in doing so.
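A rough sketch of the dummy-entry idea, under simplifying assumptions: `HopState`, `init_dist`, and the helper name below are hypothetical, the real `PathBuildingHop` carries many more fields, and `dist` is modelled here as a `Vec` indexed by node counter.

```rust
/// Hypothetical per-node routefinding state; the real struct has many more fields.
#[derive(Clone, Debug)]
struct HopState {
    total_fee_msat: u64,
    was_processed: bool,
    is_first_hop_target: bool,
}

/// Builds `dist` (indexed by node counter) with dummy entries for first-hop targets.
fn init_dist(node_count: usize, first_hop_targets: &[u32]) -> Vec<Option<HopState>> {
    let mut dist = vec![None; node_count];
    for &counter in first_hop_targets {
        dist[counter as usize] = Some(HopState {
            // Placeholder values: they are overwritten once a path reaches this node.
            total_fee_msat: u64::MAX,
            was_processed: false,
            is_first_hop_target: true,
        });
    }
    dist
}

/// The hot-loop check: one indexed read of state we load anyway, no `HashMap` lookup.
fn is_first_hop_target(dist: &[Option<HopState>], counter: u32) -> bool {
    dist[counter as usize]
        .as_ref()
        .map_or(false, |hop| hop.is_first_hop_target)
}

fn main() {
    let dist = init_dist(4, &[2]);
    assert!(is_first_hop_target(&dist, 2));
    assert!(!is_first_hop_target(&dist, 0));
    // The dummy entry has not been reached by the search yet.
    let dummy = dist[2].as_ref().unwrap();
    assert!(!dummy.was_processed);
    assert_eq!(dummy.total_fee_msat, u64::MAX);
}
```

Only when the flag is set would the caller go on to fetch the actual first-hop channel details, so the expensive lookup happens only when it is known to succeed.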
While LLVM should inline and elide the redundant calls, because the router is rather large LLVM can decide against inlining in some cases where it would be a nice win. Thus, it's worth DRY'ing the redundant calls explicitly.
Because we now have some slack space in `PathBuildingHop`, we can use it to cache some additional hot values. Here we use it to cache the source and target `node_counter`s for public channels, effectively prefetching the values from the channel state.
It turns out we spend several percent of our routefinding time just checking if nodes and channels require unknown features byte-by-byte. While the cost is almost certainly dominated by the memory read latency, avoiding doing the checks byte-by-byte should reduce the branch count slightly, which may reduce the overhead.
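To illustrate the byte-by-byte versus chunked comparison, here is a simplified sketch. It is not the actual `requires_unknown_bits` implementation: the function names, the `known` mask argument, and the assumption that even-numbered bits within each byte mark required features are all introduced for this example.

```rust
/// Even-numbered bits within a byte, assumed here to mark "required" features.
const REQUIRED_BITS: u8 = 0b0101_0101;

/// Baseline: one comparison (and one branch) per byte.
fn requires_unknown_bits_bytewise(flags: &[u8], known: &[u8]) -> bool {
    flags.iter().enumerate().any(|(i, &byte)| {
        let known_byte = known.get(i).copied().unwrap_or(0);
        (byte & REQUIRED_BITS & !known_byte) != 0
    })
}

/// Loads up to 8 bytes into a `u64`, zero-padding a short tail.
fn load_u64(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf[..bytes.len()].copy_from_slice(bytes);
    u64::from_le_bytes(buf)
}

/// Chunked: one comparison per 8 bytes (64 bits) of flags.
fn requires_unknown_bits_chunked(flags: &[u8], known: &[u8]) -> bool {
    const REQUIRED_WORD: u64 = u64::from_le_bytes([REQUIRED_BITS; 8]);
    flags.chunks(8).enumerate().any(|(i, chunk)| {
        let start = i * 8;
        // Bytes beyond the end of `known` count as "nothing known".
        let known_chunk: &[u8] = if start < known.len() {
            &known[start..known.len().min(start + 8)]
        } else {
            &[]
        };
        (load_u64(chunk) & REQUIRED_WORD & !load_u64(known_chunk)) != 0
    })
}

fn main() {
    let flags: [u8; 9] = [0, 0, 0, 0, 0, 0, 0, 0, 0b0000_0100];
    let known = [0xffu8; 8];
    // The ninth byte sets a required bit that `known` does not cover.
    assert!(requires_unknown_bits_bytewise(&flags, &known));
    assert!(requires_unknown_bits_chunked(&flags, &known));
}
```

Both functions should agree on every input; the chunked form simply trades up to eight byte-sized checks for one word-sized check.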
Because fetching fields from the `$candidate` often implies an indirect read, grouping them together may result in one or two fewer memory loads, so we do so here.
Because we scan per-channel information in the hot inner loop of our routefinding immediately after looking a channel up in a `HashMap`, we end up spending a nontrivial portion of our routefinding time waiting on memory to be read in. While there is only so much we can do about that, ensuring the channel information that we care about is sitting on one or adjacent cache lines avoids paying that penalty twice. Thus, here we manually lay out `ChannelInfo` and `ChannelUpdateInfo` and set them to 128-byte and 32-byte alignment, respectively. This wastes some space in memory in our network graph, but improves routing performance in return.
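The alignment attributes can be illustrated with the sketch below. `UpdateSketch` and `ChannelSketch` are hypothetical structs with made-up fields, not the real `ChannelUpdateInfo` and `ChannelInfo` layouts. The compile-time assertions show one way a layout assumption can be pinned down so that a later field change fails the build.

```rust
use core::mem::{align_of, size_of};

/// Hypothetical per-direction update data, aligned (and padded) to 32 bytes.
#[allow(dead_code)]
#[repr(C, align(32))]
struct UpdateSketch {
    htlc_minimum_msat: u64,
    htlc_maximum_msat: u64,
    base_fee_msat: u32,
    proportional_millionths: u32,
    cltv_expiry_delta: u16,
    enabled: bool,
}

/// Hypothetical channel entry, aligned to 128 bytes so it starts on a cache-line boundary.
#[allow(dead_code)]
#[repr(C, align(128))]
struct ChannelSketch {
    node_one: [u8; 33],
    node_two: [u8; 33],
    node_one_counter: u32,
    node_two_counter: u32,
    capacity_sats: u64,
    one_to_two: Option<UpdateSketch>,
    two_to_one: Option<UpdateSketch>,
}

// Compile-time layout checks: the build fails if a field change breaks them.
const _: () = assert!(size_of::<UpdateSketch>() == 32);
const _: () = assert!(align_of::<ChannelSketch>() == 128);

fn main() {
    println!("UpdateSketch: size {} align {}", size_of::<UpdateSketch>(), align_of::<UpdateSketch>());
    println!("ChannelSketch: size {} align {}", size_of::<ChannelSketch>(), align_of::<ChannelSketch>());
    // A type's size is always a multiple of its alignment, hence the wasted space.
    assert_eq!(size_of::<ChannelSketch>() % 128, 0);
}
```

With `repr(C)` the fields stay in declaration order, which is what makes it possible to reason about which fields share a cache line.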
Fixed
5647247
to
f689e01
Compare
LGTM. On the fence about whether this needs a second reviewer so up to you!
It is all pretty trivial, but at least eg the feature optimization and the first hop cache thing could probably use another pair of eyes.
LGTM, mod one question.
Feel free to ignore nits, as they aren't important.
// dummy entry in dist for each first-hop target, allowing us to do this lookup for
// free since we're already looking at the `was_processed` flag.
//
// Note that all the fields (except `is_first_hop_target`) will be overwritten whenever
Would it be worth adding `debug_assert`s or similar checks to make sure we don't deviate from this assumption?
Hmm, it would be, but I'm not sure I know how to write such an assertion?
I guess we could check that if one field is updated, all others are too? But maybe not worth it?
Hmm, I'm not sure how to check on a per-field basis. The likely failure case is we add another field and fail to update it when relevant, but I'm not aware of a way to iterate over the fields of a struct?
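One pattern that may help with the "added a field and forgot to update it" failure case, offered as a suggestion rather than something this PR does: destructure the struct exhaustively (without `..`) in the update path, so that adding a field becomes a compile error until the update code handles it. `HopState` and `update_hop` below are hypothetical, heavily reduced stand-ins.

```rust
/// Hypothetical, heavily reduced stand-in for the router's per-node state.
#[derive(Debug, PartialEq)]
struct HopState {
    fee_msat: u64,
    path_length: u8,
    is_first_hop_target: bool,
}

/// Overwrites every field except `is_first_hop_target`.
fn update_hop(hop: &mut HopState, fee_msat: u64, path_length: u8) {
    // No `..` in this pattern: adding a field to `HopState` fails to compile
    // here until the new field is bound and handled.
    let HopState { fee_msat: fee, path_length: len, is_first_hop_target: _ } = hop;
    *fee = fee_msat;
    *len = path_length;
}

fn main() {
    let mut hop = HopState { fee_msat: u64::MAX, path_length: 0, is_first_hop_target: true };
    update_hop(&mut hop, 1_000, 3);
    assert_eq!(hop, HopState { fee_msat: 1_000, path_length: 3, is_first_hop_target: true });
}
```

This is a compile-time check rather than a `debug_assert`, and it only covers code paths that go through the destructuring site.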
//
// Sadly, this is not possible, however we can still do okay - all of the fields before
// `one_to_two` and `two_to_one` are just under 128 bytes long, so we can ensure they sit on
// adjacent cache lines (which are generally fetched together in x86_64 processors).
nit:
- // adjacent cache lines (which are generally fetched together in x86_64 processors).
+ // adjacent cache lines (which are generally fetched together in x86-64 processors).
Feel free to land
Gonna go ahead and land to get this done, but will tackle nits in a quick followup.
During routing, we spend most of our time doing hashmap lookups. It turns out we can drop two of them. The first requires a good bit of work: by assigning each node in memory a random `u32` "node counter", we can then drop the main per-node routefinding state map and replace it with a vec. Once we do that, we can also drop the first-hop hashmap lookup that we do on a per node basis as we walk the network graph, replacing it with a check in the same vec.
This is the first in a series of PRs that, in total, substantially more than double our routefinding performance with real data. This first step optimizes the route-finder itself, with later steps more focused on the scorer.
Based on #2802.
The bulk of this PR was landed in #3103 and #3104. This PR now includes a grab-bag of misc optimizations to `get_route` which should speed the router up a smidge.
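As a rough illustration of the map-to-vec change described above, the sketch below contrasts the two lookups. `NodeId`, `NodeState`, and both helper functions are hypothetical simplifications, not the router's actual types.

```rust
use std::collections::HashMap;

/// Hypothetical 33-byte node identifier (a serialized public key).
type NodeId = [u8; 33];

/// Hypothetical per-node search state; the router's real state is much larger.
#[derive(Clone, Debug, PartialEq)]
struct NodeState {
    best_fee_msat: u64,
}

/// Before: every access hashes the 33-byte key and probes the table.
fn lookup_by_id<'a>(dist: &'a HashMap<NodeId, NodeState>, id: &NodeId) -> Option<&'a NodeState> {
    dist.get(id)
}

/// After: the node counter is a direct index into a `Vec`.
fn lookup_by_counter(dist: &[Option<NodeState>], node_counter: u32) -> Option<&NodeState> {
    dist.get(node_counter as usize)?.as_ref()
}

fn main() {
    let id: NodeId = [2; 33];
    let node_counter: u32 = 1; // assumed to have been assigned by the graph
    let state = NodeState { best_fee_msat: 1_000 };

    let mut by_id = HashMap::new();
    by_id.insert(id, state.clone());

    // One slot per counter handed out so far (two in this toy graph).
    let mut by_counter: Vec<Option<NodeState>> = vec![None; 2];
    by_counter[node_counter as usize] = Some(state.clone());

    assert_eq!(lookup_by_id(&by_id, &id), Some(&state));
    assert_eq!(lookup_by_counter(&by_counter, node_counter), Some(&state));
    assert_eq!(lookup_by_counter(&by_counter, 0), None);
}
```

The `Vec` needs one slot per counter handed out, which is why the graph has to expose an upper bound on its counters to the router.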