Skip to content

Inbound governor performance improvement #5104

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

crocodile-dentist
Copy link
Contributor

@crocodile-dentist crocodile-dentist commented Mar 27, 2025

Description

The change is expected to reduce CPU load and is designed to scale with the number of inbound connections served by a node. In particular, bootstrap relays and Genesis peers are mostly affected, but all relays in general should benefit from these improvements.

Profiling a real loaded node revealed a CPU bottleneck in the inbound governor loop, which essentially single-threads the application. Specifically, the firstPeerTo* FirstToFinish actions are CPU intensive in the amount of state checking performed for all active connections, repeatedly for each such action in a linear fashion. Casting aside the STM costs, there was in particular also duplication of filtering of hot/warm responder miniprotocols for each firstToFinish* activity. In terms of the the STM costs, especially the embedded LastToFinish computations are expensive because of lack of short-circuiting of STM retry semantics. For instance, when LastToFinish 1 <> 2 <> 3 is attempted, and 1 and 2 retry, but 3 succeeds, then 2 is attempted again, and if it succeeds then 1 is attempted again. From a performance perspective, it would be better to abort that peer's firstToFinish activity on the first retry and go try the next peer, but I'm not sure if that is supported.

This PR introduces an event based system where each connection's muxer thread can concurrently notify the inbound governor of interesting peer activity such that it can perform its tasks. The inbound governor creates a tracer which enqueues muxer events from each peer that it knows about into its information channel queue. This queue is processed in batch where all ordering of events is preserved. In particular, muxer start, stop, miniprotocol clean & exception exits are tracked. Some additional state is also introduced which tracks the number of hot and non-hot (warm & established) responder miniprotocols such that remote promotions and demotions can be identified efficiently.

The commit history is sequenced in a bottom up fashion, and while showing the 'guts' of how the new event system operates, it masks at first sight how this is folded into the program as a whole and what other reorganizations are needed from the following commits. In particular, a difficulty manifested because currently the CM is instantiated at the front, but the idea behind the proposed change is that the new tracer should be baked into the connection handler by the inbound governor, which should be provided to the CM as the penultimate step before starting the server. When a new outbound or incoming connection is made, the CM forks this augmented connection handler, which then adjoins this new tracer to the mux tracer before calling Mux.run. In this manner, the new tracing is only performed for connections served by the inbound handler, or the outbound handler only when a duplex data flow was negotiated. This reorganization is best viewed in the commit of Diffusion module refactoring.

Other changes include:

  • Fixes and improvements to simulation shrinkers
  • Testnet simulation tracing UX improvements
  • Refactors of testing code, with a bugfix to prop_mux_starvation test
  • Updates to network-spec

Checklist

Quality

  • Commit sequence makes sense and have useful messages, see ref.
  • New tests are added and existing tests are updated.
  • Self-reviewed the PR.

Maintenance

  • Linked an issue or added the PR to the current sprint of ouroboros-network project.
  • Added labels.
  • [ x Updated changelog files.
  • The documentation has been properly updated, see ref.

@crocodile-dentist crocodile-dentist added optimisation Performance optimisation inbound-governor Issues / PRs related to inbound-governor labels Mar 27, 2025
@crocodile-dentist crocodile-dentist self-assigned this Mar 27, 2025
@crocodile-dentist crocodile-dentist requested a review from a team as a code owner March 27, 2025 20:55
@github-project-automation github-project-automation bot moved this to In Progress in Ouroboros Network Mar 27, 2025
@crocodile-dentist crocodile-dentist marked this pull request as draft March 27, 2025 20:55
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/inbound-governor-turbo branch 3 times, most recently from a472b00 to 9664dee Compare March 31, 2025 12:14
@crocodile-dentist crocodile-dentist marked this pull request as ready for review March 31, 2025 12:15
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/inbound-governor-turbo branch 3 times, most recently from b835675 to aa7c0b3 Compare March 31, 2025 15:15
Copy link
Contributor

@bolt12 bolt12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments, I am not sure I fully understood the solution but I don't think I like the added complexity. Tracers shouldn't be used in this way. I don't think it is clearer what happens when the tracer is a null one, for example. Implicit behaviours hidden in tracers sounds like a nightmare to debug as well. Did I understand it right?

@crocodile-dentist crocodile-dentist added the documentation Network Documentation related tasks label Apr 1, 2025
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/inbound-governor-turbo branch 2 times, most recently from 982787b to 0ea6769 Compare April 2, 2025 07:08
@karknu
Copy link
Contributor

karknu commented Apr 2, 2025

Some initial comments, I am not sure I fully understood the solution but I don't think I like the added complexity. Tracers shouldn't be used in this way. I don't think it is clearer what happens when the tracer is a null one, for example. Implicit behaviours hidden in tracers sounds like a nightmare to debug as well. Did I understand it right?

We use tracers in this way for ranking peers. I think this is what tracers excel at.

@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/inbound-governor-turbo branch 11 times, most recently from d4381fe to 37e02b0 Compare April 9, 2025 11:53
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/inbound-governor-turbo branch 4 times, most recently from 0f9e65c to 025e2d0 Compare April 25, 2025 07:15
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/inbound-governor-turbo branch from 025e2d0 to 895a56b Compare May 15, 2025 14:06
Copy link
Contributor

@coot coot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor comments.

This change replaces the mechanism heavily relying on STM
to track remote activity by the inbound governor.
Some performance tests indicate that there is a not-insignificant
performance penalty with that approach.

In the new approach, the inbound governor creates a tracer which is passed via the
connection manager to the connection handler. The handler joins
this tracer with the mux tracer such that the muxer main thread,
via various traces, writes to the information channel queue of events
such as remote promotions/demotions and mux stopping/erroring.
Each connection maintains a count of remote hot/non-hot responders
which are adjusted by the appropriate traces, and informing the
inbound governor of relevant state changes that need to be performed.

Change RemoteIdle tag:
  *  The change is motivated by the need of the new tracer
     to ensure proper sequencing of events on the queue. In case
     a connection is expired and responder startup is demanded, the tracer
     will retry until the connection is RemoteCold to register the
     promotion, or abort when the connection is dropped (CM CommitTr returned).
Moved 'InboundGovernorInfoChannel` to InboundGovernor

Change IG information channel queue bound

Previously, the queue was only used to communicate new connections
to the inbound governor. The queue is now used to also notify
the IG of muxer events so it will be busier.
Applies the connection handler builder to the new IG tracer and passes
the result to the withConnectionManager continuation. The connection
manager forks this tracer-augmented handler for new connections so
that the inbound governor can be efficiently notified of remote
activity by tracing performed by mux.

The information channel queue is drained one last time after the
inbound governor thread finishes. This is to avoid a deadlock where
the queue becomes full, potentially preventing connection handlers from
performing their cleanup routine. This is important when shutting down,
where the connection manager is waiting for all connection handlers to finish.

Refactor InboundGovernor interface

Rework IG loop

Process all the info channel events from the queue in one step
of the IG loop. The queue events arrive from the CM (new connection)
or from the tracer which tracks miniprotocol responder activity
and mux start/stop.

Move the remnants of the deleted Event module
  * Most of the functionality of the Event is unneeded anymore
    and the module is removed, with its still useful remnant moved
    here. Updates reflecting changes to 'RemoteIdle' tag were applied.
Also reflects changes to 'RemoteIdle' tag
Remove unnecessary unregisterConnection call sites
InformationChannel was moved under InboundGovernor directory
Update Args and API

Integrates prior commits
Reordered some traces to be sequenced after some state
was updated.

Introduce Mux.Tracers type

Contains mux tracer and separate channel tracer. The idea
is that the channel tracer should not contain the inbound
governor's information channel tracer which may be present
in the muxTracer field for certain connections. The protocol channel send/receive
trace tags are uniteresting from the IG tracer's perspective
but there is a penalty for invoking it so frequently for
every complete message.
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/inbound-governor-turbo branch from 895a56b to 82beb03 Compare May 19, 2025 07:48
let traceWithBearer = contramap $ Mx.WithBearer connectionId
unmask $ Mx.run Mx.Tracers {
muxTracer = traceWithBearer (muxTracer <> inboundGovernorMuxTracer countersVar),
channelTracer = traceWithBearer muxTracer }
Copy link
Contributor

@coot coot May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change related to this branch/PR or it has more to do with #5121?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's related to 5121, but it makes sense to unmask async exceptions after delivering the negotiated connection values to the CM, which is blocked waiting for it.

Attach the IG tracer to Mux tracer before calling Mx.run

The tracer is attached to the inbound handler and to the outbound
handler only when a duplex mode was negotiated.

Apply unmask only when calling Mx.run to guarantee in case of
receiving an async exception that the promise of handshake negotiation
result is delivered to the CM which is blocked waiting for it.
The bug was exposed when Mux.run signature was changed to accept
the new MuxTracerBundle type instead of the Mux.Trace's Trace type.
The latter Trace type mixes low level bearer tags with the higher level
mux tags, so logically they are separate and in fact are traced by
separate components. The headerTracer must go along with the bearer tracer
otherwise it doesn't trace anything, but the types matched so the
program was accepted.
Integrates prior commits
integrates prior commits
No other significant changes besides cosmetics.
This addresses the shrinker hang when attempting to shrink
'AbsBearerInfo' in some cases.
Fixes termination bug in testnet command shrinking by pruning
duplicate commands.

Added Skip command to diffusion testnet script. This is only generated
for failed test cases by the shrinker.

Sometimes we can generate a simpler command script by dropping
a specific command, or a whole series of commands, and extend
the delays (in effect, 'unshrinking' the delay of when the command is applied) of the remaining ones
For eg. JoinNetwork 0 : Reconfigure 5 : Reconfigure 10 : Reconfigure 15 : Kill 2
Can be simplied to JoinNetwork 0 : Kill 32 when the intermediate reconfigures are
skipped - the shrinker changes them to Skip and the candidate is
passed here. It is important that the overall test duration is not
shrunk at this stage, since a failed test might depend on some minimal
run duration. If the property still fails, then the original commands are not contributing factors,
but what is important is that the node has to run at least that long
for the test case to fail. With such a simplified command script, the
execution duration can be shrunk in the following shrinker iterations.
Some commands may be just noise, and all they do is cause unnecessary work
slowing down the shrinker as it attempts to shrink them individually. This only
increases the search space without any benefit in the end.
The skips are removed from the fixed up script, just the effect of their
duration is applied to the following first non-skip command. If a specific command
is in fact significant for the property to fail, for eg. the test passes if
Reconfigure is changed to Skip, the shrinker will backtrack and keep
that command and continue with shrinking the rest. The bottom line is that
the minimal example and running time are both improved, significantly as
observed in practice.

For o-n-framework experiments shrinker, we shrink AcceptedConnectionLimits fields one by one, instead of
simultaneously all at once, for a better minimal result.
This annotes each trace with a nice (node-x), where x is a number
[1..max-node in simulation]. This facilitates grepping a trace
for just the node that failed a test.

In the mkTracers binding, one can either turn on all component tracers
for general testing, and in case of failure of a test where only a
subset of components are really needed, one can selectively toggle
just the interesting sayTracers to further cut down the noise.
Comprehensibility improvements
Integrates prior changes
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/inbound-governor-turbo branch from edff5fd to 2719dc4 Compare May 20, 2025 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Network Documentation related tasks inbound-governor Issues / PRs related to inbound-governor mux issues related to network-mux optimisation Performance optimisation
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

4 participants