Skip to content

Conversation

@taspelund
Copy link
Contributor

@taspelund taspelund commented Aug 8, 2025

This PR adds initial support for IPv6 static routing in maghemite, with the very beginnings of BGP support (really just originated IPv6 prefixes).
This addresses a few bugs found along the way (e.g. #529), adds unit tests for old/new functionality, and does a good bit of code cleanup, e.g.

  • mostly standardizing on oxnet::IpNet in generic spots and rdb::Prefix in db-specific spots, rather than several unique types
  • moving common functionality into mg-common where it can be used by all other crates w/o creating circular dependencies

Along the way I found that our handling of Connection Collisions was incorrect in a number of ways
e.g.

  • We were not properly babysitting the second connection up through a received Open
  • We were not always using the correct peer ASN (.unwrap_or(0) would return 0 all the time leading to incorrect behavior when the peer has a higher RID)
  • We were not sending Collision Resolution messages
  • Incomplete connection handling during collision window (e.g. HoldTime was set for the whole peer, not for a given connection)

This eventually became a pretty significant rewrite of much of the FSM for various reasons,
e.g.

  • To facilitate multiple connections, some state (like hold time / keepalive time) needed to be moved from the per-peer SessionRunner struct into the per-connection (TCP or Channel) BgpConnection trait.
  • The codepaths for inbound and outbound connections were not handled equally by the FSM
  • There were many unused FsmEvent enum invariants, leading to many catch-all match arms in the FSM state handlers
  • Our implementation deviated from the RFC in seemingly unnecessary ways (e.g. Active holding onto a connection, several FSM events returning us to Connect or Active when the RFC said to move into Idle, etc.),

Some highlights from the FSM work:

  • Connections now own a bit of state that was previously owned by the SessionRunner (particularly timers)
  • Introduced BgpConnector trait that mirrors the BgpListener trait but for outbound connections
  • Both inbound and outbound connections are now handled asynchronously and present themselves to the SessionRunner as FsmEvents (TcpConnectionAcked (previously named Connected) for inbound, TcpConnectionConfirmed for outbound)
  • Connection policy (ttl / md5) is now applied by BgpConnector or BgpListener before a connection is sent to the FSM/SessionRunner
  • The SessionRunner now owns a connection registry so we can keep track of more than one valid (registered) connection in parallel - with storage for a single "primary" connection for ease of querying connection state (users/operators shouldn't have to care about multiple connections in parallel)
  • a new ConnectionId type is introduced and added as a member of the FsmEvent variants that are meant only for a specific connection (this allows the FSM event queue to be multiplexed for multiple connections, since events can now be associated with a specific connection. e.g. a Notification for connection X doesn't affect connection Y)
  • FsmState::ConnectionCollision has been introduced to provide an entire pseudo-state (as in, not RFC defined) where we can emulate an FSM for two connections in parallel despite only implementing a single FSM
  • Unused FsmEvent variants have been moved to an UnusedEvent enum (decorated with dead_code) so we don't have to clutter the FSM methods with handlers for them (I deliberately wanted to make sure we don't use catch-all match arms)
  • enum FsmEvent now holds event type sub-enums that group the FsmEvents based on the category of event (admin, session-wide, and connection-specific)
  • The FsmState methods have been almost completely rewritten in order to align with the RFC (some events took incorrect actions, some logic lived in the wrong state, some states held onto connections when they shouldn't, etc.)
  • Lots of test restructuring (helper functions, use of generics to add TCP support, a LoopbackIpManager added to add/remove IPs as needed for TCP tests to work w/o changing the addr_to_session paradigm).
  • All logging in mgd (minus ddm) now uses slog, with only a couple exceptions where a custom macro including context-specific kv pairs for use with slog::o!

taspelund added 14 commits July 25, 2025 12:38
Removes one of the multiple duplicate types that implement an IP Prefix.

Signed-off-by: Trey Aspelund <[email protected]>
Moves to ddm/oxnet types to represent an IP Prefix instead of custom
types. Step 2 in standardizing around oxnet Prefix types.

Signed-off-by: Trey Aspelund <[email protected]>
Remove unused BgpAttributes4, BgpAttributes6, OriginChangeSet, and
Status types.  There were a couple other unused types that seemed more
forward-looking, so I left those in place (i.e. fn to_buf(), Policy4Key)

Signed-off-by: Trey Aspelund <[email protected]>
Updates Db to incorporate new methods that support IPv6.
The RIB is now split into three separate types:
- Rib: Holds Prefix (protocol-agnostic enum)
- Rib4: Holds Prefix4 (IPv4 specific struct)
- Rib6: Holds Prefix6 (IPv6 specific struct)
Updates bestpaths() to be completely prefix-agnostic (now accepts paths
as an argument in lieu of a prefix and Rib).
Callers of prefix add/remove methods are now responsible for managing
locks for the appropriate RIB structures, allowing each protocol's RIB
to be locked independently and giving the caller the option to reduce
how many times the lock needs to be acquired/released.

Signed-off-by: Trey Aspelund <[email protected]>
Initial implementation of IPv6 Static Routing:
- New mgd APIs
- New mgadm commands (mirroring v4 static routing commands)

Signed-off-by: Trey Aspelund <[email protected]>
During creation of Prefix4/Prefix6 structs, an Ipv{4,6}Addr was blindly
allowed w/o any validation or zeroing of the host bits.  This allowed
for the RIB to hold multiple entries for routes w/ the same underlying
subnet (e.g. 1.1.1.1/24 and 1.1.1.2/24 would be considered unique keys,
but the underlying subnet is still 1.1.1.0/24), possibly even with
differing next-hop values.
This commit addresses that issue by adding a step to zero the host bits
during the creation of new Prefix4/Prefix6 structs.  Figuring out a
migration strategy for the on-disk sled::Db remains a concern, but
this prevents new problematic entries from being created.

Fixes: #529

Signed-off-by: Trey Aspelund <[email protected]>
Second half of the fix for #529. First we moved to a constructor that
enforced all host bits be zeroed when creating a Prefix* type. Now, we
check whether the routes in the on-disk db are valid (host bits unset)
and remove them if they're invalid.
This also moves most of the logic into methods of the Prefix* types,
with the exception of the mg-admin-client mirrors of them, as progenitor
can't know about methods/traits from the openapi spec so that logic
has to live somewhere.

Fixes: #529

Signed-off-by: Trey Aspelund <[email protected]>
Some helpful parsing and retry macros were sitting in bgp and other
crates couldn't use them... so I moved them to mg-common.

Signed-off-by: Trey Aspelund <[email protected]>
Adds APIs, db methods, mgadm commands and unit tests for IPv6 BGP Origin

Signed-off-by: Trey Aspelund <[email protected]>
Signed-off-by: Trey Aspelund <[email protected]>
@taspelund taspelund requested a review from rcgoodfellow August 8, 2025 22:45
@taspelund taspelund self-assigned this Aug 8, 2025
@taspelund taspelund added bgp Border Gateway Protocol mgd Maghemite daemon customer For any bug reports or feature requests tied to customer requests static Static Routing rust Pull requests that update rust code labels Aug 8, 2025
Signed-off-by: Trey Aspelund <[email protected]>
@taspelund taspelund linked an issue Aug 8, 2025 that may be closed by this pull request
Moves RIB queries from /bgp to /rib (this will likely require
a corresponding update in Nexus), as rib_in and rib_loc are not
specific to BGP.
Moves RIB logic into its own set of files (mgd/src/rib_admin.rs,
mgadm/src/rib.rs).
Refactors print_rib() and move it into rib.rs.
Adds support for server-side filtering of RIB queries by AddressFamily
and Protocol (BGP/Static).
Removes a bunch of duplicate types and impls by using progenitor's
"replace" directive in generate_api!() macro to automatically map
progenitor's auto-derived types back to their original rdb types (this
is something we should do further refactoring on, as there is quite a
lot of type/impl duplication going on).

Fixes: #525

Signed-off-by: Trey Aspelund <[email protected]>
// Ensure that r1's peer session to r2 has gone back to connect.
r2.shutdown();
d2.shutdown();
wait_for_eq!(r1_session.state(), FsmStateKind::Connect);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we expecting state machine behavior changes? It looks like tests are now failing here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I modified this is because I believed the FSM state observed is timing-dependent, not consistent.

It seemed to me like the goal of this test case is to confirm that "shutdown" on one side brings the session down.
If so, then looking for "!established" is a less timing-dependent way to confirm that things have been completely torn down.

Otherwise, the state on the active side could be "connect", "opensent" (since the listener isn't torn down on the shutdown side, the transport can still come up) or "idle".
I was hoping to avoid a scenario where the test fails because the active peer is in "idle" or "opensent" and we were only looking for "connect".

I don't have super strong opinions about this though, so if reverting that change means the test passes consistently again, then I'm fine with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure at what point this test became unreliable. I'll go back through and figure out which of my changes borked things. I don't suspect the issue is with the logic change of "state != established", but I could be wrong

};

for p in prefixes {
update.nlri.push(p.clone());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't v6 be in mp_reach_nlri/mp_unreach_nlri?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually, yes. MP_{REACH,UNREACH}_NLRI hasn't been implemented yet

Properly handle connection termination scenarios to prevent zombie
threads and improve error logging. Changes include:

- Distinguish between recv timeout and disconnection in channel
  connections
- Add explicit break on disconnection to cleanly exit recv loops
- Detect EOF (n == 0 reads) and convert to UnexpecteEof error
- Break recv loop on TCP read errors to prevent infinite retry loops
- Add debug logging when peer disconnects

These changes ensure that threads properly terminate when connections
close, preventing resource leaks and reducing noise in logs from threads
attempting to read from closed connections.

Signed-off-by: Trey Aspelund <[email protected]>
fs2 was used for file locking in the LoopbackIpManager test helper, but
it does not have support for Illumos as a target OS. This removes the
fs2 dependency in favor of using the libc crate that supports Illumos,
MacOS and Linux.

Signed-off-by: Trey Aspelund <[email protected]>
Somewhere along the line in the refactor, some of the md5 code got
orphaned and was no longer being used, which went unnoticed while doing
dev on MacOS (for some reason even using --all-targets for cargo build,
check and clippy, this still didn't surface?). This updates the
BgpConnection code to use the md5 logic again (i.e. actually running the
SA keepalive thread).

Signed-off-by: Trey Aspelund <[email protected]>
The ipadm syntax used to install/uninstall addresses on illumos was
invalid - specifically, the `addrobj` string was not alphanumeric, which
resulted in an "Invalid argument provided" error. This updates the
`addrobj` string to use "dot" as delimiters between octets since it's
valid/alphanumeric.

Signed-off-by: Trey Aspelund <[email protected]>
Signed-off-by: Trey Aspelund <[email protected]>
Connection Collision Resolution requires that the BGP-ID is used as a
tie-breaker to determine which connection to close: the connection
initiated by the router with the lowest ID is closed. We were correctly
using the BGP-ID as the tie-breaker, but were implicitly using the "new"
connection (whichever completed second) as if it were always initiated
by the peer and the "existing" connection (whichever completed first)
as if it were initiated by the local system. This is a bad assumption
since the ordering of connection completion is non-deterministic. This
fixes issues that can arise when those assumptions are made,
i.e.
r1 and r2 agree that r2 has the higher BGP-ID, but each decide to close
a different connection.

Signed-off-by: Trey Aspelund <[email protected]>
The accept() path (via the Dispatcher) for TCP was incorrectly passing
the local bind address to the BgpConnection constructor (with_conn())
which is set to [::]:179 in non-test environments, which resulted in
us later encoding this into an Update message in the NEXT_HOP field.
This addresses the issue by correctly passing the local SocketAddr of
the live socket, which is an IPv4 address for an IPv4 TCP connection.
This also adds both tx and rx validation of NEXT_HOP length based on AFI
of prefixes and their next-hop.

Signed-off-by: Trey Aspelund <[email protected]>
@taspelund
Copy link
Contributor Author

Interop tests are now passing with the latest commit. Will do some more manual validation to try and force Collisions to occur, since that is the primary reason for the FSM changes

Updates connect() logic to use either a plain IPv4 or IPv6 address from
the new TcpStream's local SocketAddr, rather than using it as-is. This
is needed because our listening socket is a dual-stack socket (binding
to [::] with `v6_only` disabled), so any new IPv4 TcpStreams we accept()
from it are presented to the application as an IPv6 socket with an
IPv4-mapped IPv6 address (e.g. a new connection from 192.168.0.1 is
handed to the application with a peer IP of ::ffff:192.168.0.1).

The IP we strip from the TcpStream is then used if we attempt to bind()
the socket to a specific Source IP before initiating an outbound
connection, which we bail out of upon a failure. So no outbound
connections will be initiated while this is busted. The IP is also used
to determine the address-family of the next-hop we encode in sent Update
messages, so we could inadvertently encode 16 bytes instead of 4 for the
next-hop if this is wrong.

Updates the bind_addr of a SessionRunner to be a Option all the way
up through the API handler (and the test instantiation), where it is
always set to None by the API (until we add support for update-source).
This prevents erroneous bind() calls in non-test environments.

Signed-off-by: Trey Aspelund <[email protected]>
Add a ConnectionCreator check in OpenSent/OpenConfirm to ensure we never
enter FsmState::ConnectionCollision with two connections that were
initiated by the same peer (local or remote).

It was observed during manual stress testing that mgd would sometimes
enter Connection Collision with two connections that it initiated. This
happens because connect() calls for outbound TCP sockets are currently
handled as a one-shot operation in a separate thread without any
synchronization methods to allow for a graceful shutdown to occur (e.g.
sending a Shutdown message to the connect thread via a channel).
However, since we use connect_timeout(), the timing window for these
duplicate outbound connections is bounded by the timeout supplied
(currently 100ms). Regardless, any timing window is still a window,
even if this was only observed irregularly (seen 74 times after running
overnight with 2 peers who were resetting the session once per second),
so this adds logic to disallow this condition. If a second connection
completes, the FSM will check its direction (inbound or outbound) and
reject it if it matches the existing connection's direction.

This also removes a couple (now) unneeded clippy lints, and renames
SessionEvent::Connected to SessionEvent::TcpConnectionAcked (now
aligning with RFC event 16, Tcp_CR_Acked).

Signed-off-by: Trey Aspelund <[email protected]>
- Adds docstrings for several different types within session.rs that
  describe the SessionRunner and BgpConnection design choices.

- Renames the FSM state methods to "fsm_<state>" format so it's clear
  which function implements which FsmState handler.

- Moves capabilities_received into PeerConnection, since that state is
  inherently tied to a BgpConnection that has received an Open message
  rather than the SessionRunner itself.

- Renames PeerConnection.capabilities_received -> PeerConnection.caps.

- Renames SessionRunner.capabilities_sent -> SessionRunner.caps_tx.

- Renames SessionRunner.primary_connection -> SessionRunner.primary.

- Introduces PrimaryConnection enum that wraps either a Partial or Full
  BgpConnection, depending on whether it has the peer's Open data.

- Changes SessionRunner.primary to hold a PrimaryConnection enum rather
  than simply a ConnectionId. This provides a handle to the connection
  and the peer's Open data so we can expose it to queries.

- Adds a couple connection retries timer increments.

- Moves a couple ConnectRetryTimer stop calls earlier so we don't forget
  to stop the timer before transitioning out of Connect/Active.

Signed-off-by: Trey Aspelund <[email protected]>
Shortens names of SessionClock / ConnectionClock timers to make
SessionRunner code shorter/narrower (avoiding cargo fmt indentation
woes).

Adds enabled checks for timer expiration FSM events. This is needed in
the event of a valid timer event being added to the event queue behind
another event that makes the timer event invalid.

Example:
T0: TcpConnectionConfirmed pushed into event queue
T1: ConnectRetryTimerExpires pushed into event queue
T2: TcpConnectionConfirmed read from queue, ConnectRetryTime stopped/disabled
T3: FSM transition to OpenSent
T4: ConnectRetryTimerExpires read from queue -> FSM error!
T5: FSM transition to Idle

In this case, the ConnectRetryTimerExpires event is pushed into the
event queue because the timer legitimately expired. However, another
event that advances the FSM state (e.g. TcpConnectionConfirmed,
TcpConnectionAcked) was already in the queue to be processed, and the
handler stops ConnectRetryTimer but doesn't clear pending
ConnectRetryTimerExpires FSM events from the queue. The event queue is
not read from again until after moving into the next FSM state, where
the old event is then read and reacted to despite the timer being
disabled.

This is avoided by using the timer's enabled status as an indicator of
whether its events are valid.

Signed-off-by: Trey Aspelund <[email protected]>
Add connector_handle field to SessionRunner to track the spawned connector
thread throughout its lifecycle. This enables detection of connector thread
panics and prevents spawning multiple simultaneous connector threads for
the same session.

Changes BgpConnector::connect() to return JoinHandle instead of () so the
caller can manage the thread. The handle is checked on each connection
attempt: if a thread is still running, new spawns are skipped; if finished,
it's joined to detect panics before spawning a new thread. On shutdown, any
running connector thread is joined for clean teardown.

Adds connector_panics counter to SessionCounters for observability and
introduces initiate_connection() helper to consolidate the spawn logic.

Signed-off-by: Trey Aspelund <[email protected]>
Fix missing connection cleanup in connection_collision_open_sent()
when transitioning to SessionSetup via the early-Keepalive path.

The collision handler has two paths to SessionSetup:
1. Both connections receive Opens -> resolve_collision()
2. One connection receives Keepalive before both Opens

Path 1 correctly calls stop() on the losing connection, unregistering
it from the connection registry. Path 2 directly returned
SessionSetup without cleanup, leaving the losing connection registered
with its receive loop still active.

In the failing scenario:
1. Connection A receives Open, stores it in exist_open
2. Connection A receives Keepalive before connection B gets Open
3. FSM transitions Connection A to SessionSetup
4. Connection B remains registered and continues receiving messages
5. Connection B's queued Open arrives after R2 reaches Established
6. Open handler in Established sees B as "unexpected known connection"
7. FSM Error notification sent, terminating session

Fix by calling stop(StopReason::CollisionResolution) on the losing
connection before transitioning to SessionSetup, matching the behavior
of resolve_collision().

Also fix:
- Missing Keepalive transmission on Open receipt (RFC 4271 violation)
- Wrong connection timer restart (restarting 'new' instead of 'exist')
- Add RFC 4271 excerpts to ConnectionCollision methods to make protocol
  requirements more clear.

Signed-off-by: Trey Aspelund <[email protected]>
When two BGP routers start simultaneously with identical timer
configurations, they can enter a synchronization deadlock. Both
routers begin in Idle state, which rejects inbound connections per
RFC 4271. When they simultaneously initiate outbound connections,
each router rejects the other's inbound connection attempt, causing
both to fail and return to Idle state via the IdleHoldTimer.

Because both routers have identical IdleHoldTimer intervals, they
restart connection attempts at exactly the same cadence, perpetuating
the deadlock indefinitely. This manifests as both routers cycling
through Idle → Connect/Active → Idle without ever establishing a
session.

While RFC 4271 Section 10 recommends jitter for ConnectRetryTimer
and DelayOpenTimer to prevent synchronized behavior, it does not
explicitly mention IdleHoldTimer. However, IdleHoldTimer serves a
similar role in controlling retry timing after the DampPeerOscillations
state machine dampens connection attempts. In our implementation,
IdleHoldTimer is the primary control for retry cadence when moving
from Idle back to connection attempts (ConnectRetryTimer is disabled in
Idle per the RFC).

The synchronization issue was exposed by channel-based tests where
zero network latency and deterministic timing amplified the race
condition, causing test_basic_peering_active to fail intermittently
(iteration 20/300). While less likely with TCP due to network jitter,
the issue could theoretically occur in real deployments under unlucky
timing conditions.

Implementation adds jitter infrastructure for both IdleHoldTimer and
ConnectRetryTimer following RFC 4271's recommended 0.75-1.0 range:

- Add Timer::jitter_range field with recalculation every timer reset
- Add SessionInfo::{idle_hold_jitter, connect_retry_jitter} configuration
- Enable idle_hold_jitter by default (0.75, 1.0) to fix deadlock
- Disable connect_retry_jitter by default (None) - infrastructure ready
  but not needed since ConnectRetryTimer doesn't control retry timing
  in the Idle state scenario
- Support runtime jitter updates without session reset
- Add rand dependency for thread-safe RNG via rand::thread_rng()

Signed-off-by: Trey Aspelund <[email protected]>
Let's add a channel_id that is consistent across routers to make life
easier when debugging.

Signed-off-by: Trey Aspelund <[email protected]>
Renames PrimaryConnection enum to ConnectionKind.

Updates resolve_collision() to take BGP-ID as an argument rather than
taking it from one of the connections, and to use ConnectionKind enum
for the two connections instead of strictly PeerConnections. This
allows the callers to consistently order their args by connection
completion time (`exist` = first, `new` = second) regardless of how far
along the connection is.

This is preparation for a refactor of the connection collision logic
where we enter resolve_collision() earlier than we did before.

Signed-off-by: Trey Aspelund <[email protected]>
Fix an intermittent test failure caused by a race condition during
connection collision resolution. When two connections collide during the
OpenConfirm state, the losing connection is closed but may still have
in-flight messages in the event queue. The FSM was processing messages
from dead connections without validating the ConnectionID, causing it to
transition to Idle if a non-keepalive message was received for a
connection other than the one owned by FsmState::OpenConfirm.

This fix adds connection ID validation in OpenConfirm's Message handler,
following the same pattern used by other event handlers like
HoldTimerExpires. Messages from unknown or inactive connections are now:
  1. Logged as unexpected
  2. If the connection is known, it's closed with ConnectionRejected
  3. If the connection is unknown, it's simply ignored
  4. The FSM remains in OpenConfirm state

Signed-off-by: Trey Aspelund <[email protected]>
Move collision resolution from lazy (wait for both Opens) to eager
(resolve on first Open). When either connection receives an Open
message during collision state, immediately trigger collision
resolution instead of waiting in a loop for both connections.

This eliminates the manual state tracking (exist_open/new_open)
variables and simplifies the collision FSM. The collision detection
is now more direct and better aligned with RFC 4271 section 6.8.

Also fix error paths in connection_collision where set_primary_conn(None)
was incorrectly called when one connection failed during Open message
parsing, invalid message reception, or timer expiration. Now correctly
promote the fallback connection to primary, maintaining the invariant
that a single connection must always be designated primary.

Signed-off-by: Trey Aspelund <[email protected]>
The mgadm commands to query the RIB have been moved to a new CLI,
so update the test to use the proper command.
Old: mgadm bgp status {imported,selected} $ASN
New: mgadm rib status {imported,selected}

Signed-off-by: Trey Aspelund <[email protected]>
Add collision resolution notification handling in OpenSent, OpenConfirm,
and Established states to prevent death spirals when one peer resolves
a collision before the other peer detects it.

When a peer receives a Collision Resolution notification and hasn't yet
detected the collision itself (second connection still in-flight), the
RFC-prescribed transition to Idle causes the in-flight connection to be
rejected. This creates a death spiral where both peers repeatedly reject
each other's connections.

The fix transitions to Active state instead of Idle when receiving a
Collision Resolution notification. This allows the in-flight connection
to be accepted, breaking the death spiral. ConnectRetryTimer provides
fallback if the expected connection never arrives.

Also refactor collision resolution into a pure function and fix several
related issues:
- Move register_conn() after send_open() to prevent sends on failed conn
- Silent drop rejected connections in Idle (avoid notification storms)
- Add IoError stop reason for failed sends during collision detection
- Fix typo in CeaseErrorSubcode display ("Collition" → "Collision")

Signed-off-by: Trey Aspelund <[email protected]>
Signed-off-by: Trey Aspelund <[email protected]>
Consolidate BGP's Prefix struct into the shared rdb::Prefix enum,
standardizing prefix representation across the codebase. This adds
explicit address family context to prefix_from_wire(), required for
correct IPv4/IPv6 disambiguation in wire format parsing.

Key improvements:
- Prefix type is Copy, eliminating clones in announcement/origin methods
- Property-based tests ensure wire format round-trip correctness for both
  IPv4 and IPv6 prefixes
- CI job added for high-intensity proptest execution (1M+ cases per run)

All BGP wire format message types continue to work correctly with the
consolidated Prefix representation.

Signed-off-by: Trey Aspelund <[email protected]>
Move the "passive or active" logic used when transitioning out of Idle
state into a helper method.
Immediately transition out of Idle if IdleHoldTimer is not configured
using the new helper method, rather than by injecting an FSM event into
the event queue. This ensures no enqueued events are pulled out of the
event queue while in Idle state, where they could potentially be
ignored because of the rules around Idle state.

Signed-off-by: Trey Aspelund <[email protected]>
When both peers are configured to use identical IdleHoldTimer, they can
encounter a situation where one peer tries to connect to the other while
it's in Idle, which gets rejected. Upon that rejection (rather, the
resulting send() failure) the first peer drops back to Idle. Then when
the second peer comes out of Idle and attempts to connect, the first
peer will reject its connection while it's sitting in Idle. This death
spiral can sometimes happen indefinitely, which presents itself as
random test failures when the BGP session fails to establish within the
30 second timout in the tests. By transitioning immediately out of Idle
(i.e. disabling IdleHoldTime) without reading from the event queue, we
avoid the whole situation where connections are rejected by the rules of
the Idle FSM state.

Signed-off-by: Trey Aspelund <[email protected]>
In an attempt to avoid the death spiral involved with both peers using
IdleHoldTime (getting stuck in an Idle <-> Connect loop), I had tried
adding special handling for Collision Connection Resolution Notification
messages. After realizing that this problem is inherent to the Idle
state (rejecting connections that complete asynchronously) and adjusting
tests to disable IdleHoldTimer (aligning with the default config), I
realized that we actually do not need this non-compliant handling.

Signed-off-by: Trey Aspelund <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bgp Border Gateway Protocol customer For any bug reports or feature requests tied to customer requests mgd Maghemite daemon rust Pull requests that update rust code static Static Routing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bgp: tests fail intermittently when run via cargo test but pass consistently with cargo nextest IPv6 support for static routing

2 participants