Idling connection keep-alive interval does not seem to be respected. #5539
-
Could you confirm which transport you're using (e.g., TCP, QUIC)? The behaviour you describe sounds like it might be related to the QUIC transport. Depending on its configuration, QUIC may not drop a connection right away; instead it waits for a set duration before deciding whether the connection is still alive. If the node disconnects and reconnects within that window, no event is reported, because the duration never elapsed. If this is the case, you could lower those values in your QUIC configuration.
-
Hi there,
I'm experiencing a strange situation and I'm unsure whether it's a bug or something I'm doing incorrectly. I hope this discussion will help clarify the issue.
Here's the setup:
I have a network of 10 nodes that need to connect to each other, forming a fully connected network. To establish these connections, each node, N, initiates connections with nodes 1 through N-1, creating a fully connected network with a single connection between each pair of nodes. Each node attempts to re-dial its peers every 5 seconds. For example, node 5 follows this logic (pseudo-code):
In this scenario, node 5 tries to reconnect with nodes 1 through 4 if they are disconnected and not currently being dialled.
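The redial rule described above can be sketched roughly as follows. This is a std-only model of the selection logic, not libp2p code: the function name `peers_to_dial`, the use of plain integers as node ids, and the `connected` and `dialing` sets are all hypothetical names chosen for illustration.

```rust
use std::collections::HashSet;

/// Returns the lower-numbered peers that node `n` should dial on this tick:
/// every node in 1..n that is neither connected nor already being dialled.
/// (Hypothetical helper; the real node would run this every 5 seconds.)
fn peers_to_dial(n: u32, connected: &HashSet<u32>, dialing: &HashSet<u32>) -> Vec<u32> {
    (1..n)
        .filter(|peer| !connected.contains(peer) && !dialing.contains(peer))
        .collect()
}

fn main() {
    // Node 5 believes it is connected to 1, 2 and 3, and is already dialling 4.
    let connected = HashSet::from([1u32, 2, 3]);
    let dialing = HashSet::from([4u32]);
    // Nothing to redial: node 3 is skipped while it is still marked as connected.
    println!("{:?}", peers_to_dial(5, &connected, &dialing)); // prints []

    // Once the connection to node 3 is reported closed, it becomes a dial target again.
    let connected = HashSet::from([1u32, 2]);
    println!("{:?}", peers_to_dial(5, &connected, &dialing)); // prints [3]
}
```

Under this model, a peer is only redialled after it leaves the `connected` set. If the swarm never reports the connection as closed, the peer is skipped on every tick.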
Communication between the nodes is structured as follows: nodes 1 through 7 frequently send messages to nodes 8 through 10, while nodes 8 through 10 send messages to all other nodes except themselves.
After approximately 24 hours, I restarted node 3. It successfully dialled nodes 1 and 2 and was dialled by nodes 8 through 10. However, nodes 4 through 7 never attempted to reconnect to node 3, as if they did not recognize that the connection had dropped.
The key observation here is that node 3 and nodes 8 through 10 engage in heavy message exchanges, while node 3 and nodes 4 through 7 do not exchange any messages. I'd like to emphasize that I've configured the swarm to essentially never drop idle connections:
SwarmBuilder::with_swarm_config(|cfg| cfg.with_idle_connection_timeout(Duration::from_secs(u64::MAX)))
What could be causing this behaviour? Any suggestions or advice would be greatly appreciated.
PS: