Replies: 1 comment
-
Relates to #1478.
Background
For paritytech/polkadot#1532 I am investigating why connections initiated through Kademlia are kept alive for more than the expected 10 second idle timeout. As far as I can tell there is no way to learn why connections are kept alive without intrusive libp2p source code changes. Ideally I would like to expose a Prometheus metric in Substrate.
Hacky solution
To help with debugging, I extended `KeepAlive` with a protocol id indicating the protocol that would like to keep the connection alive. `ProtocolHandler`s that delegate to other `ProtocolHandler`s aggregate the `KeepAlive`s and pass the protocol id of the highest `KeepAlive` upwards. Within `node_handler.rs` I could then log the id of the protocol that keeps the connection alive.

This led to #1698, which triggered the investigation for #1700.
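To make the aggregation step concrete, here is a minimal, self-contained sketch. It does not use the real libp2p types: `KeepAlive` below is a simplified stand-in, and `TaggedKeepAlive` and `strongest` are hypothetical names chosen only for illustration. The assumed ordering is `No < Until(earlier) < Until(later) < Yes`.

```rust
use std::time::Instant;

/// Simplified stand-in for libp2p's `KeepAlive` (an assumption, not the real type).
/// The derived ordering follows the variant order: `No < Until(_) < Yes`,
/// and two `Until` values compare by their deadline.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum KeepAlive {
    No,
    Until(Instant),
    Yes,
}

/// Hypothetical pairing of a keep-alive wish with the protocol that expressed it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TaggedKeepAlive {
    pub keep_alive: KeepAlive,
    pub protocol: &'static str,
}

/// What a delegating handler would pass upwards: the strongest wish among
/// its children, together with the id of the protocol responsible for it.
/// Returns `None` when there are no children.
pub fn strongest(
    children: impl IntoIterator<Item = TaggedKeepAlive>,
) -> Option<TaggedKeepAlive> {
    children.into_iter().max_by_key(|tagged| tagged.keep_alive)
}
```

With one child reporting `KeepAlive::No` and another reporting `KeepAlive::Until(..)`, `strongest` returns the second one, so the top-level handler can log that child's protocol id.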
Way forward
First of all: Do people feel the need to surface keep-alive information to the user? Or is the on-demand debugging through log lines good enough?
If we do want to expose that information, we need to find a consistent way to do so. One suggestion from my side would be to bubble up something like a `KeepAlive` event from the `NodeHandler` for each connection at a regular interval, recording the id of the protocol that keeps the connection alive.
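A rough sketch of what such a periodic event could look like, using only the standard library. Everything here is hypothetical: `KeepAliveEvent`, `KeepAliveReporter`, `record`, and the `u64` connection id are illustrative names, not existing libp2p or Substrate APIs, and the `HashMap` merely stands in for a per-protocol Prometheus counter.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical event bubbled up from the node handler.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KeepAliveEvent {
    pub connection: u64,
    pub protocol: &'static str,
}

/// Emits at most one event per `interval` for a single connection.
pub struct KeepAliveReporter {
    interval: Duration,
    last_report: Instant,
}

impl KeepAliveReporter {
    pub fn new(interval: Duration, now: Instant) -> Self {
        Self { interval, last_report: now }
    }

    /// `protocol` is the id of the protocol currently keeping the connection
    /// alive, or `None` if no protocol does. Returns an event only once the
    /// interval has elapsed since the last report.
    pub fn poll(
        &mut self,
        now: Instant,
        connection: u64,
        protocol: Option<&'static str>,
    ) -> Option<KeepAliveEvent> {
        if now.saturating_duration_since(self.last_report) < self.interval {
            return None;
        }
        self.last_report = now;
        protocol.map(|protocol| KeepAliveEvent { connection, protocol })
    }
}

/// Stand-in for incrementing a metric labelled by protocol id.
pub fn record(counts: &mut HashMap<&'static str, u64>, event: &KeepAliveEvent) {
    *counts.entry(event.protocol).or_insert(0) += 1;
}
```

Passing `now` in explicitly keeps the sketch independent of any particular timer or executor. A real handler would more likely drive this from its own poll loop.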