Unstable NAT traversal from a natmanger port collision #2059
Labels
kind/bug
A bug in existing code (including security flaws)
need/analysis
Needs further analysis before proceeding
The natmanger does not check to see if a port is already in use by another peer on the network before trying to clobber it with a new route:
https://github.com/libp2p/go-libp2p/blob/master/p2p/host/basic/natmgr.go#L129-L134
The result is that if a number of go-libp2p clients all try to use QUIC UDP in a large NAT like a collage dorm room, then no one will be able to do so in a stable way. This will cause problems for issue #1785.
Worse yet - you can't know how a given router is going to respond to aggressive multi-pronged TCP/UDP hole punching for the same port. Dozens of client is an extreme, but even two or more clients may case problems. Even one attempt at an existing route could disrupt network traffic for a buggy or cheap router, the client needs to do its best to avoid existing routes.
If the network interface is reporting a non-public IP address (RFC1918) then we can assume we are behind a NAT. We should see if there is an existing route before trying to setup any new routes. Which is not trivial which is why its not being currently done. It is better to randomly select a UDP port from a large range as to reduce the likelihood of collision as much as possible. If there is a UDP or TCP holepunch failure, the client should try selecting another random port above 1024 and try again, do so on three random ports before giving up.
Now if two libp2p nodes are on the same local network - then they might have found each-other using mDNS. If mDNS is reporting the use of a specific port, natmanger should avoid making routes using this external port. (Which would only work if the clients follow the pattern of mapping the same external port the internal - which afaik go-libp2p natmanger is doing already.)
Both avoiding already used ports reported by mDNS and choosing a random port each time will greatly reduce the likelihood of a NAT table collision.
The text was updated successfully, but these errors were encountered: