Unstable NAT traversal from a natmanager port collision #2059

TheRook · 2023-02-05T22:24:23Z

The natmanager does not check whether a port is already in use by another peer on the network before trying to clobber it with a new route:
https://github.com/libp2p/go-libp2p/blob/master/p2p/host/basic/natmgr.go#L129-L134

As a result, if a number of go-libp2p clients all try to use QUIC over UDP behind a large NAT, such as a college dorm network, none of them will be able to do so in a stable way. This will cause problems for issue #1785.

Worse yet, you can't know how a given router will respond to aggressive multi-pronged TCP/UDP hole punching for the same port. Dozens of clients is an extreme case, but even two or more clients may cause problems. Even a single attempt to map an existing route could disrupt network traffic on a buggy or cheap router, so the client needs to do its best to avoid existing routes.

If the network interface reports a non-public IP address (RFC 1918), then we can assume we are behind a NAT. Ideally we would check whether a route already exists before trying to set up any new routes, but that is not trivial, which is why it is not currently done. Instead, it is better to randomly select a UDP port from a large range to reduce the likelihood of a collision as much as possible. If a UDP or TCP hole punch fails, the client should select another random port above 1024 and try again, trying three random ports before giving up.

Now, if two libp2p nodes are on the same local network, they might have found each other using mDNS. If mDNS reports that a peer is using a specific port, natmanager should avoid creating routes that use this external port. (This only works if clients follow the pattern of mapping the same external port as the internal one, which afaik the go-libp2p natmanager already does.)
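A rough sketch of the mDNS-based avoidance, assuming the addresses learned from local peers are available as multiaddr strings such as `/ip4/192.168.1.7/udp/4001/quic-v1`. The helpers `portsFromAddrs` and `pickPortAvoiding` are hypothetical and use plain string parsing rather than any go-libp2p API.

```go
package main

import (
	"fmt"
	"math/rand"
	"strconv"
	"strings"
)

// portsFromAddrs collects the TCP/UDP ports that appear in
// multiaddr-style strings advertised by local peers.
func portsFromAddrs(addrs []string) map[int]struct{} {
	used := make(map[int]struct{})
	for _, a := range addrs {
		parts := strings.Split(a, "/")
		for i := 0; i+1 < len(parts); i++ {
			if parts[i] != "tcp" && parts[i] != "udp" {
				continue
			}
			p, err := strconv.Atoi(parts[i+1])
			if err == nil && p > 0 && p <= 65535 {
				used[p] = struct{}{}
			}
		}
	}
	return used
}

// pickPortAvoiding draws random ports above 1024 until it finds one
// that no local peer has advertised, giving up after maxTries draws.
func pickPortAvoiding(used map[int]struct{}, maxTries int) (int, bool) {
	for i := 0; i < maxTries; i++ {
		p := 1025 + rand.Intn(65535-1025+1)
		if _, taken := used[p]; !taken {
			return p, true
		}
	}
	return 0, false
}

func main() {
	// Example addresses; the values are made up for illustration.
	peers := []string{
		"/ip4/192.168.1.7/udp/4001/quic-v1",
		"/ip4/192.168.1.8/tcp/4001",
		"/ip4/192.168.1.9/udp/51234/quic-v1",
	}
	used := portsFromAddrs(peers)
	port, ok := pickPortAvoiding(used, 3)
	fmt.Println(len(used), port, ok)
}
```

As noted above, this only helps when peers map the same external port as their internal one, since mDNS only reveals the port a peer listens on locally.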

Together, avoiding ports that mDNS reports as already in use and choosing a random port each time will greatly reduce the likelihood of a NAT table collision.

TheRook · 2023-02-06T19:11:06Z

This bug can be reproduced by booting five identical VMs behind a NAT. If each has an identical libp2p configuration, the attempt to set up a new route will conflict with existing routes, which is why random port numbers will improve network connectivity and reduce conflicts.

An easier test would be to run 10 libp2p processes with identical configuration. All 10 should be able to quickly accept incoming connections without conflicting with any existing NAT table entries. Using the same port is very important for old services, but I am hard-pressed to find any advantage in preferring one port for Kad, only disadvantages: it makes it a lot easier for malicious entities to filter Kad's traffic with a single iptables rule.

p-shahi added the kind/bug and need/analysis labels Feb 6, 2023

TheRook mentioned this issue Mar 6, 2023

[META] Private Addresses In Public ipfs/kubo#6932

Open
