Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unstable NAT traversal from a natmanger port collision #2059

Open
TheRook opened this issue Feb 5, 2023 · 1 comment
Open

Unstable NAT traversal from a natmanger port collision #2059

TheRook opened this issue Feb 5, 2023 · 1 comment
Labels
kind/bug A bug in existing code (including security flaws) need/analysis Needs further analysis before proceeding

Comments

@TheRook
Copy link

TheRook commented Feb 5, 2023

The natmanger does not check to see if a port is already in use by another peer on the network before trying to clobber it with a new route:
https://github.com/libp2p/go-libp2p/blob/master/p2p/host/basic/natmgr.go#L129-L134

The result is that if a number of go-libp2p clients all try to use QUIC UDP in a large NAT like a collage dorm room, then no one will be able to do so in a stable way. This will cause problems for issue #1785.

Worse yet - you can't know how a given router is going to respond to aggressive multi-pronged TCP/UDP hole punching for the same port. Dozens of client is an extreme, but even two or more clients may case problems. Even one attempt at an existing route could disrupt network traffic for a buggy or cheap router, the client needs to do its best to avoid existing routes.

If the network interface is reporting a non-public IP address (RFC1918) then we can assume we are behind a NAT. We should see if there is an existing route before trying to setup any new routes. Which is not trivial which is why its not being currently done. It is better to randomly select a UDP port from a large range as to reduce the likelihood of collision as much as possible. If there is a UDP or TCP holepunch failure, the client should try selecting another random port above 1024 and try again, do so on three random ports before giving up.

Now if two libp2p nodes are on the same local network - then they might have found each-other using mDNS. If mDNS is reporting the use of a specific port, natmanger should avoid making routes using this external port. (Which would only work if the clients follow the pattern of mapping the same external port the internal - which afaik go-libp2p natmanger is doing already.)

Both avoiding already used ports reported by mDNS and choosing a random port each time will greatly reduce the likelihood of a NAT table collision.

@TheRook
Copy link
Author

TheRook commented Feb 6, 2023

This bug can be recreated by booting up 5 identical VMs under a NAT - if each has an identical libp2p configuration then the attempt to setup a new route will conflict with existing routes - which is why random port numbers will improve network connectivity and reduce conflict.

An easier test would be to run 10 libp2p processes with identical configuration - all 10 should be able to quickly accept incoming connections without conflicting with any existing NAT table entries. The idea of using the same port is very important for old services, but I am hard-pressed to find any advantage is preferring to use one port for Kad - and only disadvantages in making it a lot easier for malicious entities to filter Kad's traffic using a single iptables rule.

@p-shahi p-shahi added kind/bug A bug in existing code (including security flaws) need/analysis Needs further analysis before proceeding labels Feb 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws) need/analysis Needs further analysis before proceeding
Projects
None yet
Development

No branches or pull requests

2 participants