Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Occasional failure to discover NAT which prevents NAT port from getting mapped #3138

Open
2color opened this issue Jan 15, 2025 · 3 comments · May be fixed by #3154
Open

Occasional failure to discover NAT which prevents NAT port from getting mapped #3138

2color opened this issue Jan 15, 2025 · 3 comments · May be fixed by #3154
Assignees

Comments

@2color
Copy link
Contributor

2color commented Jan 15, 2025

Description

go-libp2p v0.38.1 does not discover NAT reliably, preventing from a port mapping from getting created even though it should work.

This has been observed with the AutoTLS example. Most times, NAT is discovered correctly and a port mapping is created, however, occasionally, NAT isn't detected and the following error is logged:

2025-01-15T11:48:23.026+0100	INFO	basichost	basic/natmgr.go:112	DiscoverNAT error:no NAT found

When this happens, the process just hangs and nothing happens even after waiting (~20 minutes) no new even or log is emitted.

Interestingly, starting the process again will usually fix this.

Additional context

I have observed this on a Mac behind NAT with both IPv4 and IPv6:

en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=6460<TSO4,TSO6,CHANNEL_IO,PARTIAL_CSUM,ZEROINVERT_CSUM>
	ether bc:d0:74:2d:3e:a6
	inet6 fe80::35:e41d:ea71:2da7%en0 prefixlen 64 secured scopeid 0x10
	inet 192.168.178.21 netmask 0xffffff00 broadcast 192.168.178.255
	inet6 fd00:1e42:719d:0:835:dcc8:a2f2:b2e5 prefixlen 64 autoconf secured
	inet6 2a02:3100:1125:ab00:ca3:b8e6:6a6b:6d9 prefixlen 64 autoconf secured
	inet6 2a02:3100:1125:ab00:915a:bbac:62ae:6911 prefixlen 64 autoconf temporary
	nd6 options=201<PERFORMNUD,DAD>
	media: autoselect
	status: active

What's particularly interesting is that the call to host.Addrs() clearly returns private IPs:

2025-01-15T11:48:23.012+0100	INFO	autotls-example	autotls/main.go:138	Addresses: [/ip4/77.188.127.11/tcp/50234 /ip4/127.0.0.1/tcp/5500 /ip4/192.168.178.21/tcp/5500 /ip6/::1/tcp/5500 /ip6/2a02:3100:1125:ab00:915a:bbac:62ae:6911/tcp/5500]
2025-01-15T11:48:23.012+0100	INFO	autotls-example.autotls.start	client/acme.go:358	found preexisting cert for "*.k51qzi5uqu5dh1mnrwyvgvnregc6s3dzggb63nbv05tfuw5vk3rftyba8yoa8w.libp2p.direct" in local storage
2025-01-15T11:48:23.015+0100	DEBUG	autotls-example.autotls	[email protected]/cache.go:271	added certificate to cache	{"subjects": ["*.k51qzi5uqu5dh1mnrwyvgvnregc6s3dzggb63nbv05tfuw5vk3rftyba8yoa8w.libp2p.direct"], "expiration": "2025-04-15T09:31:38.000Z", "managed": true, "issuer_key": "acme-staging-v02.api.letsencrypt.org-directory", "hash": "49f83249a59505880c23de69c503c2eda2d6a3424a606788f0736d1b5399baaf", "cache_size": 1, "cache_capacity": 0}
2025-01-15T11:48:23.015+0100	INFO	autotls-example	autotls/main.go:144	TLS certificate loaded
2025-01-15T11:48:23.015+0100	INFO	autotls-example	autotls/main.go:145	Addresses: [/ip4/77.188.127.11/tcp/50234 /ip4/77.188.127.11/tcp/50234/tls/sni/77-188-127-11.k51qzi5uqu5dh1mnrwyvgvnregc6s3dzggb63nbv05tfuw5vk3rftyba8yoa8w.libp2p.direct/ws /ip4/127.0.0.1/tcp/5500 /ip4/192.168.178.21/tcp/5500 /ip6/::1/tcp/5500 /ip6/2a02:3100:1125:ab00:915a:bbac:62ae:6911/tcp/5500 /ip6/2a02:3100:1125:ab00:915a:bbac:62ae:6911/tcp/5500/tls/sni/2a02-3100-1125-ab00-915a-bbac-62ae-6911.k51qzi5uqu5dh1mnrwyvgvnregc6s3dzggb63nbv05tfuw5vk3rftyba8yoa8w.libp2p.direct/ws]
2025-01-15T11:48:23.026+0100	INFO	basichost	basic/natmgr.go:112	DiscoverNAT error:no NAT found

In another instance of this bug I got the following log (whereby the NAT discovery failed before the logging of the host's addresses, most likely because NAT detection runs in a background goroutine:

2025-01-15T12:24:11.480+0100	INFO	basichost	basic/natmgr.go:112	DiscoverNAT error:no NAT found
2025-01-15T12:24:11.481+0100	INFO	autotls-example	autotls/main.go:138	Addresses: [/ip4/127.0.0.1/tcp/5500 /ip4/192.168.178.21/tcp/5500 /ip6/::1/tcp/5500]
2025-01-15T12:24:11.481+0100	INFO	autotls-example.autotls.start	client/acme.go:358	found preexisting cert for "*.k51qzi5uqu5dh1mnrwyvgvnregc6s3dzggb63nbv05tfuw5vk3rftyba8yoa8w.libp2p.direct" in local storage
2025-01-15T12:24:11.488+0100	DEBUG	autotls-example.autotls	[email protected]/cache.go:271	added certificate to cache	{"subjects": ["*.k51qzi5uqu5dh1mnrwyvgvnregc6s3dzggb63nbv05tfuw5vk3rftyba8yoa8w.libp2p.direct"], "expiration": "2025-04-15T09:31:38.000Z", "managed": true, "issuer_key": "acme-staging-v02.api.letsencrypt.org-directory", "hash": "49f83249a59505880c23de69c503c2eda2d6a3424a606788f0736d1b5399baaf", "cache_size": 1, "cache_capacity": 0}
2025-01-15T12:24:11.488+0100	INFO	autotls-example	autotls/main.go:144	TLS certificate loaded
2025-01-15T12:24:11.488+0100	INFO	autotls-example	autotls/main.go:145	Addresses: [/ip4/127.0.0.1/tcp/5500 /ip4/192.168.178.21/tcp/5500 /ip6/::1/tcp/5500]

Steps to Reproduce

Run the AutoTLS example on a computer behind NAT a number of times until you get the basichost basic/natmgr.go:112 DiscoverNAT error:no NAT found error.

NAT detection

The logic for NAT detection is encapsulated here:
https://github.com/libp2p/go-nat/blob/2fac9092f3fc908fb38bc52fa99234c9fc5f175c/nat.go#L88-L119

@MarcoPolo MarcoPolo added the help wanted Seeking public contribution on this issue label Jan 16, 2025
@MarcoPolo
Copy link
Collaborator

A good first step for debugging this issue would be to add error logging to the go-nat library. Right now if we see an error we simply return nothing.

@2color
Copy link
Contributor Author

2color commented Jan 24, 2025

Currently debugging this with https://github.com/libp2p/go-libp2p/tree/debug-nat-port-mapping which includes the changes from #3154

2025-01-24T20:05:26.858+0100	INFO	basichost	basic/natmgr.go:110	DiscoverNAT start
2025-01-24T20:05:26.859+0100	INFO	autotls-example	autotls/main.go:123	Host created with PeerID: 12D3KooWC9QahaawG3P5n829N5oeQbRwDwUeob1pB3sPZ5UPcCRm
2025-01-24T20:05:26.906+0100	INFO	basichost	basic/natmgr.go:113	DiscoverNAT error:no NAT found: []error{(*errors.errorString)(0x1400044c530)}
2025-01-24T20:05:26.907+0100	DEBUG	internal/nat	nat/nat.go:87	discoverUPNP_IG2 context done
2025-01-24T20:05:26.907+0100	DEBUG	internal/nat	nat/nat.go:76	discoverUPNP_IG1 context done

Two things that stand out:

  • how quickly it fails to detect NAT.
  • It reliably fails to map a port now with this version

@MarcoPolo
Copy link
Collaborator

We believe we've fixed this issue in #3154 . The fixes include:

  1. Running SSDP queries in serial rather than in parallel. The theory being that the FritzBox router may get confused with concurrent requests.
  2. When executing the fallback IGDev method, handle both InternetGateway V1 and V2 devices.

@MarcoPolo MarcoPolo linked a pull request Jan 28, 2025 that will close this issue
@MarcoPolo MarcoPolo self-assigned this Jan 28, 2025
@MarcoPolo MarcoPolo removed the help wanted Seeking public contribution on this issue label Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants