Investigate "edge: *** bit out of range 0 - FD_SETSIZE on fd_set ***: terminated" errors #28

Closed
hamishcoleman opened this issue May 7, 2024 · 7 comments
@hamishcoleman
Contributor

There have been some reports (ntop/n2n#1168 (comment)) of people getting this error.

@hamishcoleman hamishcoleman self-assigned this May 7, 2024
@hamishcoleman
Contributor Author

@nkxingxh, are you able to provide some details about what configuration you are using when you get this error?

The error suggests to me that a file descriptor has been used without being initialised, so there is probably some combination of configuration that leads to this happening.

@nkxingxh

nkxingxh commented May 7, 2024

I installed n3n_3.3.3-1_amd64.deb on debian11 (Linux Robin 5.10.0-28-cloud-amd64 #1 SMP Debian 5.10.209-2 (2024-01-31) x86_64 GNU/Linux) and debian12 (Linux debian 6.1.0-18-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux) using dpkg -i ....

Both machines use the following config file, and I'm having this problem on debian12.

I then tried installing n3n-3.3.3-1.x86_64.rpm on RockyLinux8, but the installation failed because the glibc version there is too old, so I compiled n3n on RockyLinux8 instead. That build works fine with the following configuration.

Here is my configuration file

[tuntap]
name=n3n0
# address=10.6.6.3/22
# address_mode=static
[community]
name=xxxx
supernode=xx.xxx.xxx.xxx:7777
key=xxxxxxxxxxxxxxxx
cipher=Speck
[auth]
password=xxxxxxxxxxxxxxxx
pubkey=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[connection]
bind=50001
description=xxxxx
allow_p2p=false
connect_tcp=true

The output looks like this

......
...... sessiondir: /run/n3n/edge
...... automatically assign IP address by supernode
*** bit out of range 0 - FD_SETSIZE on fd_set ***: terminated
Aborted

Sorry, I didn't keep a log from that time, but I'm sure that's what it looked like.

After that, I uploaded the compiled binary to my debian12 machine, and this time I didn't have the problem.

@hamishcoleman
Contributor Author

Thanks for those details - I'm guessing that this might be related to the "allow_p2p=false connect_tcp=true" options as the TCP mode has had less testing.

I will see if that information lets me reproduce the issue here

@hamishcoleman
Contributor Author

Just wanted to confirm that I can reproduce this on Debian 12 with the https://github.com/n42n/n3n/releases/download/3.3.3/n3n-3.3.3-1.x86_64.rpm binary, but not with locally compiled versions.

While it is clearly influenced by the build environment, I'm still assuming it is pointing at an actual code bug
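
A possible reason the build environment matters (an assumption, not something confirmed in this thread): the message text matches the range check that glibc applies to `FD_SET` and `FD_ISSET` when a program is compiled with `_FORTIFY_SOURCE` and optimisation enabled, which distribution package builds commonly turn on. Without fortification, an out-of-range descriptor such as -1 is undefined behaviour that may go unnoticed. The sketch below mirrors that check; `fd_fits_fd_set` is a hypothetical name used only for illustration and is not part of n3n or glibc.

```c
#include <stdbool.h>
#include <sys/select.h>

/*
 * Hypothetical helper (not n3n or glibc code): returns true only when
 * fd is a value that FD_SET / FD_ISSET can legally be given.
 * A fortified build aborts when this condition is false; an
 * unfortified build silently reads or writes outside the bit array.
 */
static bool fd_fits_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}
```

With this helper, `fd_fits_fd_set(-1)` is false, so a socket variable holding -1 would be exactly the kind of value that trips the check in a hardened binary.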

@nkxingxh

nkxingxh commented May 9, 2024

> Just wanted to confirm that I can reproduce this on Debian 12 with the https://github.com/n42n/n3n/releases/download/3.3.3/n3n-3.3.3-1.x86_64.rpm binary, but not with locally compiled versions.
>
> While it is clearly influenced by the build environment, I'm still assuming it is pointing at an actual code bug

Yes, this situation is the same as mine. The locally compiled version is working fine.

hamishcoleman added a commit to hamishcoleman/n3n that referenced this issue May 9, 2024
Some bug reports that edges using TCP mode are getting occasional
errors:
   *** bit out of range 0 - FD_SETSIZE on fd_set ***: terminated

We were not accounting for the fact that the TCP supernode socket file
descriptor is set to "-1" when it is disconnected and invalid.

Avoid "-1" file handles when adding to or checking signals for fd_set
contents.

This code path will hopefully also get smoothed out when the mainloop
conversion is added in - as it will have a systematic way to add/check
file descriptors.

(Addresses n42n#28)
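
The guard described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual n3n change: `fdset_add` and `fdset_has` are hypothetical helper names, and the assumption is only that a disconnected socket is stored as -1, as the commit message states.

```c
#include <sys/select.h>

/*
 * Hypothetical helper: add fd to the set only if it is usable.
 * Returns the updated highest descriptor, which select() needs
 * (as max_fd + 1) for its first argument.
 */
static int fdset_add(fd_set *set, int fd, int max_fd)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        /* Disconnected (-1) or too large: leave the set untouched. */
        return max_fd;
    }
    FD_SET(fd, set);
    return fd > max_fd ? fd : max_fd;
}

/*
 * Hypothetical helper: check readiness without ever passing an
 * invalid descriptor to FD_ISSET.
 */
static int fdset_has(fd_set *set, int fd)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        return 0;
    }
    return FD_ISSET(fd, set) ? 1 : 0;
}
```

A caller would start from `FD_ZERO(&set)` and `max_fd = -1`, pass each socket through `fdset_add`, call `select(max_fd + 1, &set, NULL, NULL, &timeout)`, and then test each socket with `fdset_has`. A socket that is currently -1 is simply skipped in both steps.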
@hamishcoleman
Contributor Author

I found a couple of bugs related to TCP mode. Version 3.3.4 includes the fixes, and I couldn't make it crash.

@hamishcoleman
Contributor Author

I've been running this new version for a day and a bit and could not trigger this error again.

Please feel free to re-open if you can reproduce the issue again with a newer version.
