Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: don't panic when IPv6 is not supported #248

Merged
merged 7 commits into from
Jul 2, 2024

Conversation

gabrielmougard
Copy link
Contributor

@gabrielmougard gabrielmougard commented Dec 12, 2023

closes #159

  • This fix enable the detection for a cluster member for IPv6 support to set the appropriate flag for hashicorp/mdns to avoid panicking if IPv6 is not supported,
  • This also contains a fix for avoiding an infinite timeout issue in the lookupPeers function.
  • However, the mDNS query (without support for IPv6) being received by the other cluster member mDNS servers, whose the responses are well written back to the UDPv4 connection opened on the server, are not received by the client and the reason still remains a mystery...

Copy link
Contributor

@markylaing markylaing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of small changes. Thanks.

microcloud/mdns/mdns.go Outdated Show resolved Hide resolved
microcloud/cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
Copy link
Contributor

@masnax masnax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple questions/changes, thanks!

microcloud/mdns/mdns.go Outdated Show resolved Hide resolved
microcloud/mdns/lookup.go Outdated Show resolved Hide resolved
microcloud/cmd/microcloud/main_init.go Outdated Show resolved Hide resolved
@tomponline
Copy link
Member

tomponline commented Jan 2, 2024

  • However, the mDNS query (without support for IPv6) being received by the other cluster member mDNS servers, whose the responses are well written back to the UDPv4 connection opened on the server, are not received by the client and the reason still remains a mystery...

Could you explain this one for me? Are you saying that if you have IPv6 disabled, this prevents the panic, but auto discovery doesn't work still?

Yes, exactly. I'm also trying to see if it could not be related to some settings in /proc/sys/net/ipv4/conf/{iface} like forwarding, mc_forwarding or rp_filter, but so far they seem to be ok so I think the problem might be elsewhere...

@gabrielmougard
Copy link
Contributor Author

@tomponline As this unexpected behaviour seems to be quite different (discovery issue vs panicking) from the original issue title, should I create a new related ticket ?

@tomponline
Copy link
Member

I would suggest that you keep the changes in this PR - ultimately if we cant get it working without ipv6 enabled then we need to detect that any show an error, rather than trying to proceed and then failing silently not being able to detect other members.

@tomponline
Copy link
Member

@gabrielmougard also could you quote and reply to my questions rather than editing them in the future. Thanks

@gabrielmougard
Copy link
Contributor Author

@masnax do you have an idea on what could be going wrong here? I must admit that I'm running out of ideas...

@masnax
Copy link
Contributor

masnax commented Jan 4, 2024

@gabrielmougard I'm not able to replicate the discovery issue with this branch myself.

I've tried setting disable_ipv6 to 1 for every combination of all, default, and enp5s0, and I've also manually cleared every ipv6 address with ip a del. At least on my end, the systems are all still discoverable with ipv4.

Can you provide some reproducer steps for the discoverability issue?

Also, I just remembered that we will have to add permissions to the snap to allow accessing those /proc/sys/net paths as otherwise it's going to result in an apparmor denial.

@gabrielmougard
Copy link
Contributor Author

@masnax should I open a new issue with a reproducer with this ipv4 discoverability issue I experience as this seems out of scope for this ticket (it is likely that my microcloud cluster is misconfigured somehow) so that we can close this one (avoiding the panicking issue) ?

@masnax
Copy link
Contributor

masnax commented Jan 24, 2024

@masnax should I open a new issue with a reproducer with this ipv4 discoverability issue I experience as this seems out of scope for this ticket (it is likely that my microcloud cluster is misconfigured somehow) so that we can close this one (avoiding the panicking issue) ?

It would be great if you could file an issue with reproducer steps. I think we ought to take a look at the issue further before we go ahead and merge this. If it ends up being unrelated then we can go ahead but we should investigate first.

@gabrielmougard
Copy link
Contributor Author

@masnax I made it work! It's been a while since I experienced the last IPv4 issue (so I don't really remember the setup steps I took), so now after following the official microcloud doc and manually disabling IPv6, I was able to get mDNS to work on IPv4 :). I updated the Lookup() function though, just to return an error early if IPv6 AND IPv4 are both not supported.

microcloud/mdns/mdns.go Outdated Show resolved Hide resolved
Copy link
Contributor

@markylaing markylaing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few minor tweaks :)

microcloud/mdns/mdns.go Outdated Show resolved Hide resolved
microcloud/mdns/mdns.go Outdated Show resolved Hide resolved
microcloud/mdns/mdns.go Outdated Show resolved Hide resolved
microcloud/mdns/lookup.go Outdated Show resolved Hide resolved
@gabrielmougard
Copy link
Contributor Author

@markylaing @tomponline updated

@masnax
Copy link
Contributor

masnax commented Apr 19, 2024

Also, I just remembered that we will have to add permissions to the snap to allow accessing those /proc/sys/net paths as otherwise it's going to result in an apparmor denial.

Looks like this is going to be an issue, judging from the tests:

Error: Failed to check IP status: open /proc/sys/net/ipv6/conf/all/disable_ipv6: permission denied

I'm not sure how feasible it would be to add read permissions for this path to MicroCloud. LXD probably already has it so I wonder if we could add something to lxc info --resources that records this information? @tomponline

@tomponline
Copy link
Member

Can you infer this from ipv6 link local addresses on the interfaces instead?

@masnax
Copy link
Contributor

masnax commented Apr 19, 2024

Can you infer this from ipv6 link local addresses on the interfaces instead?

I tried testing it locally and so far checking for the link local address seems sufficient.

Given just 1 interface enp5s0, the panic seems to occur in these 3 cases:

  • echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
  • echo 1 > /proc/sys/net/ipv6/conf/enp5s0/disable_ipv6
  • ip a del fe80... dev enp5s0

It doesn't happen when setting /proc/sys/net/ipv6/conf/default/disable_ipv6 because that seems to only apply to newly created interfaces.

And in all 3 cases, the link local address is indeed missing from ip a show, so I think it would be sufficient to check for that.

@gabrielmougard gabrielmougard force-pushed the fix/ipv6-disabled branch 2 times, most recently from 089976b to 1b75f96 Compare April 30, 2024 08:19
@gabrielmougard
Copy link
Contributor Author

@masnax @simondeziel tests should be good now :)

test/suites/basic.sh Outdated Show resolved Hide resolved
@tomponline
Copy link
Member

@simondeziel is the DCO check no longer occurring or happening elsewhere and needs to be removed from Required checks?

mdns/lookup.go Outdated Show resolved Hide resolved
@simondeziel
Copy link
Member

@tomponline I'll check with IS what's up with the DCO check. For this PR, I confirmed the 4 existing commits have the signed-off-by tag.

test/suites/basic.sh Outdated Show resolved Hide resolved
@gabrielmougard gabrielmougard force-pushed the fix/ipv6-disabled branch 4 times, most recently from 8c60714 to 1d136ef Compare June 7, 2024 10:44
@gabrielmougard
Copy link
Contributor Author

tests working now (except the usual timeouts in some interactive tests)

@gabrielmougard gabrielmougard force-pushed the fix/ipv6-disabled branch 2 times, most recently from 2ebc66a to 1ab3a7f Compare June 10, 2024 08:20
In case of mDNS discovery issue (no IPv6 support on a node for example),
the timeout never seems to occur leading to wait forever. This should fix the issue

Signed-off-by: Gabriel Mougard <[email protected]>
…overlay networking solution is available

Signed-off-by: Gabriel Mougard <[email protected]>
Copy link
Contributor

@masnax masnax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is 1 change here that needs to be made, particularly checking if FAN would be supported on all systems and not just the local system before proceeding.

But I can add that in the refactor PR, so this one can be merged as-is.

@masnax masnax dismissed markylaing’s stale review July 2, 2024 16:57

Changes addressed

@masnax masnax merged commit db1e041 into canonical:main Jul 2, 2024
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add option to disable ipv6 protocol in microcloud init command.
6 participants