dhclient does not decode 802.1q-encapsulated replies #114

Closed
ipartola opened this issue Apr 22, 2021 · 42 comments
Assignees
Labels
feature Adding new functionality

Comments

@ipartola

My ISP's DHCP server sends DHCPOFFER datagrams inside 802.1q-encapsulated frames with VLAN 0. It seems that OPNSense's dhclient ignores these. I found a similar issue in pfsense (https://redmine.pfsense.org/issues/8526) and a corresponding pull request (pfsense/FreeBSD-src#9). The actual code diff is here: pfsense/FreeBSD-src@15051bf

I don't see 802.1q handling in https://github.com/opnsense/src/blob/master/sbin/dhclient/bpf.c or https://github.com/opnsense/src/blob/master/sbin/dhclient/packet.c. It seems like this code should be added to OPNSense's dhclient code.

fichtner added the support (Community support) label Apr 22, 2021
@fichtner
Member

Hi @ipartola,

VLAN 0 is a Cisco addition for which there is no support in FreeBSD. While there is a patch, it heavily complicates our BPF filter, which already knows how to skip VLAN.

Does your ISP require you to run Cisco gear or can they disable the priority feature?

https://content.cisco.com/chapter.sjs?uri=/searchable/chapter/content/en/us/td/docs/ios-xml/ios/atm/configuration/15-mt/atm-15-mt-book/atm-15-mt-book_chapter_011000.html.xml

TBH, this also breaks QinQ so the question is why is this forced on customers in the first place.

Cheers,
Franco

@fichtner
Member

PS: The VLAN priority reference is here, and is not to be confused with VLAN 0 hijacking: 5e4e4f842b7

@ipartola
Author

ipartola commented Apr 22, 2021

@fichtner I have no Cisco gear, though I suspect they use it on their end. My whole setup consists of a XGS-PON ONT (model FOX222) and an amd64 box with an Intel dual NIC running OPNSense.

The behavior I see is that dhclient sends out a DHCPDISCOVER and the ISP's DHCP server responds with a DHCPOFFER, which dhclient ignores entirely:

01:09:47.918351 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from a0:ce:c8:01:05:8b (oui Unknown), length 300

01:09:47.935354 IP 32.217.174.1.bootps > 32.219.252.142.bootpc: BOOTP/DHCP, Reply, length 300

Looking at tcpdump further, I get:

11:56:43.560290 a4:7b:2c:29:53:74 > 80:61:5f:08:2d:7a, ethertype 802.1Q (0x8100), length 346: vlan 0, p 7, ethertype IPv4, 32.219.248.1.67 > 32.219.250.238.68: BOOTP/DHCP, Reply, length 300

This is consistent with the pfsense bug. The only workaround I've found searching the web is to put a managed switch between my ONT and the OPNSense WAN port to strip the 802.1q tag, but that seems like a hardware solution to a software problem. My old OpenWRT router has no problem with the ISP's setup and gets an IPv4 address immediately.
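
For reference, the "vlan 0, p 7" in that capture comes from the 16-bit tag control field that follows the 0x8100 ethertype in an 802.1Q header. Below is a minimal Python sketch of decoding it, assuming the frame is available as raw bytes. The function name and the sample header are made up for illustration and are not taken from dhclient.

```python
import struct

ETHERTYPE_8021Q = 0x8100
ETHERTYPE_IPV4 = 0x0800


def decode_vlan_tag(frame: bytes):
    """Return (priority, vlan_id, inner_ethertype) for an 802.1Q frame, else None."""
    if len(frame) < 18:
        return None
    # The outer ethertype sits right after the two 6-byte MAC addresses.
    (outer_type,) = struct.unpack_from("!H", frame, 12)
    if outer_type != ETHERTYPE_8021Q:
        return None
    tci, inner_type = struct.unpack_from("!HH", frame, 14)
    priority = tci >> 13      # top 3 bits: 802.1p priority
    vlan_id = tci & 0x0FFF    # low 12 bits: VLAN ID
    return priority, vlan_id, inner_type


# Illustrative header only: MAC addresses from the capture, priority 7, VLAN ID 0.
dst = bytes.fromhex("80615f082d7a")
src = bytes.fromhex("a47b2c295374")
tci = (7 << 13) | 0
sample = dst + src + struct.pack("!HHH", ETHERTYPE_8021Q, tci, ETHERTYPE_IPV4)
print(decode_vlan_tag(sample))  # (7, 0, 2048)
```

A VLAN ID of 0 with a non-zero priority is what "VLAN 0" refers to in this thread: the tag carries only priority information, and the payload behind it is ordinary IPv4.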

@fichtner
Member

IMO just talk to your ISP as they shouldn't force a maybe-VLAN0 on you. You obviously don't need it to communicate with them so they can avoid it too.

VLAN 0 needs support in the kernel in general, not just dhclient. The ISP could encapsulate any traffic it deems priority and you will never see it.

There may be a way with a bridge and VLAN 0 (if 0 is a supported setting), but I would not recommend it either way.

Cheers,
Franco

@fichtner
Member

And for further reference a FreeBSD ticket from 2018: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224961

@fichtner
Member

Ok so this pertains to the read filter, not write filter. I suppose we could patch it in but I don't trust a BPF filter that I haven't written...
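
To make the read-side decision concrete, here is a rough sketch in plain Python rather than BPF instructions. It is not the filter from either patch, only an illustration of the idea under discussion: treat a tag with VLAN ID 0 as "no VLAN" and look for the IPv4/UDP headers four bytes later. All names are hypothetical, and the sketch ignores IP fragments, which a real filter has to handle.

```python
import struct

ETHERTYPE_8021Q = 0x8100
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
DHCP_CLIENT_PORT = 68


def accepts_dhcp_reply(frame: bytes) -> bool:
    """Return True for IPv4/UDP frames to port 68, either untagged or tagged with VLAN ID 0."""
    if len(frame) < 14:
        return False
    offset = 12
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if ethertype == ETHERTYPE_8021Q:
        if len(frame) < 18:
            return False
        tci, ethertype = struct.unpack_from("!HH", frame, offset)
        if tci & 0x0FFF != 0:
            return False  # a real VLAN ID: not handled here
        offset += 4       # skip the tag, continue as if untagged
    if ethertype != ETHERTYPE_IPV4 or len(frame) < offset + 20:
        return False
    ihl = (frame[offset] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or frame[offset + 9] != IPPROTO_UDP:
        return False
    udp = offset + ihl
    if len(frame) < udp + 4:
        return False
    (dst_port,) = struct.unpack_from("!H", frame, udp + 2)
    return dst_port == DHCP_CLIENT_PORT


def build_test_frame(tag: bytes = b"") -> bytes:
    """Build a zero-filled frame with a minimal IPv4/UDP header (illustration only)."""
    ip = bytes([0x45]) + bytes(8) + bytes([IPPROTO_UDP]) + bytes(10)
    udp = struct.pack("!HHHH", 67, DHCP_CLIENT_PORT, 8, 0)
    return bytes(12) + tag + struct.pack("!H", ETHERTYPE_IPV4) + ip + udp


vlan0_tag = struct.pack("!HH", ETHERTYPE_8021Q, 7 << 13)        # priority 7, VLAN ID 0
vlan10_tag = struct.pack("!HH", ETHERTYPE_8021Q, (7 << 13) | 10)  # priority 7, VLAN ID 10
print(accepts_dhcp_reply(build_test_frame()))
print(accepts_dhcp_reply(build_test_frame(vlan0_tag)))
print(accepts_dhcp_reply(build_test_frame(vlan10_tag)))
```

The only difference between the untagged and the VLAN 0 path is the four-byte offset, which is why the change is confined to the read filter.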

fichtner added feature (Adding new functionality) and removed support (Community support) labels Apr 22, 2021
fichtner self-assigned this Apr 22, 2021
@ipartola
Author

@fichtner thank you I appreciate the consideration! I am happy to help test if that would be useful. I will reach out to my ISP but I don't have high hopes for their ability to sort it out based on my experiences with their level 1 & 2 support thus far.

@ipartola
Author

@fichtner Two things I have learned:

  1. Unofficial reply regarding what my ISP is doing was: "VLAN0 is so 802.1p can be passed". I haven't gotten any other info from them regarding this behavior and honestly don't really understand what they are doing with 802.1p here.

  2. I tried the following patch on 21.1.5 and it worked like a charm: 2d00172. If this looks good to you, I'm happy to make a pull request.

fichtner added a commit that referenced this issue Jun 10, 2021
VLAN ID 0 is supposed to be interpreted as having no VLAN with
a bit of priority on the side, but the kernel is not able to
decapsulate this on the fly so dhclient needs to take care of it.

This is similar to the VLAN ID send use case where we latch on
to a parent interface in order to be able to serve responses
with the correct VLAN priority.

Patch is inspired by the pfSense proposal below, but the filter
was rewritten for minimal impact.

PR:       #114
See also: pfsense/FreeBSD-src#9
@fichtner
Member

fichtner commented Jun 10, 2021

@ipartola alright, I tried to remove some fluff from the original patch... can you try this one instead? bf0e9ca

I could provide a build too. It looks like it doesn't break the non-VLAN case, but I cannot easily test the VLAN-0 case.

Cheers,
Franco

@ipartola
Author

@fichtner yes that works! Thank you!

@fichtner
Member

@ipartola hooray :) can you also try cbba3c3 on top... it's trying to clean up a few artefacts from when OpenBSD imported this in 2004 and nobody got rid of it apparently.

Cheers,
Franco

@ipartola
Author

@fichtner Yes, that works as well!

@fichtner
Member

Ok, I'm expecting these to land in 21.7-RC1. This is a bit too dangerous for 21.1.x and we are only one month away from the RC1 anyway. Close then? :)

fichtner added a commit that referenced this issue Jun 10, 2021
VLAN ID 0 is supposed to be interpreted as having no VLAN with
a bit of priority on the side, but the kernel is not able to
decapsulate this on the fly so dhclient needs to take care of it.

This is similar to the VLAN ID send use case where we latch on
to a parent interface in order to be able to serve responses
with the correct VLAN priority.

Patch is inspired by the pfSense proposal below, but the filter
was rewritten for minimal impact.

PR:       #114
See also: pfsense/FreeBSD-src#9
@ipartola
Author

That works for me. I am less affected since I can use the patched version so I can wait until whenever :).

Thanks again for fixing this!

@fichtner
Member

Thanks for bringing this to our attention in the first place. ❤️

@lattera
Contributor

lattera commented Jun 10, 2021

Is this something that could be upstreamed?

@fichtner
Member

If someone dares to review and accept it in phabricator, possibly yes.

@michaellacroix

@fichtner yes that works! Thank you!

How can I add this into my installation of pfsense? Thanks (Sorry I'm a noob in the github world)

@fichtner
Member

I’m not sure which patches pfSense uses but you can try sbin/dhclient from our Snapshot build (FreeBSD 12) https://pkg.opnsense.org/FreeBSD:12:amd64/snapshots/sets/base-21.7.b-amd64.txz

@michaellacroix

michaellacroix commented Jun 26, 2021 via email

@michaellacroix

I'm also having difficulty installing this on my opnsense install. Any help would be great. Thanks again.

@ipartola
Author

@michaellacroix

How can I add this into my installation of pfsense? Thanks (Sorry I'm a noob in the github world)

This is probably not ideal to discuss on the OPNSense GitHub issues when it has to do with a completely separate project, and as I used OPNSense I can't tell you whether this will even work, but what I did that worked for me was:

  1. Clone the OPNSense repository to my OPNSense box: git clone [GIT URL]
  2. Pull in the patched files into sbin/dhclient either with a text editor or just downloading them directly
  3. make && make install
  4. Release and renew your IP in the web UI

Note that this will only work to try the patched code since it'll be overwritten when you run updates.

@fichtner
Member

On OPNsense it's easy to install:

# opnsense-update -zbkr 21.7.b
# opnsense-shell reboot

Done :)

@michaellacroix

Thanks so much fichtner!!
And thank you also ipartola!

Works great for opnsense. Unfortunately I could not get this to work on pfsense as well. Pfsense does not have a compiler to use on it. Unless I can somehow use the opnsense package on pfsense?

@fichtner
Member

@michaellacroix Something like this probably:

# fetch https://pkg.opnsense.org/FreeBSD:12:amd64/snapshots/sets/base-21.7.b-amd64.txz
# tar vxf base-21.7.b-amd64.txz ./sbin/dhclient
# mv /sbin/dhclient /sbin/dhclient.orig
# cp ./sbin/dhclient /sbin/dhclient

Thanks for testing ❤️

@michaellacroix

I'm very happy to test. Unfortunately I hit a snag:
image

I can use winscp to copy file, will that work?

@fichtner
Member

I did not expect you would run this from / root directory. In that case just run tar command again and it will extract to /sbin/dhclient directly. Please note that /sbin/dhclient.orig is not the original due to this.

Cheers,
Franco

@michaellacroix

Thanks so much fichtner. Unfortunately it did not work.
If you have any other suggestions I would be happy to test.

@fichtner
Member

If you have an error I am happy to help, otherwise this is too vague to take a random guess.

Cheers,
Franco

@michaellacroix

Of course, I'll provide some cap files when I can. There are no error messages; it is just unable to obtain an IP address from DHCP.
Thanks,
Mike

@fichtner
Member

There's no use in running pcaps since we know the dhclient binary works, which points to an issue with shared libraries (no idea if 2.6.x uses FreeBSD 12.2 or not). You can test with:

# ldd /sbin/dhclient

Otherwise it's a configuration error should pfSense use other syntax that is not included in FreeBSD 12.1.

That's all it could be really. :)

Cheers,
Franco

@heyhewmike

Hello,
I am presently on opnsense 22.1.7 and unable to get an IP address from my ISP connected to an XGS-PON ONT (model FOX222).
My ISP appears to be sending 802.1Q VLAN packets with an ID of 0 (zero).
image

Has this fix been implemented in Release 22.1.7?

@fichtner
Member

Yes. It’s also included in FreeBSD 13.1 now.

@heyhewmike

Yes. It’s also included in FreeBSD 13.1 now.

Thank you.

@janstadt

Does anyone using Frontier Fiber know how to get this all to work? I have an ONT and a Frontier-provided router, and I want to get rid of the router and go directly from the ONT to Opnsense. Does anyone know if this works? I tried and wasn't successful, but I don't know if it's due to this specific issue or not. How would one go about determining this?

@fichtner
Member

You need to capture the packets in front of the frontier provided router to see what it's doing. Not sure if related to this or not. It differs from provider to provider.

@janstadt

Thanks @fichtner. So Wireshark on some machine connected directly to the ONT would tell me what I need? I can't believe how these providers are doing all this. There shouldn't be any reason to need 2 boxes (ONT and their provided router) in front of my stuff just to make things work. This is me venting about the industry and nothing directed at you or opnsense. I called their tech support and they were absolutely useless (basically what I expected). There's gotta be someone else out there with this setup who can hopefully provide me with a decent writeup or some link that explains what I need to do.

@ipartola
Author

@janstadt This setup currently works for me with stock opnsense 24.1. I guess the patch was included in FreeBSD sometime in 2022. In theory you shouldn't need anything besides just connecting your box to the ONT and firing up the DHCP client. In practice your issue could be anything ranging from plugging in the wrong port or using a bad cable to funky firewall rules.

It's been a few years since I had to deal with this, thank Zeus, but basically what I did was SSH to my opnsense box and run tcpdump listening on port 67. I would stop the dhclient service and then run it from a separate shell so I would only get my specific request traffic. What I saw was that the response from Frontier's DHCP server was tagged with VLAN 0 and opnsense's dhclient was not responding to it, which led me down the rabbit hole with the similar issue in pfsense, etc.

I would direct you to https://forum.opnsense.org/index.php?board=1.0 for general support if this isn't your issue, folks there are very helpful.

@janstadt

Thanks @ipartola. I just commented on a Frontier post in the opnsense forums and will see if that gets me anywhere. I am certain the ports and cables are fine. Firewall might be another thing. I have a few rules, but most are vanilla as well. I've also enabled/disabled crowdsec without any success. Do you have Frontier as a provider as well? I'll keep digging. I'm sure there's an answer out there.

@ipartola
Author

@janstadt I do have Frontier in CT, same setup as when I originally opened the ticket and everything does work for me with no issues. My guess is that this isn't your issue but try to capture some traffic to confirm what's going on. The command I used:

tcpdump -len -i igb0 | grep BOOTP

where igb0 is my WAN interface name.
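
The -e flag in that command is what makes tcpdump print the link-level header, including the "vlan N, p M" part seen earlier in this thread. As a rough aid for reading the capture, here is a small Python sketch that pulls the VLAN ID and priority out of such a line. The regular expression is based only on the single output line quoted above and may need adjusting for other tcpdump versions; the function name is made up.

```python
import re

# Pattern derived from the one tcpdump -e line quoted earlier in this thread.
VLAN_RE = re.compile(r"ethertype 802\.1Q \(0x8100\).*?vlan (\d+), p (\d+)")


def vlan_info(line: str):
    """Return (vlan_id, priority) if the line shows an 802.1Q tag, else None."""
    match = VLAN_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


tagged = (
    "11:56:43.560290 a4:7b:2c:29:53:74 > 80:61:5f:08:2d:7a, "
    "ethertype 802.1Q (0x8100), length 346: vlan 0, p 7, ethertype IPv4, "
    "32.219.248.1.67 > 32.219.250.238.68: BOOTP/DHCP, Reply, length 300"
)
print(vlan_info(tagged))  # (0, 7)
```

If the DHCP replies come back with a VLAN ID of 0 and the client still ignores them, this issue is a likely candidate. If there is no tag at all, the cause is probably something else.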

@heyhewmike

@janstadt

I had this same issue Spring 2022 when I had a pcengines board as my SOC for OPNSense.

I never had it working correctly with Frontier in CT and had no support from Frontier. I saw it was because of how Frontier had been dealing with their DHCP server requests and responses and VLAN tagging.

I personally gave up on Frontier and returned to Comcast without any issues.

I am now running a Dell XPS for OPNSense and not sure I will try going back to Frontier.
