Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace OpenWrt 19.07 switch config style with OpenWrt 21.02 one in proto-lan and network.lua's device parser #959

Merged
merged 8 commits into from
Mar 26, 2023

Conversation

ilario
Copy link
Member

@ilario ilario commented Dec 11, 2022

See #951

Specifically, here I tried to address two big differences that can be spotted in the reports attached in #951:

  • the fact that with current code and when DSA is present, bat0 interface was not being added to br-lan bridge, see DSA config: bat0 does not get into the bridge #958
  • the fact that with current code and when DSA is present, eth0 was selected for addition to the bridge
  • the fact that with current code and when DSA is present, the VLAN interfaces from the ethernet ports were not added inside bat0

On the last point, there are two big doubts:

  • with DSA, eth0.1 does not exist anymore and has been replaced by lan1, lan2, lan3... So which interfaces should be created and added to bat0? I decided to create a VLAN for each of these, for example lan1_29, lan2_29, lan3_29...
  • why the WAN interface was added to bat0? With the current code, there are interfaces like eth0-2_29 (a VLAN created on top of the WAN) being added to bat0, but this sounds useless and unsafe to me. So I did not add WAN interface to bat0, but only LAN ones.

I coarsely tested on a single router, so this PR is not ready to merge at all.

Indeed, what I observed is that only lan1_29 gets added to bat0, while lan2_29... do not get added, even if in /etc/config/network they are all indicated to be part of bat0 in the same way. No idea what is happening here.

@ilario ilario requested a review from G10h4ck December 11, 2022 21:19
@codecov-commenter
Copy link

codecov-commenter commented Dec 11, 2022

Codecov Report

Merging #959 (68e6bb9) into master (c9aa653) will decrease coverage by 0.23%.
The diff coverage is 40.74%.

❗ Current head 68e6bb9 differs from pull request most recent head ed56e82. Consider uploading reports for the commit ed56e82 to get more accurate results

@@            Coverage Diff             @@
##           master     #959      +/-   ##
==========================================
- Coverage   77.72%   77.48%   -0.24%     
==========================================
  Files          52       52              
  Lines        4359     4384      +25     
==========================================
+ Hits         3388     3397       +9     
- Misses        971      987      +16     
Impacted Files Coverage Δ
...ges/lime-system/files/usr/lib/lua/lime/network.lua 72.58% <7.14%> (-3.10%) ⬇️
...s/lime-system/files/usr/lib/lua/lime/proto/lan.lua 75.00% <75.00%> (-1.09%) ⬇️
...kages/lime-system/files/usr/lib/lua/lime/utils.lua 85.71% <80.00%> (-0.09%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ilario
Copy link
Member Author

ilario commented Dec 13, 2022

Indeed, what I observed is that only lan1_29 gets added to bat0, while lan2_29... do not get added, even if in /etc/config/network they are all indicated to be part of bat0 in the same way.

Just thought (will test ASAP): maybe only lan1_29 was added because it was the only Ethernet port with an active link (to my laptop). So maybe the others would get added as soon as the lower lanX interface gets up.

@ilario
Copy link
Member Author

ilario commented Dec 19, 2022

I can confirm that, with the current state of this PR, it works on OpenWrt 19.07.
Here you have the lime-report for a router running OpenWrt 19.07, and some LibreMesh packages from this pull request:
OpenWrt19-with_lime-system_lime-proto-anygw-batadv-babeld.txt

and you can compare it with the one obtained without this PR here: #951 (comment)

@ilario ilario marked this pull request as ready for review December 20, 2022 17:42
@G10h4ck
Copy link
Member

G10h4ck commented Jan 4, 2023

I don't have DSA devices at hand, @ilario can you test it with a few of them and see if everything looks good?

@ilario
Copy link
Member Author

ilario commented Jan 4, 2023

I thought I had more of the supported ones, but actually I just have two (I will check if some more, but if any they will be very old):

I tested on the first (both with OpenWrt 22.03 and 19.07) but not yet on the second (I started testing on Ubiquiti NanoStation M5 XM and NS Loco M2 XM and was going to test on TP-Link TL-WDR3600 before realizing they were not ported yet to DSA, maybe I could compile master including the pull request you mentioned in the meeting).

@G10h4ck
Copy link
Member

G10h4ck commented Jan 4, 2023

Having a mixed setup of DSA and non-DSA routers is ideal for testing this PR, all must work fine on all of them, at that point we can merge it

@ilario
Copy link
Member Author

ilario commented Feb 12, 2023

Just tested on OpenWrt 22.03 on a non-DSA device (Ubiquiti NanoStation M5 XW) and the /etc/config/network file is quite similar to the DSA case, not to the OpenWrt 19.07 case.
I didn't expect that, but obviously it makes sense.
So, instead of checking for DSA - not-DSA we should check for pre-22.03 or 22.03.

Considering this mess, and the need to restructure everything related to the firewall (OpenWrt 22.03 replaced iptables and ebtables with nftables, which means that installing a simple package like lime-proto-wan can pulll a ton of useless dependencies) I would suggest to not support anything older than OpenWrt 22.03 for the next release.
This would also mean creating a branch for compiling on OpenWrt 19.07 (for example it can be named 2020.2 practically creating a new release following the last one 2020.1), using the master branch for developing the next release and indicating in the documentation that for compiling an image the 2020.2 branch should be used.

@ilario
Copy link
Member Author

ilario commented Feb 12, 2023

So, instead of checking for DSA - not-DSA we should check for pre-22.03 or 22.03.

This is not difficult, eh, I can modify the PR if we consider that is too early for abandoning OpenWrt 19.07...

@G10h4ck
Copy link
Member

G10h4ck commented Feb 14, 2023

* why the WAN interface was added to bat0? With the current code, there are interfaces like eth0-2_29 (a VLAN created on top of the WAN) being added to bat0, but this sounds useless and unsafe to me. So I did not add WAN interface to bat0, but only LAN ones.

It is not useless, and not dangerous, please keep this behaviour unchanged, so keep the wan interface used for mesh too by default

@ilario
Copy link
Member Author

ilario commented Feb 14, 2023

Ok, anyway the firewall will block it, so ok.

@G10h4ck
Copy link
Member

G10h4ck commented Feb 14, 2023

Just tested on OpenWrt 22.03 on a non-DSA device (Ubiquiti NanoStation M5 XW) and the /etc/config/network file is quite similar to the DSA case, not to the OpenWrt 19.07 case. I didn't expect that, but obviously it makes sense. So, instead of checking for DSA - not-DSA we should check for pre-22.03 or 22.03.

Considering this mess, and the need to restructure everything related to the firewall (OpenWrt 22.03 replaced iptables and ebtables with nftables, which means that installing a simple package like lime-proto-wan can pulll a ton of useless dependencies) I would suggest to not support anything older than OpenWrt 22.03 for the next release. This would also mean creating a branch for compiling on OpenWrt 19.07 (for example it can be named 2020.2 practically creating a new release following the last one 2020.1), using the master branch for developing the next release and indicating in the documentation that for compiling an image the 2020.2 branch should be used.

With so many changes what IMHO should we do, is decide which OpenWrt version is our development target, if we decide to target a new non-compatible version we should tag/branch the last commit compatible with the previous version as target_OpenWrt_XX.XX, if we use a tag it means that commit is the last supporting that OpenWrt version carved in stone, aka absolutely no more works gets there, if we branch we remain open to the possibility to accept some very very very well tested bug fix commit ending up there.

@G10h4ck
Copy link
Member

G10h4ck commented Feb 14, 2023

Ok, anyway the firewall will block it, so ok.

Also because of that it should not be installed :-p

@ilario
Copy link
Member Author

ilario commented Feb 14, 2023

With so many changes what IMHO should we do, is decide which OpenWrt version is our development target

Yep, after a very short discussion on the Element chat, it was decided to go on supporting only OpenWrt 22.03 on the master branch.

if we decide to target a new non-compatible version we should tag/branch the last commit compatible with the previous version as target_OpenWrt_XX.XX, if we use a tag it means that commit is the last supporting that OpenWrt version carved in stone, aka absolutely no more works gets there, if we branch we remain open to the possibility to accept some very very very well tested bug fix commit ending up there.

Yep, I think that a branch is better.
What do you think about making a minor release like 2020.2 with a tag and a target_OpenWrt_19.07 branch for including bug fixes? There are plenty of trivial things that as soon as get fixed can go also there as they affect OpenWrt 22.03 and 19.07 in the same way.

Ok, anyway the firewall will block it, so ok.
Also because of that it should not be installed :-p

Let's keep the discussion on libremesh/libremesh.github.io#139 (or #280).

@ilario
Copy link
Member Author

ilario commented Feb 20, 2023

* why the WAN interface was added to bat0? With the current code, there are interfaces like eth0-2_29 (a VLAN created on top of the WAN) being added to bat0, but this sounds useless and unsafe to me. So I did not add WAN interface to bat0, but only LAN ones.

It is not useless, and not dangerous, please keep this behaviour unchanged, so keep the wan interface used for mesh too by default

Even if I include wan there, WAN-WAN connection will be broken anyway as it is currently broken (as currently the WAN port gets included in br-lan, and so I still don't understand why people use that WAN-WAN thing).

The main reason why I need to include wan here, is because otherwise the lime-proto-wan and lime-hwd-openwrt-wan packages will not configure the WAN port, I suspect...? I should try... can anyone confirm?

So the possible solutions are 2, that I can think of:

  1. we add wan here.
    Then, for people willing to do WAN-WAN connections, we document how to manually remove WAN from br-lan and take down the firewall rules blocking incoming connections on WAN (main ticket Understand and document how to use WAN-WAN connections for connecting different clouds #976);
  2. we add wan here and we modify lan.lua for recognizing the WAN port and avoiding to add it to the br-lan bridge, something like this
    if ifname:match("^wlan") then return end

    but it would be easy just with DSA as the WAN interface is clearly named wan. In swconfig it would be a mess, ideas on clean ways for doing that @G10h4ck?
    Then, for people willing to do WAN-WAN connections, we document how to take down the firewall rules blocking incoming connections on WAN (main ticket Understand and document how to use WAN-WAN connections for connecting different clouds #976).

@ilario
Copy link
Member Author

ilario commented Feb 20, 2023

Even if I include wan there, WAN-WAN connection will be broken anyway as it is currently broken (as currently the WAN port gets included in br-lan, and so I still don't understand why people use that WAN-WAN thing).

Maybe I observed that just because I didn't have lime-hwd-openwrt-wan installed, as otherwise it should create specific configuration for the WAN interface and not include it in the lan bridge. If this is the case, sorry, I thought it was a general behavior.

Anyway, the complete absence of users commenting on the wan-wan use-case make me think that we are struggling to support something that nobody uses.

@ilario ilario changed the title Adapt proto-lan and network.lua's device parser to DSA Replace OpenWrt 19.07 switch config style with OpenWrt 21.02 one in proto-lan and network.lua's device parser Feb 21, 2023
@ilario
Copy link
Member Author

ilario commented Feb 22, 2023

What is remaining to do:

  • update the tests

And I also kept the DSA checker even if it is not being used anymore in this PR, supposing that is going to be useful in other occasions.

@ilario
Copy link
Member Author

ilario commented Feb 22, 2023

In order to be able to update the tests, I would need someone to prepare a Docker image based on OpenWrt 22.03 to be uploaded on Docker Hub https://hub.docker.com/u/libremesh who can do that? @spiccinini @altergui

@G10h4ck
Copy link
Member

G10h4ck commented Feb 22, 2023

I think we should keep this whole PR for after the retargetting, so please keep de DSA detection code as it will be useful

@ilario ilario added this to the 2023.1 milestone Feb 23, 2023
ilario added 2 commits March 3, 2023 15:55

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@ilario ilario force-pushed the network.lua-22.03 branch from 3be314b to 277e414 Compare March 3, 2023 14:55
@G10h4ck
Copy link
Member

G10h4ck commented Mar 22, 2023

I have been pushing some improvements here https://github.com/G10h4ck/lime-packages/tree/test_PR_959

@ilario
Copy link
Member Author

ilario commented Mar 22, 2023

I saw that you implemented the usage of device additionally to that of ifname.
I think we can directly remove the usage of ifname as I believe that since OpenWrt 21.02 it has been renamed to device, see https://openwrt.org/releases/21.02/notes-21.02.0#new_network_configuration_syntax_and_boardjson_change

@G10h4ck
Copy link
Member

G10h4ck commented Mar 22, 2023

Thanks for suggestion and for pointing out documentation, I have fixed it as you suggested

@ilario
Copy link
Member Author

ilario commented Mar 22, 2023

I took the liberty to include your commits here, as they look correct.
Tell me if you prefer to keep the things separated.

@G10h4ck
Copy link
Member

G10h4ck commented Mar 23, 2023

I think we should merge this for now, next iteration is to get rid of iptables/ebtables, ans see if we can do what we need with firewall4 or if we need to write nftables rules by hand

@ilario
Copy link
Member Author

ilario commented Mar 23, 2023

Wait, before merging this I think we need to do the following:

  • create a branch for compiling on OpenWrt 19.07
  • write on the development page of the website that the compilation should be done using that branch
  • write the same thing on the mailing list and on the Element group

Do you agree?

@G10h4ck
Copy link
Member

G10h4ck commented Mar 23, 2023

Wait, before merging this I think we need to do the following:

* create a branch for compiling on OpenWrt 19.07

Already done that https://github.com/libremesh/lime-packages/tree/OpenWrt_19.07_compatible

I live the rest to you ;)

* write on the development page of the website that the compilation should be done using that branch

* write the same thing on the mailing list and on the Element group

Do you agree?

@aparcar
Copy link
Member

aparcar commented Mar 23, 2023

So this is ready to merge now?

@ilario
Copy link
Member Author

ilario commented Mar 24, 2023

Yes, I think it is ready to merge. I could not test @G10h4ck 's commits yet but he surely did.
Anyway I would leave a couple of grace days for the community to read the announcement email https://lists.autistici.org/message/20230323.212314.230ce707.en.html as this pull request will break their builds otherwise.

@G10h4ck
Copy link
Member

G10h4ck commented Mar 24, 2023

Yeah I have tested the commits, those are not enough to run properly libremesh on top of newer OpenWrt, and hence as @ilario said the resulting build will be broken, but I am already working on more stuff to push in another PR soon that should fix at least what I have detected so far.

@aparcar
Copy link
Member

aparcar commented Mar 24, 2023

Sound great. I'd be happy to rest things. Please let me know

@G10h4ck G10h4ck merged commit 282f9d5 into libremesh:master Mar 26, 2023
@aparcar
Copy link
Member

aparcar commented Mar 26, 2023

WAN isn't detect on my device...

@/usr/lib/lua/lime/hwd/openwrt_wan.lua:53 detect_hardware WAN interface: wan
network.scandevices.owrt_ifname_parser found ifname wlan0-ap
network.scandevices.dev_parser found WiFi device wlan0-ap
network.scandevices.owrt_ifname_parser found ifname wlan0-apname
network.scandevices.dev_parser found WiFi device wlan0-apname
network.scandevices.owrt_ifname_parser found ifname wlan0-mesh
network.scandevices.dev_parser found WiFi device wlan0-mesh
network.scandevices.owrt_ifname_parser found ifname wlan1-ap
network.scandevices.dev_parser found WiFi device wlan1-ap
network.scandevices.owrt_ifname_parser found ifname wlan1-apname
network.scandevices.dev_parser found WiFi device wlan1-apname
network.scandevices.owrt_ifname_parser found ifname wlan1-mesh
network.scandevices.dev_parser found WiFi device wlan1-mesh
network.scandevices.owrt_device_parser found base interface not_found and derived device br-lan
network.scandevices.dev_parser got nil device
network.scandevices.owrt_device_parser found interface br-lan with port bat0
network.scandevices.owrt_device_parser found interface br-lan with port lan1
network.scandevices.dev_parser found LAN port lan1 and marking eth0 as nobridge
network.scandevices.owrt_device_parser found interface br-lan with port lan4
network.scandevices.dev_parser found LAN port lan4 and marking eth0 as nobridge
network.scandevices.owrt_device_parser found interface br-lan with port lan3
network.scandevices.dev_parser found LAN port lan3 and marking eth0 as nobridge
network.scandevices.owrt_device_parser found interface br-lan with port lan2
network.scandevices.dev_parser found LAN port lan2 and marking eth0 as nobridge
network.scandevices.dev_parser found plain Ethernet device eth0

Is it possible that the wan interface isn't scanned at all?

@ilario
Copy link
Member Author

ilario commented Mar 26, 2023

Seems that lime-hwd-openwrt-wan is working correctly.
But for some reason, the wan.configure() from lime-proto-wan does not get called.

@G10h4ck
Copy link
Member

G10h4ck commented Mar 27, 2023

WAN isn't detect on my device...

@/usr/lib/lua/lime/hwd/openwrt_wan.lua:53 detect_hardware WAN interface: wan

Is it possible that the wan interface isn't scanned at all?

The interface is correctly scanned and it is called wan maybe it get filtered out later in the code, I remember we have regexp filter to match ethernet interfaces that before were called ethX, so maybe some of those filters is silently dropping the interface at some point, I don't have a DSA based device at hand so if you can debug where the wan interface get discarded would be good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants