Can't delete OVS datapath port after renaming interface #284
Comments
CC: @dickmanmaor, @girishmg
This is happening because we are renaming the device owned by OVS using the ip-link(8) command, and that rename is reflected on the netdev unbeknownst to OVS. This causes a disconnect between OVS and the kernel. The only way to get out of this is to restart ovs-vswitchd, @aserdean?
Yeah, renaming the interface while it is attached to OVS is not a great idea. The main reason is that the OVS database stores port names, i.e. the configuration is applied to a port by name in most cases, and many other operations inside ovs-vswitchd expect the correct name. A restart may fix the problem, because OVS removes all the ports it doesn't recognize on startup. I'm not sure why the dpctl call doesn't work, though. There might be some clash on the kernel side as well, I guess. You may try removing the datapath port by the port number instead: Note: Please, don't use
Using the port number works. Thanks, @igsilya. The larger question is whether we can do anything in OVS to prevent this. When we delete an OVS port from an OVS bridge, should we also try to delete it from the datapath using the port number instead of the name? (Do both: first try the name, then try the port number.)
@girishmg, @igsilya thanks for the quick feedback.
As Girish mentioned above, we should have something on the OVS side to mitigate this. I was wondering if we could enforce
FWIW, we faced somewhat the same challenge on Windows, and there we added two properties to figure out whether a port is added by both OVS and the kernel.
L.E. Sorry for using
On restart ovs-vswitchd will dump all the ports from the datapath and remove ones that it doesn't recognize. This is done for a case where users change the database while ovs-vswitchd is down.
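The startup cleanup described above amounts to comparing a dump of the datapath ports with the names the database knows about. A minimal sketch of that comparison follows; it is a toy model and not ovs-vswitchd code. The function name `stale_datapath_ports`, the port numbers, and the name `eth1_renamed` are hypothetical, and it ignores the tunnel-port exception mentioned elsewhere in the thread.

```python
def stale_datapath_ports(datapath_ports, configured_names):
    """Return the datapath port numbers whose current name is unknown to the database.

    datapath_ports: dict mapping datapath port number -> current kernel name.
    configured_names: set of port names stored in the database.
    """
    return sorted(
        port_no
        for port_no, name in datapath_ports.items()
        if name not in configured_names
    )


# Hypothetical state after the interface was renamed underneath OVS.
datapath_ports = {1: "br0", 2: "eth1_renamed"}
configured_names = {"br0", "eth1"}

stale = stale_datapath_ports(datapath_ports, configured_names)
print(stale)  # port 2 no longer matches any configured name
```

In this model the renamed port is the only one flagged, which matches the observation that a restart clears the stuck port.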
I don't think it's a good thing to do, as we will most likely just shift the problem to another module that is not expecting the name change to fail. One possible solution from the OVS side is to try to delete stale datapath ports every once in a while. For example, we could trigger removal of stale ports while processing ovs-vsctl port-del if we can't find a corresponding datapath port to remove. It's still a bit tricky, because sometimes a datapath port is legitimately removed underneath us (e.g. tunctl -d, or unloading a kernel module that is responsible for a virtual interface). I'm still not sure why del-if with a name fails, though, because it is supposed to talk directly to the kernel, and if the current name is used it should be able to find that port by name. Strange.
In the kernel, the OVS vports are stored in a hash table [1] that uses the dev name as a key [2]. [1] static struct hlist_head *dev_table; [2]
Hmm. Thanks, @dickmanmaor! We may also consider just removing the port from the datapath on the kernel side on rename. But garbage collection from userspace might be a better call.
I posted a patch that, I hope, should fix most of the issues: https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ The kernel part still needs fixing though.
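To illustrate why a table keyed by device name can lose track of a renamed port, here is a toy model. It is not kernel code. It assumes that the entry stays in the bucket chosen at insertion time and that lookup compares against the device's current name; both are assumptions of this sketch, and the class names, port number, and `eth1_renamed` are hypothetical.

```python
class Device:
    """Stand-in for a network device whose name can change."""

    def __init__(self, name):
        self.name = name


class VportTable:
    """Toy hash table: the bucket is chosen from the name at insertion time."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, name):
        # Deterministic toy hash so the example behaves the same on every run.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def add(self, port_no, dev):
        self._bucket(dev.name).append((port_no, dev))

    def lookup(self, name):
        # Walk the bucket for `name` and compare against each device's current name.
        for port_no, dev in self._bucket(name):
            if dev.name == name:
                return port_no
        return None


table = VportTable()
dev = Device("eth1")
table.add(2, dev)

before = table.lookup("eth1")             # found while the name is unchanged
dev.name = "eth1_renamed"                 # rename without rehashing the entry
after_old = table.lookup("eth1")          # right bucket, but the name no longer matches
after_new = table.lookup("eth1_renamed")  # name matches, but a different bucket is searched
print(before, after_old, after_new)
```

Under these assumptions neither the old nor the new name finds the port after the rename, which would be consistent with deletion by name failing while deletion by port number still works.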
OVS configuration is based on port names and OpenFlow port numbers. Names are stored in the database and translated later to OF ports. On the datapath level, each port has a name and a datapath port number. Port name in the database has to match datapath port name, unless it's a tunnel port.

If a datapath port is renamed with 'ip link set DEV name NAME', ovs-vswitchd will wake up, destroy all the OpenFlow-related structures and clean other things up. This is because the port no longer represents the port from the database due to a name difference.

However, ovs-vswitchd will not actually remove the port from the datapath, because it thinks that this port is no longer there. This is happening because lookup is performed by name and the name has changed. As a result we have a port in a datapath that is not related to any port known to ovs-vswitchd, and ovs-vswitchd can't remove it. This port also occupies a datapath port number and prevents the port from being added back with a new name.

Fix that by performing lookup by a datapath port number during the port destruction. The name was used only to avoid spurious warnings in a normal case where the port was successfully deleted by other parts of OVS. Adding an extra flag to avoid these warnings instead.

Fixes: 02f8d64 ("ofproto-dpif: Query port existence by name to prevent warnings.")
Reported-at: openvswitch/ovs-issues#284
Signed-off-by: Ilya Maximets <[email protected]>
Signed-off-by: 0-day Robot <[email protected]>
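The difference between the two lookup strategies in the commit message can be sketched with a small model. This is not the ofproto-dpif implementation. The `Datapath` class, both `destruct_*` functions, and the port values are hypothetical names used only for illustration.

```python
class Datapath:
    """Toy datapath: maps a datapath port number to the port's current name."""

    def __init__(self, ports):
        self.ports = dict(ports)

    def query_by_name(self, name):
        for port_no, current in self.ports.items():
            if current == name:
                return port_no
        return None

    def query_by_number(self, port_no):
        return port_no in self.ports

    def delete(self, port_no):
        del self.ports[port_no]


def destruct_by_name(dp, remembered_name):
    """Old behaviour in this model: skip deletion if the name is not found."""
    port_no = dp.query_by_name(remembered_name)
    if port_no is None:
        return False
    dp.delete(port_no)
    return True


def destruct_by_number(dp, remembered_port_no):
    """Fixed behaviour in this model: look the port up by its datapath number."""
    if not dp.query_by_number(remembered_port_no):
        return False
    dp.delete(remembered_port_no)
    return True


# Userspace remembers ("eth1", 2); the kernel has since renamed the device.
dp = Datapath({2: "eth1_renamed"})
removed_by_name = destruct_by_name(dp, "eth1")
still_present = dp.query_by_number(2)
removed_by_number = destruct_by_number(dp, 2)
print(removed_by_name, still_present, removed_by_number, dp.ports)
```

In this model the name-based path leaves the port occupying datapath port number 2, while the number-based path removes it regardless of the rename.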
Thanks for the quick fix! I'm unsure if I should leave the issue open until the kernel counterpart is fixed as well.
OVS configuration is based on port names and OpenFlow port numbers. Names are stored in the database and translated later to OF ports. On the datapath level, each port has a name and a datapath port number. Port name in the database has to match datapath port name, unless it's a tunnel port.

If a datapath port is renamed with 'ip link set DEV name NAME', ovs-vswitchd will wake up, destroy all the OpenFlow-related structures and clean other things up. This is because the port no longer represents the port from the database due to a name difference.

However, ovs-vswitchd will not actually remove the port from the datapath, because it thinks that this port is no longer there. This is happening because lookup is performed by name and the name has changed. As a result we have a port in a datapath that is not related to any port known to ovs-vswitchd, and ovs-vswitchd can't remove it. This port also occupies a datapath port number and prevents the port from being added back with a new name.

Fix that by performing lookup by a datapath port number during the port destruction. The name was used only to avoid spurious warnings in a normal case where the port was successfully deleted by other parts of OVS. Adding an extra flag to avoid these warnings instead.

Fixes: 02f8d64 ("ofproto-dpif: Query port existence by name to prevent warnings.")
Reported-at: openvswitch/ovs-issues#284
Tested-by: Alin-Gabriel Serdean <[email protected]>
Acked-by: Alin-Gabriel Serdean <[email protected]>
Acked-by: Aaron Conole <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Hi,
We are hitting an issue in which we can't add a port to OVS anymore, because the netdev has been renamed while it was managed by OVS.
Steps to reproduce:
Consider the following host setup:
and on that host we are trying to do the following:
This will bring us to the error we are hitting:
Trying to delete the port eth1 from userspace, or even forcing it from the datapath, does not work.
CC: @igsilya