Open vSwitch http://openvswitch.org
A: Open vSwitch is a production quality open source software switch designed to be used as a vswitch in virtualized server environments. A vswitch forwards traffic between different VMs on the same physical host and also forwards traffic between VMs and the physical network. Open vSwitch supports standard management interfaces (e.g. sFlow, NetFlow, IPFIX, RSPAN, CLI), and is open to programmatic extension and control using OpenFlow and the OVSDB management protocol.
Open vSwitch is designed to be compatible with modern switching chipsets. This means that it can be ported to existing high-fanout switches, allowing the same flexible control of the physical infrastructure as the virtual infrastructure. It also means that Open vSwitch will be able to take advantage of on-NIC switching chipsets as their functionality matures.
A: Open vSwitch can currently run on any Linux-based virtualization platform (kernel 2.6.32 and newer), including: KVM, VirtualBox, Xen, Xen Cloud Platform, XenServer. As of Linux 3.3 it is part of the mainline kernel. The bulk of the code is written in platform-independent C and is easily ported to other environments. We welcome inquiries about integrating Open vSwitch with other virtualization platforms.
A: The Open vSwitch source code can be built on a Linux system. You can build and experiment with Open vSwitch on any Linux machine. Packages are available for many Linux distributions, including Debian, Ubuntu, and Fedora.
You may also download and run a virtualization platform that already has Open vSwitch integrated. For example, download a recent ISO for XenServer or Xen Cloud Platform. Be aware that the version integrated with a particular platform may not be the most recent Open vSwitch release.
A: No, Open vSwitch has been ported to a number of different operating systems and hardware platforms. Most of the development work occurs on Linux, but the code should be portable to any POSIX system. We've seen Open vSwitch ported to a number of different platforms, including FreeBSD, Windows, and even non-POSIX embedded systems.
By definition, the Open vSwitch Linux kernel module only works on Linux and will provide the highest performance. However, a userspace datapath is available that should be very portable.
A: The PORTING.md document describes how one would go about porting Open vSwitch to a new operating system or hardware platform.
A: Open vSwitch is specially designed to make it easier to manage VM network configuration and monitor state spread across many physical hosts in dynamic virtualized environments. Please see WHY-OVS.md for a more detailed description of how Open vSwitch relates to the Linux Bridge.
Q: How is Open vSwitch related to distributed virtual switches like the VMware vNetwork distributed switch or the Cisco Nexus 1000V?
A: Distributed vswitch applications (e.g., VMware vNetwork distributed switch, Cisco Nexus 1000V) provide a centralized way to configure and monitor the network state of VMs that are spread across many physical hosts. Open vSwitch is not a distributed vswitch itself; rather, it runs on each physical host and supports remote management in a way that makes it easier for developers of virtualization/cloud management platforms to offer distributed vswitch capabilities.
To aid in distribution, Open vSwitch provides two open protocols that are specially designed for remote management in virtualized network environments: OpenFlow, which exposes flow-based forwarding state, and the OVSDB management protocol, which exposes switch port state. In addition to the switch implementation itself, Open vSwitch includes tools (ovs-ofctl, ovs-vsctl) that developers can script and extend to provide distributed vswitch capabilities that are closely integrated with their virtualization management platform.
A: Open vSwitch is intended to be a useful component for building flexible network infrastructure. There are many different approaches to distribution which balance trade-offs between simplicity, scalability, hardware compatibility, convergence times, logical forwarding model, etc. The goal of Open vSwitch is to support all of them as a primitive building block rather than to choose a particular point in the distributed design space.
A: You can start by joining the mailing lists and helping to answer questions. You can also suggest improvements to documentation. If you have a feature or bug you would like to work on, send a mail to one of the mailing lists:
http://openvswitch.org/mlists/
A: Starting in OVS 2.4, we switched the default ports to the IANA-specified port numbers for OpenFlow (6633->6653) and OVSDB (6632->6640). We recommend using these port numbers, but if you cannot, all the programs allow overriding the default port. See the appropriate man page.
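As a minimal sketch of overriding the default port, the helper below builds an active-TCP controller target with an explicit port. The function name `openflow_target`, the bridge `br0`, and the address 192.0.2.10 are illustrative assumptions; the commands are only printed, not executed.

```shell
# IANA-assigned OpenFlow port, the default in OVS 2.4 and later.
OPENFLOW_PORT=6653

# Print a connection target such as tcp:192.0.2.10:6653.
# $1 = controller IP address (hypothetical), $2 = optional port override.
openflow_target() {
    printf 'tcp:%s:%s\n' "$1" "${2:-$OPENFLOW_PORT}"
}

# Dry run: show the commands instead of executing them.
echo ovs-vsctl set-controller br0 "$(openflow_target 192.0.2.10)"
# A controller that still listens on the old port needs an explicit override.
echo ovs-vsctl set-controller br0 "$(openflow_target 192.0.2.10 6633)"
```

Spelling the port out in the target keeps a configuration working regardless of which default a given release uses.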
A: All official releases have been through a comprehensive testing process and are suitable for production use. Planned releases will occur several times a year. If a significant bug is identified in an LTS release, we will provide an updated release that includes the fix. Releases that are not LTS may not be fixed and may just be supplanted by the next major release. The current LTS release is 2.3.x.
A: The following table lists the Linux kernel versions against which the given versions of the Open vSwitch kernel module will successfully build. The Linux kernel versions are upstream kernel versions, so Linux kernels modified from the upstream sources may not build in some cases even if they are based on a supported version. This is most notably true of Red Hat Enterprise Linux (RHEL) kernels, which are extensively modified from upstream.
Open vSwitch | Linux kernel |
---|---|
1.4.x | 2.6.18 to 3.2 |
1.5.x | 2.6.18 to 3.2 |
1.6.x | 2.6.18 to 3.2 |
1.7.x | 2.6.18 to 3.3 |
1.8.x | 2.6.18 to 3.4 |
1.9.x | 2.6.18 to 3.8 |
1.10.x | 2.6.18 to 3.8 |
1.11.x | 2.6.18 to 3.8 |
2.0.x | 2.6.32 to 3.10 |
2.1.x | 2.6.32 to 3.11 |
2.3.x | 2.6.32 to 3.14 |
2.4.x | 2.6.32 to 4.0 |
2.5.x | 2.6.32 to 4.2 |
Open vSwitch userspace should also work with the Linux kernel module built into Linux 3.3 and later.
Open vSwitch userspace is not sensitive to the Linux kernel version. It should build against almost any kernel, certainly against 2.6.32 and later.
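The table lookup can be sketched as a small shell check. The example below hard-codes the 2.5.x row and tests whether a kernel release string falls inside it. The function names are hypothetical, and treating the upper bound as a whole kernel series (so that 4.2.5 counts as "4.2") is an assumption about how to read the table.

```shell
# Succeeds if dotted version $1 <= dotted version $2.
# Fields are compared numerically; missing fields count as 0.
version_le() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            if (x[i] + 0 < y[i] + 0) exit 0
            if (x[i] + 0 > y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Range for Open vSwitch 2.5.x, copied from the table above.
MIN_KERNEL=2.6.32
MAX_KERNEL=4.2

# $1 = kernel release string, e.g. the output of "uname -r".
kernel_supported() {
    # Strip any distribution suffix: "4.2.0-27-generic" -> "4.2.0"
    base=${1%%-*}
    # Assumption: the upper bound names a series, so compare major.minor only.
    series=$(printf '%s\n' "$base" | cut -d. -f1-2)
    version_le "$MIN_KERNEL" "$base" && version_le "$series" "$MAX_KERNEL"
}

if kernel_supported "$(uname -r)"; then
    echo "OVS 2.5.x kernel module should build against this kernel"
else
    echo "kernel is outside the range listed for OVS 2.5.x"
fi
```

This only mirrors the upstream-version table; as noted above, a distribution kernel inside the range may still fail to build.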
configure: error: Linux kernel in <dir> is version <x>, but
version newer than <y> is not supported (please refer to the
FAQ for advice)
What should I do?
A: You have the following options:
-
Use the Linux kernel module supplied with the kernel that you are using. (See also the following FAQ.)
-
If there is a newer released version of Open vSwitch, consider building that one, because it may support the kernel that you are building against. (To find out, consult the table in the previous FAQ.)
-
The Open vSwitch "master" branch may support the kernel that you are using, so consider building the kernel module from "master".
All versions of Open vSwitch userspace are compatible with all versions of the Open vSwitch kernel module, so you do not have to use the kernel module from one source along with the userspace programs from the same source.
Q: What features are not available in the Open vSwitch kernel datapath that ships as part of the upstream Linux kernel?
A: The kernel module in upstream Linux does not include support for LISP. Work is in progress to add support for LISP to the upstream Linux version of the Open vSwitch kernel module. For now, if you need this feature, use the kernel module from the Open vSwitch distribution instead of the upstream Linux kernel module.
Certain features require kernel support to function or to have reasonable performance. If the ovs-vswitchd log file indicates that a feature is not supported, consider upgrading to a newer upstream Linux release or using the kernel module paired with the userspace distribution.
Q: Why do tunnels not work when using a kernel module other than the one packaged with Open vSwitch?
A: Support for tunnels was added to the upstream Linux kernel module after the rest of Open vSwitch. As a result, some kernels may contain support for Open vSwitch but not tunnels. The minimum kernel version that supports each tunnel protocol is:
Protocol | Linux Kernel |
---|---|
GRE | 3.11 |
VXLAN | 3.12 |
Geneve | 3.18 |
LISP | |
STT | |
If you are using a version of the kernel that is older than the one listed above, it is still possible to use that tunnel protocol. However, you must compile and install the kernel module included with the Open vSwitch distribution rather than the one on your machine. If problems persist after doing this, check to make sure that the module that is loaded is the one you expect.
A: Generating outer UDP checksums requires kernel support that was not part of the initial implementation of these protocols. If using the upstream Linux Open vSwitch module, you must use kernel 4.0 or newer. The out-of-tree modules from Open vSwitch release 2.4 and later support UDP checksums.
A: Tunnel virtual ports are not supported, as described in the previous answer. It is also not possible to use queue-related actions. On Linux kernels before 2.6.39, maximum-sized VLAN packets may not be transmitted.
A: IPFIX flow monitoring requires the Linux kernel module from Linux 3.10 or later, or the out-of-tree module from Open vSwitch version 1.10.90 or later.
In general, the Open vSwitch userspace should be used with the kernel version included in the same release or with the version from upstream Linux. However, when upgrading between two releases of Open vSwitch it is best to migrate userspace first to reduce the possibility of incompatibilities.
A: Bridge compatibility was a feature of Open vSwitch 1.9 and earlier. When it was enabled, Open vSwitch imitated the interface of the Linux kernel "bridge" module. This allowed users to drop Open vSwitch into environments designed to use the Linux kernel bridge module without adapting the environment to use Open vSwitch.
Open vSwitch 1.10 and later do not support bridge compatibility. The feature was dropped because version 1.10 adopted a new internal architecture that made bridge compatibility difficult to maintain. Now that many environments use OVS directly, it would be rarely useful in any case.
To use bridge compatibility, install OVS 1.9 or earlier, including the accompanying kernel modules (both the main and bridge compatibility modules), following the instructions that come with the release. Be sure to start the ovs-brcompatd daemon.
Q: I thought Open vSwitch was a virtual Ethernet switch, but the documentation keeps talking about bridges. What's a bridge?
A: In networking, the terms "bridge" and "switch" are synonyms. Open vSwitch implements an Ethernet switch, which means that it is also an Ethernet bridge.
A: See the "VLAN" section below.
A: Add "tag=VLAN" to your "ovs-vsctl add-port" command. For example, the following commands configure br0 with eth0 as a trunk port (the default) and tap0 as an access port for VLAN 9:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 tap0 tag=9
If you want to configure an already added port as an access port, use "ovs-vsctl set", e.g.:
ovs-vsctl set port tap0 tag=9
A: The following commands configure br0 with eth0 and tap0 as trunk ports. All traffic coming in or going out on eth0 or tap0 is also mirrored to tap1; any traffic arriving on tap1 is dropped:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 tap0
ovs-vsctl add-port br0 tap1 \
-- --id=@p get port tap1 \
-- --id=@m create mirror name=m0 select-all=true output-port=@p \
-- set bridge br0 mirrors=@m
To later disable mirroring, run:
ovs-vsctl clear bridge br0 mirrors
A: Yes. How you configure it depends on what you mean by "promiscuous mode":
-
Conventionally, "promiscuous mode" is a feature of a network interface card. Ordinarily, a NIC passes to the CPU only the packets actually destined to its host machine. It discards the rest to avoid wasting memory and CPU cycles. When promiscuous mode is enabled, however, it passes every packet to the CPU. On an old-style shared-media or hub-based network, this allows the host to spy on all packets on the network. But in the switched networks that are almost everywhere these days, promiscuous mode doesn't have much effect, because few packets not destined to a host are delivered to the host's NIC.
This form of promiscuous mode is configured in the guest OS of the VMs on your bridge, e.g. with "ifconfig".
-
The VMware vSwitch uses a different definition of "promiscuous mode". When you configure promiscuous mode on a VMware vNIC, the vSwitch sends a copy of every packet received by the vSwitch to that vNIC. That has a much bigger effect than just enabling promiscuous mode in a guest OS. Rather than getting a few stray packets for which the switch does not yet know the correct destination, the vNIC gets every packet. The effect is similar to replacing the vSwitch by a virtual hub.
This "promiscuous mode" is what switches normally call "port mirroring" or "SPAN". For information on how to configure SPAN, see "How do I configure a port as a SPAN port, that is, enable mirroring of all traffic to that port?"
A: Firstly, you must have a DPDK-enabled version of Open vSwitch.
If your version is DPDK-enabled it will support the --dpdk argument on the command line and will display lines with "EAL:..." during startup when --dpdk is supplied.
Secondly, when adding a DPDK port, unlike a system port, the type for the interface must be specified. For example:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
Finally, it is required that DPDK port names begin with 'dpdk'.
See INSTALL.DPDK.md for more information on enabling and using DPDK with Open vSwitch.
Q: How do I configure a VLAN as an RSPAN VLAN, that is, enable mirroring of all traffic to that VLAN?
A: The following commands configure br0 with eth0 as a trunk port and tap0 as an access port for VLAN 10. All traffic coming in or going out on tap0, as well as traffic coming in or going out on eth0 in VLAN 10, is also mirrored to VLAN 15 on eth0. The original tag for VLAN 10, in cases where one is present, is dropped as part of mirroring:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 tap0 tag=10
ovs-vsctl \
-- --id=@m create mirror name=m0 select-all=true select-vlan=10 \
output-vlan=15 \
-- set bridge br0 mirrors=@m
To later disable mirroring, run:
ovs-vsctl clear bridge br0 mirrors
Mirroring to a VLAN can disrupt a network that contains unmanaged switches. See ovs-vswitchd.conf.db(5) for details. Mirroring to a GRE tunnel has fewer caveats than mirroring to a VLAN and should generally be preferred.
A: Yes, but mirroring to a VLAN strips the original VLAN tag in favor of the specified output-vlan. This loss of information may make the mirrored traffic too hard to interpret.
To mirror multiple VLANs, use the commands above, but specify a comma-separated list of VLANs as the value for select-vlan. To mirror every VLAN, use the commands above, but omit select-vlan and its value entirely.
When a packet arrives on a VLAN that is used as a mirror output VLAN, the mirror is disregarded. Instead, in standalone mode, OVS floods the packet across all the ports for which the mirror output VLAN is configured. (If an OpenFlow controller is in use, then it can override this behavior through the flow table.) If OVS is used as an intermediate switch, rather than an edge switch, this ensures that the RSPAN traffic is distributed through the network.
Mirroring to a VLAN can disrupt a network that contains unmanaged switches. See ovs-vswitchd.conf.db(5) for details. Mirroring to a GRE tunnel has fewer caveats than mirroring to a VLAN and should generally be preferred.
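The select-vlan variations described above can be sketched with a helper that assembles the mirror command for any number of VLANs. The function name `rspan_mirror_cmd` is hypothetical, and the command is printed rather than run so that it can be reviewed first.

```shell
# Print an ovs-vsctl command that mirrors traffic to an output VLAN.
# $1 = bridge, $2 = output VLAN, remaining arguments = VLANs to select.
# With no VLANs given, select-vlan is omitted so that every VLAN is mirrored.
rspan_mirror_cmd() (
    bridge=$1
    out_vlan=$2
    shift 2
    if [ "$#" -gt 0 ]; then
        # Join the selected VLANs with commas: 10 12 -> 10,12
        select=" select-vlan=$(IFS=,; printf '%s' "$*")"
    else
        select=""
    fi
    printf 'ovs-vsctl -- --id=@m create mirror name=m0 select-all=true%s output-vlan=%s -- set bridge %s mirrors=@m\n' \
        "$select" "$out_vlan" "$bridge"
)

rspan_mirror_cmd br0 15 10 12   # mirror VLANs 10 and 12 to VLAN 15
rspan_mirror_cmd br0 15         # mirror every VLAN to VLAN 15
```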
A: The following commands configure br0 with eth0 and tap0 as trunk ports. All traffic coming in or going out on eth0 or tap0 is also mirrored to gre0, a GRE tunnel to the remote host 192.168.1.10; any traffic arriving on gre0 is dropped:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 tap0
ovs-vsctl add-port br0 gre0 \
-- set interface gre0 type=gre options:remote_ip=192.168.1.10 \
-- --id=@p get port gre0 \
-- --id=@m create mirror name=m0 select-all=true output-port=@p \
-- set bridge br0 mirrors=@m
To later disable mirroring and destroy the GRE tunnel:
ovs-vsctl clear bridge br0 mirrors
ovs-vsctl del-port br0 gre0
A: No. ERSPAN is an undocumented proprietary protocol. As an alternative, Open vSwitch supports mirroring to a GRE tunnel (see above).
A: First, why do you want to do this? Two connected bridges are not much different from a single bridge, so you might as well just have a single bridge with all your ports on it.
If you still want to connect two bridges, you can use a pair of patch ports. The following example creates bridges br0 and br1, adds eth0 and tap0 to br0, adds tap1 to br1, and then connects br0 and br1 with a pair of patch ports.
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 tap0
ovs-vsctl add-br br1
ovs-vsctl add-port br1 tap1
ovs-vsctl \
-- add-port br0 patch0 \
-- set interface patch0 type=patch options:peer=patch1 \
-- add-port br1 patch1 \
-- set interface patch1 type=patch options:peer=patch0
Bridges connected with patch ports are much like a single bridge. For instance, if the example above also added eth1 to br1, and both eth0 and eth1 happened to be connected to the same next-hop switch, then you could loop your network just as you would if you added eth0 and eth1 to the same bridge (see the "Configuration Problems" section below for more information).
If you are using Open vSwitch 1.9 or an earlier version, then you need to be using the kernel module bundled with Open vSwitch rather than the one that is integrated into Linux 3.3 and later, because Open vSwitch 1.9 and earlier versions need kernel support for patch ports. This also means that in Open vSwitch 1.9 and earlier, patch ports will not work with the userspace datapath, only with the kernel module.
Q: How do I configure a bridge without an OpenFlow local port? (Local port in the sense of OFPP_LOCAL)
A: Open vSwitch does not support such a configuration. Bridges always have their local ports.
A: Open vSwitch uses different kinds of flows for different purposes:
-
OpenFlow flows are the most important kind of flow. OpenFlow controllers use these flows to define a switch's policy. OpenFlow flows support wildcards, priorities, and multiple tables.
When in-band control is in use, Open vSwitch sets up a few "hidden" flows, with priority higher than a controller or the user can configure, that are not visible via OpenFlow. (See the "Controller" section of the FAQ for more information about hidden flows.)
-
The Open vSwitch software switch implementation uses a second kind of flow internally. These flows, called "datapath" or "kernel" flows, do not support priorities and comprise only a single table, which makes them suitable for caching. (Like OpenFlow flows, datapath flows do support wildcarding, in Open vSwitch 1.11 and later.) OpenFlow flows and datapath flows also support different actions and number ports differently.
Datapath flows are an implementation detail that is subject to change in future versions of Open vSwitch. Even with the current version of Open vSwitch, hardware switch implementations do not necessarily use this architecture.
Users and controllers directly control only the OpenFlow flow table. Open vSwitch manages the datapath flow table itself, so users should not normally be concerned with it.
A: Open vSwitch has two kinds of flows (see the previous question), so it has commands with different purposes for dumping each kind of flow:
-
"ovs-ofctl dump-flows <br>" dumps OpenFlow flows, excluding hidden flows. This is the most commonly useful form of flow dump. (Unlike the other commands, this should work with any OpenFlow switch, not just Open vSwitch.)
-
"ovs-appctl bridge/dump-flows <br>" dumps OpenFlow flows, including hidden flows. This is occasionally useful for troubleshooting suspected issues with in-band control.
-
"ovs-dpctl dump-flows [dp]" dumps the datapath flow table entries for a Linux kernel-based datapath. In Open vSwitch 1.10 and later, ovs-vswitchd merges multiple switches into a single datapath, so it will show all the flows on all your kernel-based switches. This command can occasionally be useful for debugging.
-
"ovs-appctl dpif/dump-flows <br>", new in Open vSwitch 1.10, dumps datapath flows for only the specified bridge, regardless of the type.
A: Open vSwitch maintains snooping tables for each VLAN.
A: No. There are several reasons:
-
Kernel flows are not as sophisticated as OpenFlow flows, which means that some OpenFlow policies could require a large number of kernel flows. The "conjunctive match" feature is an extreme example: the number of kernel flows it requires is the product of the number of flows in each dimension.
-
With multiple OpenFlow flow tables and simple sets of actions, the number of kernel flows required can be as large as the product of the number of flows in each dimension. With more sophisticated actions, the number of kernel flows could be even larger.
-
Open vSwitch is designed so that any version of OVS userspace interoperates with any version of the OVS kernel module. This forward and backward compatibility requires that userspace observe how the kernel module parses received packets. This is only possible in a straightforward way when userspace adds kernel flows in reaction to received packets.
For more relevant information on the architecture of Open vSwitch, please read "The Design and Implementation of Open vSwitch", published in USENIX NSDI 2015.
A: The OVS kernel datapath may have been updated to a newer version than the OVS userspace components. Sometimes new versions of the OVS kernel module add functionality that is backwards compatible with older userspace components but may cause a drop in performance with them. In particular, if a kernel module from OVS 2.1 or newer is paired with OVS userspace 1.10 or older, there will be a performance drop for TCP traffic.
Updating the OVS userspace components to the latest released version should fix the performance degradation.
To get the best possible performance and functionality, it is recommended to pair the same versions of the kernel module and OVS userspace.
like these:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
and as soon as I ran the "add-port" command I lost all connectivity through eth0. Help!
A: A physical Ethernet device that is part of an Open vSwitch bridge should not have an IP address. If one does, then that IP address will not be fully functional.
You can restore functionality by moving the IP address to an Open vSwitch "internal" device, such as the network device named after the bridge itself. For example, assuming that eth0's IP address is 192.168.128.5, you could run the commands below to fix up the situation:
ifconfig eth0 0.0.0.0
ifconfig br0 192.168.128.5
(If your only connection to the machine running OVS is through the IP address in question, then you would want to run all of these commands on a single command line, or put them into a script.) If there were any additional routes assigned to eth0, then you would also want to use commands to adjust these routes to go through br0.
If you use DHCP to obtain an IP address, then you should kill the DHCP client that was listening on the physical Ethernet interface (e.g. eth0) and start one listening on the internal interface (e.g. br0). You might still need to manually clear the IP address from the physical interface (e.g. with "ifconfig eth0 0.0.0.0").
There is no compelling reason why Open vSwitch must work this way. However, this is the way that the Linux kernel bridge module has always worked, so it's a model that those accustomed to Linux bridging are already used to. Also, the model that most people expect is not implementable without kernel changes on all the versions of Linux that Open vSwitch supports.
By the way, this issue is not specific to physical Ethernet devices. It applies to all network devices except Open vSwitch "internal" devices.
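Because the first command can cut off a remote session, the two steps above are safer run from a script. The sketch below is one way to do that; the helper names `run` and `move_ip_to_bridge` and the DRY_RUN switch are assumptions added for illustration, and the script handles neither extra routes nor DHCP clients.

```shell
# By default the commands are only printed. Set DRY_RUN=0 to execute them
# (requires root).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Move an address from a physical port to the bridge's internal device.
# $1 = physical interface, $2 = bridge, $3 = IP address.
move_ip_to_bridge() {
    # Stop if clearing the old address fails, so the address is not duplicated.
    run ifconfig "$1" 0.0.0.0 &&
    run ifconfig "$2" "$3"
}

move_ip_to_bridge eth0 br0 192.168.128.5
```

Since both commands run locally in one invocation, the second still executes even if the SSH connection that started the script stalls after the first.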
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 eth1
and now my network seems to have melted: connectivity is unreliable (even connectivity that doesn't go through Open vSwitch), all the LEDs on my physical switches are blinking, wireshark shows duplicated packets, and CPU usage is very high.
A: More than likely, you've looped your network. Probably, eth0 and eth1 are connected to the same physical Ethernet switch. This yields a scenario where OVS receives a broadcast packet on eth0 and sends it out on eth1, then the physical switch connected to eth1 sends the packet back on eth0, and so on forever. More complicated scenarios, involving a loop through multiple switches, are possible too.
The solution depends on what you are trying to do:
-
If you added eth0 and eth1 to get higher bandwidth or higher reliability between OVS and your physical Ethernet switch, use a bond. The following commands create br0 and then add eth0 and eth1 as a bond:
ovs-vsctl add-br br0
ovs-vsctl add-bond br0 bond0 eth0 eth1
Bonds have tons of configuration options. Please read the documentation on the Port table in ovs-vswitchd.conf.db(5) for all the details.
Configuration for DPDK-enabled interfaces is slightly less straightforward: see INSTALL.DPDK.md.
-
Perhaps you don't actually need eth0 and eth1 to be on the same bridge. For example, if you simply want to be able to connect each of them to virtual machines, then you can put each of them on a bridge of its own:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-br br1
ovs-vsctl add-port br1 eth1
and then connect VMs to br0 and br1. (A potential disadvantage is that traffic cannot directly pass between br0 and br1. Instead, it will go out eth0 and come back in eth1, or vice versa.)
-
If you have a redundant or complex network topology and you want to prevent loops, turn on spanning tree protocol (STP). The following commands create br0, enable STP, and add eth0 and eth1 to the bridge. The order is important because you don't want to have a loop in your network even transiently:
ovs-vsctl add-br br0
ovs-vsctl set bridge br0 stp_enable=true
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 eth1
The Open vSwitch implementation of STP is not well tested. Please report any bugs you observe, but if you'd rather avoid acting as a beta tester then another option might be your best shot.
A: Wireless base stations generally only allow packets with the source MAC address of the NIC that completed the initial handshake. Therefore, without MAC rewriting, only a single device can communicate over a single wireless link.
This isn't specific to Open vSwitch; it's enforced by the access point, so the same problems will show up with the Linux bridge or any other way to do bridging.
A: PPP most commonly carries IP packets, but Open vSwitch works only with Ethernet frames. The correct way to interface PPP to an Ethernet network is usually to use routing instead of switching.
A: Yes. ovs-vswitchd.conf.db(5) is a comprehensive reference.
I only see a datapath called "ovs-system". How can I see datapath information about a particular bridge?
A: In version 1.9.0, OVS switched to using a single datapath that is shared by all bridges of that type. The "ovs-appctl dpif/*" commands provide similar functionality that is scoped by the bridge.
see the port in the datapath?
A: On Linux kernels before 3.11, the OVS GRE module and Linux GRE module cannot be loaded at the same time. It is likely that on your system the Linux GRE module is already loaded and blocking OVS (to confirm, check dmesg for errors regarding GRE registration). To fix this, unload all GRE modules that appear in lsmod as well as the OVS kernel module. You can then reload the OVS module following the directions in INSTALL.md, which will ensure that dependencies are satisfied.
A: It depends on the mechanisms and configurations you want to use.
You cannot usefully use typical packet filters, like iptables, on physical Ethernet ports that you add to an Open vSwitch bridge. This is because Open vSwitch captures packets from the interface at a layer below the one where typical packet-filter implementations install their hooks. (This actually applies to any interface of type "system" that you might add to an Open vSwitch bridge.)
You can usefully use typical packet filters on Open vSwitch internal ports as they are mostly ordinary interfaces from the point of view of packet filters.
For example, suppose you create a bridge br0 and add Ethernet port eth0 to it. Then you can usefully add iptables rules to affect the internal interface br0, but not the physical interface eth0. (br0 is also where you would add an IP address, as discussed elsewhere in the FAQ.)
For simple filtering rules, it might be possible to achieve similar results by installing appropriate OpenFlow flows instead.
If the use of a particular packet filter setup is essential, Open vSwitch might not be the best choice for you. On Linux, you might want to consider using the Linux Bridge. (This is the only choice if you want to use ebtables rules.) On NetBSD, you might want to consider using the bridge(4) with BRIDGE_IPF option.
then immediately put it back. For example, consider that p1 is a port of type=internal:
ovs-vsctl del-port br0 p1 -- \
add-port br0 p1 -- \
set interface p1 type=internal
A: It's an expected behaviour.
If del-port and add-port happen in a single OVSDB transaction, as in your example, Open vSwitch always "skips" the intermediate steps. Even if they are done in multiple transactions, it's still allowed for Open vSwitch to skip the intermediate steps and just implement the overall effect. In both cases, your example would be turned into a no-op.
If you want to make Open vSwitch actually destroy and then re-create the port for some side effect, such as resetting kernel settings for the corresponding interface, you need to separate the operations into multiple OVSDB transactions and ensure that at least the first one does not have --no-wait. In the following example, the first ovs-vsctl will block until Open vSwitch reloads the new configuration and removes the port:
ovs-vsctl del-port br0 p1
ovs-vsctl add-port br0 p1 -- \
set interface p1 type=internal
it takes too long (minutes or hours) to do it with ovs-vsctl. How can I do it faster?
A: If you add them one at a time with ovs-vsctl, it can take a long time to add thousands of ports to an Open vSwitch bridge. This is because every invocation of ovs-vsctl first reads the current configuration from OVSDB. As the number of ports grows, this starts to take an appreciable amount of time, and when it is repeated thousands of times the total time becomes significant.
The solution is to add the ports in one invocation of ovs-vsctl (or a small number of them). For example, using bash:
ovs-vsctl add-br br0
cmds=; for i in {1..5000}; do cmds+=" -- add-port br0 p$i"; done
ovs-vsctl $cmds
takes seconds, not minutes or hours, in the OVS sandbox environment.
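A variation on the same idea, sketched in portable shell: the argument list is accumulated in the positional parameters instead of an unquoted string, so port names are passed intact. The function name `add_ports`, the DRY_RUN switch, and the p1..pN naming are assumptions; by default the sketch only reports how many arguments it would pass.

```shell
# By default nothing is executed. Set DRY_RUN=0 to call ovs-vsctl for real.
DRY_RUN=${DRY_RUN:-1}

# Add ports p1..pN to a bridge in a single ovs-vsctl invocation.
# $1 = bridge, $2 = number of ports.
add_ports() (
    bridge=$1
    count=$2
    set --                          # reuse "$@" as the argument list
    i=1
    while [ "$i" -le "$count" ]; do
        set -- "$@" -- add-port "$bridge" "p$i"
        i=$((i + 1))
    done
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run ovs-vsctl with $# arguments"
    else
        ovs-vsctl "$@"
    fi
)

add_ports br0 100
```

For very large counts, the operating system's limit on command-line length may force splitting the work into a small number of invocations, which still avoids re-reading the configuration once per port.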
show", but "ovs-ofctl show br0" just prints "br0 is not a bridge
or a socket".
A: Open vSwitch wasn't able to create the bridge. Check the ovs-vswitchd log for details (Debian and Red Hat packaging for Open vSwitch put it in /var/log/openvswitch/ovs-vswitchd.log).
In general, the Open vSwitch database reflects the desired configuration state. ovs-vswitchd monitors the database and, when it changes, reconfigures the system to reflect the new desired state. This normally happens very quickly. Thus, a discrepancy between the database and the actual state indicates that ovs-vswitchd could not implement the configuration, and so one should check the log to find out why. (Another possible cause is that ovs-vswitchd is not running. This will make "ovs-vsctl" commands hang, if they change the configuration, unless one specifies "--no-wait".)
up in "ovs-vsctl show", but "ovs-vsctl list port" says that it has
OpenFlow port ("ofport") -1, and "ovs-ofctl show br0" doesn't show
vif1.0 at all.
A: Open vSwitch wasn't able to create the port. Check the ovs-vswitchd log for details (Debian and Red Hat packaging for Open vSwitch put it in /var/log/openvswitch/ovs-vswitchd.log). Please see the previous question for more information.
You may want to upgrade to Open vSwitch 2.3 (or later), in which ovs-vsctl will immediately report when there is an issue creating a port.
added it to a bridge, like this:
tunctl -t tap0
ifconfig tap0 192.168.0.123
ovs-vsctl add-br br0
ovs-vsctl add-port br0 tap0
I expected that I could then use this IP address to contact other
hosts on the network, but it doesn't work. Why not?
A: The short answer is that this is a misuse of a "tap" device. Use an "internal" device implemented by Open vSwitch, which works differently and is designed for this use. To solve this problem with an internal device, instead run:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 int0 -- set Interface int0 type=internal
ifconfig int0 192.168.0.123
Even more simply, you can take advantage of the internal port that every bridge has under the name of the bridge:
ovs-vsctl add-br br0
ifconfig br0 192.168.0.123
In more detail, a "tap" device is an interface between the Linux (or *BSD) network stack and a user program that opens it as a socket. When the "tap" device transmits a packet, it appears in the socket opened by the userspace program. Conversely, when the userspace program writes to the "tap" socket, the kernel TCP/IP stack processes the packet as if it had been received by the "tap" device.
Consider the configuration above. Given this configuration, if you "ping" an IP address in the 192.168.0.x subnet, the Linux kernel routing stack will transmit an ARP on the tap0 device. Open vSwitch userspace treats "tap" devices just like any other network device; that is, it doesn't open them as "tap" sockets. That means that the ARP packet will simply get dropped.
You might wonder why the Open vSwitch kernel module doesn't intercept the ARP packet and bridge it. After all, Open vSwitch intercepts packets on other devices. The answer is that Open vSwitch only intercepts received packets, but this is a packet being transmitted. The same thing happens for all other types of network devices, except for Open vSwitch "internal" ports. If you, for example, add a physical Ethernet port to an OVS bridge, configure an IP address on a physical Ethernet port, and then issue a "ping" to an address in that subnet, the same thing happens: an ARP gets transmitted on the physical Ethernet port and Open vSwitch never sees it. (You should not do that, as documented at the beginning of this section.)
It can make sense to add a "tap" device to an Open vSwitch bridge, if some userspace program (other than Open vSwitch) has opened the tap socket. This is the case, for example, if the "tap" device was created by KVM (or QEMU) to simulate a virtual NIC. In such a case, when OVS bridges a packet to the "tap" device, the kernel forwards that packet to KVM in userspace, which passes it along to the VM, and in the other direction, when the VM sends a packet, KVM writes it to the "tap" socket, which causes OVS to receive it and bridge it to the other OVS ports. Please note that in such a case no IP address is configured on the "tap" device (there is normally an IP address configured in the virtual NIC inside the VM, but this is not visible to the host Linux kernel or to Open vSwitch).
There is one special case in which Open vSwitch does directly read and write "tap" sockets. This is an implementation detail of the Open vSwitch userspace switch, which implements its "internal" ports as Linux (or *BSD) "tap" sockets. In such a userspace switch, OVS receives packets sent on the "tap" device used to implement an "internal" port by reading the associated "tap" socket, and bridges them to the rest of the switch. In the other direction, OVS transmits packets bridged to the "internal" port by writing them to the "tap" socket, causing them to be processed by the kernel TCP/IP stack as if they had been received on the "tap" device. Users should not need to be concerned with this implementation detail.
Open vSwitch has a network device type called "tap". This is intended only for implementing "internal" ports in the OVS userspace switch and should not be used otherwise. In particular, users should not configure KVM "tap" devices as type "tap" (use type "system", the default, instead).
A: Yes. For traffic that egresses from a switch, OVS supports traffic shaping; for traffic that ingresses into a switch, OVS supports policing. Policing is a simple form of quality-of-service that simply drops packets received in excess of the configured rate. Due to its simplicity, policing is usually less accurate and less effective than egress traffic shaping, which queues packets.
Keep in mind that ingress and egress are from the perspective of the switch. That means that egress shaping limits the rate at which traffic is transmitted from a physical interface, but on a virtual machine's VIF it limits the rate at which the VM receives traffic. For ingress policing, the behavior is the opposite.
A: Suppose that you want to set up bridge br0 connected to physical Ethernet port eth0 (a 1 Gbps device) and virtual machine interfaces vif1.0 and vif2.0, and that you want to limit traffic from vif1.0 to eth0 to 10 Mbps and from vif2.0 to eth0 to 20 Mbps. Then, you could configure the bridge this way:
ovs-vsctl -- \
add-br br0 -- \
add-port br0 eth0 -- \
add-port br0 vif1.0 -- set interface vif1.0 ofport_request=5 -- \
add-port br0 vif2.0 -- set interface vif2.0 ofport_request=6 -- \
set port eth0 qos=@newqos -- \
--id=@newqos create qos type=linux-htb \
other-config:max-rate=1000000000 \
queues:123=@vif10queue \
queues:234=@vif20queue -- \
--id=@vif10queue create queue other-config:max-rate=10000000 -- \
--id=@vif20queue create queue other-config:max-rate=20000000
At this point, bridge br0 is configured with the ports and eth0 is configured with the queues that you need for QoS, but nothing is actually directing packets from vif1.0 or vif2.0 to the queues that we have set up for them. That means that all of the packets to eth0 are going to the "default queue", which is not what we want.
We use OpenFlow to direct packets from vif1.0 and vif2.0 to the queues reserved for them:
ovs-ofctl add-flow br0 in_port=5,actions=set_queue:123,normal
ovs-ofctl add-flow br0 in_port=6,actions=set_queue:234,normal
Each of the above flows matches on the input port, sets up the appropriate queue (123 for vif1.0, 234 for vif2.0), and then executes the "normal" action, which performs the same switching that Open vSwitch would have done without any OpenFlow flows being present. (We know that vif1.0 and vif2.0 have OpenFlow port numbers 5 and 6, respectively, because we set their ofport_request columns above. If we had not done that, then we would have needed to find out their port numbers before setting up these flows.)
Now traffic going from vif1.0 or vif2.0 to eth0 should be rate-limited.
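If the ofport_request columns had not been set, the port numbers would have to be looked up before the flows could be written. The sketch below assumes that lookup is done with "ovs-vsctl get Interface ... ofport"; queue_flow_cmd is a hypothetical helper that only prints the resulting ovs-ofctl command so that it can be checked first.

```shell
# Hypothetical helper: print the ovs-ofctl command that steers traffic
# arriving on one OpenFlow port into a given queue.
queue_flow_cmd() {
    bridge=$1 ofport=$2 queue=$3
    echo "ovs-ofctl add-flow $bridge in_port=$ofport,actions=set_queue:$queue,normal"
}

# On a host with Open vSwitch, the port number could be looked up first:
#   ofport=$(ovs-vsctl get Interface vif1.0 ofport)
# Here the numbers requested in the example above are used directly.
queue_flow_cmd br0 5 123
queue_flow_cmd br0 6 234
```

With port numbers 5 and 6 this prints the same two commands as shown earlier in this answer.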
By the way, if you delete the bridge created by the above commands, with:
ovs-vsctl del-br br0
then that will leave one unreferenced QoS record and two unreferenced Queue records in the Open vSwitch database. One way to clear them out, assuming you don't have other QoS or Queue records that you want to keep, is:
ovs-vsctl -- --all destroy QoS -- --all destroy Queue
If you do want to keep some QoS or Queue records, or the Open vSwitch you are using is older than version 1.8 (which added the --all option), then you will have to destroy QoS and Queue records individually.
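Destroying records individually can be scripted. The following is a sketch under stated assumptions: gen_destroy is a hypothetical helper, and it assumes that record UUIDs are listed one per line, for example by "ovs-vsctl --bare --columns=_uuid list Queue". It prints one "ovs-vsctl destroy" command for every UUID that is not named as an argument.

```shell
# Hypothetical helper: read record UUIDs on stdin and print one
# "ovs-vsctl destroy" command for each UUID not listed as an argument.
gen_destroy() {
    table=$1
    shift
    keep=" $* "
    while read -r uuid; do
        [ -n "$uuid" ] || continue      # skip blank separator lines
        case "$keep" in
            *" $uuid "*) continue ;;    # a record the caller wants to keep
        esac
        echo "ovs-vsctl destroy $table $uuid"
    done
}

# Intended use on a host with Open vSwitch (KEEP_UUID is a placeholder):
#   ovs-vsctl --bare --columns=_uuid list QoS   | gen_destroy QoS KEEP_UUID   | sh
#   ovs-vsctl --bare --columns=_uuid list Queue | gen_destroy Queue KEEP_UUID | sh
```

Handling QoS records before Queue records is deliberate: a Queue that is still referenced by a QoS record may be rejected by the database when you try to destroy it.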
A: A policing policy can be configured on an interface to drop packets that arrive at a higher rate than the configured value. For example, the following commands will rate-limit traffic that vif1.0 may generate to 10 Mbps:
ovs-vsctl set interface vif1.0 ingress_policing_rate=10000
ovs-vsctl set interface vif1.0 ingress_policing_burst=1000
Traffic policing can interact poorly with some network protocols and can have surprising results. The "Ingress Policing" section of ovs-vswitchd.conf.db(5) discusses the issues in greater detail.
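The values in the example are in kbps for ingress_policing_rate and in kb for ingress_policing_burst, which is why 10 Mbps is written as 10000. A small sketch of that arithmetic follows. policing_cmds is a hypothetical helper, and sizing the burst at one tenth of the rate is an assumption that simply mirrors the ratio used in the example, not a recommendation.

```shell
# Hypothetical helper: print the policing commands for a rate given in Mbps.
policing_cmds() {
    iface=$1 mbps=$2
    rate_kbps=$((mbps * 1000))      # ingress_policing_rate is in kbps
    burst_kb=$((rate_kbps / 10))    # assumption: same 10:1 ratio as the example
    echo "ovs-vsctl set interface $iface ingress_policing_rate=$rate_kbps"
    echo "ovs-vsctl set interface $iface ingress_policing_burst=$burst_kb"
}

policing_cmds vif1.0 10
```

For 10 Mbps this prints the two commands shown above.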
adding records to the QoS and Queue table, but the results aren't what I expect.
A: Did you install OpenFlow flows that use your queues? This is the primary way to tell Open vSwitch which queues you want to use. If you don't do this, then the default queue will be used, which will probably not have the effect you want.
Refer to the previous question for an example.
doesn't yet support. How do I do that?
A: Open vSwitch does not implement QoS itself. Instead, it can configure some, but not all, of the QoS features built into the Linux kernel. If you need some QoS feature that OVS cannot configure itself, then the first step is to figure out whether Linux QoS supports that feature. If it does, then you can submit a patch to support Open vSwitch configuration for that feature, or you can use "tc" directly to configure the feature in Linux. (If Linux QoS doesn't support the feature you want, then first you have to add that support to Linux.)
working as well as I expect.
A: With the Linux kernel, the Open vSwitch implementation of QoS has two aspects:
-
Open vSwitch configures a subset of Linux kernel QoS features, according to what is in OVSDB. It is possible that this code has bugs. If you believe that this is so, then you can configure the Linux traffic control (QoS) stack directly with the "tc" program. If you get better results that way, you can send a detailed bug report to [email protected].
It is certain that Open vSwitch cannot configure every Linux kernel QoS feature. If you need some feature that OVS cannot configure, then you can also use "tc" directly (or add that feature to OVS).
-
The Open vSwitch implementation of OpenFlow allows flows to be directed to particular queues. This is pretty simple and unlikely to have serious bugs at this point.
However, most problems with QoS on Linux are not bugs in Open vSwitch at all. They tend to be either configuration errors (please see the earlier questions in this section) or issues with the traffic control (QoS) stack in Linux. The Open vSwitch developers are not experts on Linux traffic control. If you believe you are encountering a problem with Linux traffic control, we suggest that you consult the tc manpages (e.g. tc(8), tc-htb(8), tc-hfsc(8)), web resources (e.g. http://lartc.org/), or mailing lists (e.g. http://vger.kernel.org/vger-lists.html#netdev).
A: Since version 2.0, Open vSwitch has OpenFlow protocol support for OpenFlow meters. There is no implementation of meters in the Open vSwitch software switch (neither the kernel-based nor userspace switches).
A: At the simplest level, a VLAN (short for "virtual LAN") is a way to partition a single switch into multiple switches. Suppose, for example, that you have two groups of machines, group A and group B. You want the machines in group A to be able to talk to each other, and you want the machines in group B to be able to talk to each other, but you don't want the machines in group A to be able to talk to the machines in group B. You can do this with two switches, by plugging the machines in group A into one switch and the machines in group B into the other switch.
If you only have one switch, then you can use VLANs to do the same thing, by configuring the ports for machines in group A as VLAN "access ports" for one VLAN and the ports for group B as "access ports" for a different VLAN. The switch will only forward packets between ports that are assigned to the same VLAN, so this effectively subdivides your single switch into two independent switches, one for each group of machines.
So far we haven't said anything about VLAN headers. With access ports, like we've described so far, no VLAN header is present in the Ethernet frame. This means that the machines (or switches) connected to access ports need not be aware that VLANs are involved, just like in the case where we use two different physical switches.
Now suppose that you have a whole bunch of switches in your network, instead of just one, and that some machines in group A are connected directly to both switches 1 and 2. To allow these machines to talk to each other, you could add an access port for group A's VLAN to switch 1 and another to switch 2, and then connect an Ethernet cable between those ports. That works fine, but it doesn't scale well as the number of switches and the number of VLANs increase, because you use up a lot of valuable switch ports just connecting together your VLANs.
This is where VLAN headers come in. Instead of using one cable and two ports per VLAN to connect a pair of switches, we configure a port on each switch as a VLAN "trunk port". Packets sent and received on a trunk port carry a VLAN header that says what VLAN the packet belongs to, so that only two ports total are required to connect the switches, regardless of the number of VLANs in use. Normally, only switches (either physical or virtual) are connected to a trunk port, not individual hosts, because individual hosts don't expect to see a VLAN header in the traffic that they receive.
None of the above discussion says anything about particular VLAN numbers. This is because VLAN numbers are completely arbitrary. One must only ensure that a given VLAN is numbered consistently throughout a network and that different VLANs are given different numbers. (That said, VLAN 0 is usually synonymous with a packet that has no VLAN header, and VLAN 4095 is reserved.)
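In Open vSwitch terms, an access port is a port with a "tag" and a trunk port can be restricted to specific VLANs with "trunks". The following sketch illustrates the two roles. gen_vlan_ports is a hypothetical helper, and the port names and VLAN numbers (9 for group A, 10 for group B) are arbitrary assumptions; it only prints the ovs-vsctl commands.

```shell
# Hypothetical helper: print commands that add access ports (given as
# port:vlan pairs) and one trunk port carrying every VLAN that was used.
gen_vlan_ports() {
    bridge=$1 trunk=$2
    shift 2
    vlans=
    for spec in "$@"; do
        port=${spec%%:*}
        vlan=${spec##*:}
        echo "ovs-vsctl add-port $bridge $port tag=$vlan"
        # Collect each VLAN number once for the trunk port.
        case ",$vlans," in
            *",$vlan,"*) ;;
            *) vlans=${vlans:+$vlans,}$vlan ;;
        esac
    done
    echo "ovs-vsctl add-port $bridge $trunk trunks=$vlans"
}

gen_vlan_ports br0 eth0 tap0:9 tap1:10
```

A port added with neither "tag" nor "trunks" also acts as a trunk port, carrying all VLANs.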
A: Many drivers in Linux kernels before version 3.3 had VLAN-related bugs. If you are having problems with VLANs that you suspect to be driver related, then you have several options:
-
Upgrade to Linux 3.3 or later.
-
Build and install a fixed version of the particular driver that is causing trouble, if one is available.
-
Use a NIC whose driver does not have VLAN problems.
-
Use "VLAN splinters", a feature in Open vSwitch 1.4 and later that works around bugs in kernel drivers. To enable VLAN splinters on interface eth0, use the command:
ovs-vsctl set interface eth0 other-config:enable-vlan-splinters=true
For VLAN splinters to be effective, Open vSwitch must know which VLANs are in use. See the "VLAN splinters" section in the Interface table in ovs-vswitchd.conf.db(5) for details on how Open vSwitch infers in-use VLANs.
VLAN splinters increase memory use and reduce performance, so use them only if needed.
-
Apply the "vlan workaround" patch from the XenServer kernel patch queue, build Open vSwitch against this patched kernel, and then use ovs-vlan-bug-workaround(8) to enable the VLAN workaround for each interface whose driver is buggy.
(This is a nontrivial exercise, so this option is included only for completeness.)
It is not always easy to tell whether a Linux kernel driver has buggy VLAN support. The ovs-vlan-test(8) and ovs-test(8) utilities can help you test. See their manpages for details. Of the two utilities, ovs-test(8) is newer and more thorough, but ovs-vlan-test(8) may be easier to use.
A: Do you have VLANs enabled on the physical switch that OVS is attached to? Make sure that the port is configured to trunk the VLAN or VLANs that you are using with OVS.
and to its destination host, but OVS seems to drop incoming return traffic.
A: It's possible that you have the VLAN configured on your physical switch as the "native" VLAN. In this mode, the switch treats incoming packets either tagged with the native VLAN or untagged as part of the native VLAN. It may also send outgoing packets in the native VLAN without a VLAN tag.
If this is the case, you have two choices:
-
Change the physical switch port configuration to tag packets it forwards to OVS with the native VLAN instead of forwarding them untagged.
-
Change the OVS configuration for the physical port to a native VLAN mode. For example, the following sets up a bridge with port eth0 in "native-tagged" mode in VLAN 9:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0 tag=9 vlan_mode=native-tagged
In this situation, "native-untagged" mode will probably work equally well. Refer to the documentation for the Port table in ovs-vswitchd.conf.db(5) for more information.
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 tap0 tag=9
ovs-vsctl add-port br0 tap1 tag=10
but the VMs can't access each other, the external network, or the
Internet.
A: It is to be expected that the VMs can't access each other. VLANs are a means to partition a network. When you configured tap0 and tap1 as access ports for different VLANs, you indicated that they should be isolated from each other.
As for the external network and the Internet, it seems likely that the machines you are trying to access are not on VLAN 9 (or 10) and that the Internet is not available on VLAN 9 (or 10).
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 tap0 tag=9
ovs-vsctl add-port br0 tap1 tag=9
The VMs can access each other, but not the external network or the
Internet.
A: It seems likely that the machines you are trying to access in the external network are not on VLAN 9 and that the Internet is not available on VLAN 9. Also, ensure VLAN 9 is set up as an allowed trunk VLAN on the upstream switch port to which eth0 is connected.
A: Yes. Use an "internal port" configured as an access port. For example, the following configures IP address 192.168.0.7 on VLAN 9. That is, OVS will forward packets from eth0 to 192.168.0.7 only if they have an 802.1Q header with VLAN 9. Conversely, traffic forwarded from 192.168.0.7 to eth0 will be tagged with an 802.1Q header with VLAN 9:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 vlan9 tag=9 -- set interface vlan9 type=internal
ifconfig vlan9 192.168.0.7
See also the following question.
this:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ifconfig br0 192.168.0.5
ovs-vsctl add-port br0 vlan9 tag=9 -- set interface vlan9 type=internal
ifconfig vlan9 192.168.0.9
but other hosts that are only on VLAN 0 can reach the IP address configured on VLAN 9. What's going on?
A: RFC 1122 section 3.3.4.2 "Multihoming Requirements" describes two approaches to IP address handling in Internet hosts:
-
In the "Strong ES Model", where an ES is a host ("End System"), an IP address is primarily associated with a particular interface. The host discards packets that arrive on interface A if they are destined for an IP address that is configured on interface B. The host never sends packets from interface A using a source address configured on interface B.
-
In the "Weak ES Model", an IP address is primarily associated with a host. The host accepts packets that arrive on any interface if they are destined for any of the host's IP addresses, even if the address is configured on some interface other than the one on which it arrived. The host does not restrict itself to sending packets from an IP address associated with the originating interface.
Linux uses the weak ES model. That means that when packets destined to the VLAN 9 IP address arrive on eth0 and are bridged to br0, the kernel IP stack accepts them there for the VLAN 9 IP address, even though they were not received on vlan9, the network device for VLAN 9.
To simulate the strong ES model on Linux, one may add iptables rules to filter packets based on source and destination address and adjust ARP configuration with sysctls.
BSD uses the strong ES model.
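As a rough illustration of simulating the strong ES model on Linux, the sketch below prints one possible set of commands for the vlan9 address from the question. gen_strong_es is a hypothetical helper, and the choice of sysctl values and of a single INPUT rule is an assumption about what is sufficient; treat it as a starting point and not as a tested recipe.

```shell
# Hypothetical helper: print commands that approximate the strong ES model
# for one address, accepting it only on the interface that owns it.
gen_strong_es() {
    iface=$1 addr=$2
    # Answer ARP only for addresses configured on the receiving interface.
    echo "sysctl -w net.ipv4.conf.all.arp_ignore=1"
    # Choose the ARP source address to suit the outgoing interface.
    echo "sysctl -w net.ipv4.conf.all.arp_announce=2"
    # Drop IP packets for the address that arrive on any other interface.
    echo "iptables -A INPUT -d $addr ! -i $iface -j DROP"
}

gen_strong_es vlan9 192.168.0.9
```

Note that the iptables rule as written also matches traffic the host sends to its own address over the loopback interface, so an earlier rule accepting traffic on "lo" may be needed.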
A: The configuration for VLANs in the Open vSwitch database (e.g. via ovs-vsctl) only affects traffic that goes through Open vSwitch's implementation of the OpenFlow "normal switching" action. By default, when Open vSwitch isn't connected to a controller and nothing has been manually configured in the flow table, all traffic goes through the "normal switching" action. But, if you set up OpenFlow flows on your own, through a controller or using ovs-ofctl or through other means, then you have to implement VLAN handling yourself.
You can use "normal switching" as a component of your OpenFlow actions, e.g. by putting "normal" into the lists of actions on ovs-ofctl or by outputting to OFPP_NORMAL from an OpenFlow controller. In situations where this is not suitable, you can implement VLAN handling yourself, e.g.:
-
If a packet comes in on an access port, and the flow table needs to send it out on a trunk port, then the flow can add the appropriate VLAN tag with the "mod_vlan_vid" action.
-
If a packet comes in on a trunk port, and the flow table needs to send it out on an access port, then the flow can strip the VLAN tag with the "strip_vlan" action.
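The two cases in the list above can be sketched as a pair of flows. gen_vlan_flows is a hypothetical helper that prints ovs-ofctl commands; the OpenFlow port numbers (2 for the access port, 1 for the trunk port) and VLAN 9 are assumptions chosen for illustration.

```shell
# Hypothetical helper: print flows that tag traffic from an access port
# toward a trunk port and strip the tag in the other direction.
gen_vlan_flows() {
    bridge=$1 access=$2 trunk=$3 vlan=$4
    # Access port to trunk port: add the VLAN tag.
    echo "ovs-ofctl add-flow $bridge in_port=$access,actions=mod_vlan_vid:$vlan,output:$trunk"
    # Trunk port to access port: match the VLAN, then strip the tag.
    echo "ovs-ofctl add-flow $bridge in_port=$trunk,dl_vlan=$vlan,actions=strip_vlan,output:$access"
}

gen_vlan_flows br0 2 1 9
```

These flows forward unconditionally between the two ports, so a real flow table would normally also match on destination addresses or use further tables.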
tags, like this:
ovs-vsctl add-br br0
ovs-vsctl set-controller br0 tcp:192.168.0.10:6653
ovs-vsctl add-port br0 eth0
ovs-vsctl add-port br0 tap0 tag=9
ovs-vsctl add-port br0 tap1 tag=10
but the VMs running behind tap0 and tap1 can still communicate, that is, they are not isolated from each other even though they are on different VLANs.
A: Do you have a controller configured on br0 (as the commands above do)? If so, then this is a variant on the previous question, "My OpenFlow controller doesn't see the VLANs that I expect," and you can refer to the answer there for more information.
A: Open vSwitch implements Independent VLAN Learning (IVL) for the OFPP_NORMAL action. That is, it logically has a separate learning table for each VLAN.
A: VXLAN stands for Virtual eXtensible Local Area Network, and is a means to solve the scaling challenges of VLAN networks in a multi-tenant environment. VXLAN is an overlay network which transports an L2 network over an existing L3 network. For more information on VXLAN, please see RFC 7348:
http://tools.ietf.org/html/rfc7348
A: Open vSwitch currently supports the framing format for packets on the wire. There is currently no support for the multicast aspects of VXLAN. To get around the lack of multicast support, it is possible to pre-provision MAC to IP address mappings either manually or from a controller.
use?
A: By default, Open vSwitch will use the assigned IANA port for VXLAN, which is 4789. However, it is possible to configure the destination UDP port manually on a per-VXLAN tunnel basis. An example of this configuration is provided below.
ovs-vsctl add-br br0
ovs-vsctl add-port br0 vxlan1 -- set interface vxlan1 \
type=vxlan options:remote_ip=192.168.1.2 options:key=flow \
options:dst_port=8472
A: The following table lists the versions of OpenFlow supported by each version of Open vSwitch:
Open vSwitch      OF1.0  OF1.1  OF1.2  OF1.3  OF1.4  OF1.5
===============   =====  =====  =====  =====  =====  =====
1.9 and earlier   yes    ---    ---    ---    ---    ---
1.10              yes    ---    [*]    [*]    ---    ---
1.11              yes    ---    [*]    [*]    ---    ---
2.0               yes    [*]    [*]    [*]    ---    ---
2.1               yes    [*]    [*]    [*]    ---    ---
2.2               yes    [*]    [*]    [*]    [%]    [*]
2.3               yes    yes    yes    yes    [*]    [*]
[*] Supported, with one or more missing features.
[%] Experimental, unsafe implementation.
Open vSwitch 2.3 enables OpenFlow 1.0, 1.1, 1.2, and 1.3 by default in ovs-vswitchd. In Open vSwitch 1.10 through 2.2, OpenFlow 1.1, 1.2, and 1.3 must be enabled manually in ovs-vswitchd. OpenFlow 1.4 and 1.5 are also supported, with missing features, in Open vSwitch 2.3 and later, but not enabled by default. In any case, the user may override the default:
-
To enable OpenFlow 1.0, 1.1, 1.2, and 1.3 on bridge br0:
ovs-vsctl set bridge br0 protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13
-
To enable OpenFlow 1.0, 1.1, 1.2, 1.3, 1.4, and 1.5 on bridge br0:
ovs-vsctl set bridge br0 protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15
-
To enable only OpenFlow 1.0 on bridge br0:
ovs-vsctl set bridge br0 protocols=OpenFlow10
All current versions of ovs-ofctl enable only OpenFlow 1.0 by default. Use the -O option to enable support for later versions of OpenFlow in ovs-ofctl. For example:
ovs-ofctl -O OpenFlow13 dump-flows br0
(Open vSwitch 2.2 had an experimental implementation of OpenFlow 1.4 that could cause crashes. We don't recommend enabling it.)
OPENFLOW-1.1+.md in the Open vSwitch source tree tracks support for OpenFlow 1.1 and later features. When support for OpenFlow 1.4 and 1.5 is solidly implemented, Open vSwitch will enable those versions by default.
A: Before version 1.11, Open vSwitch did not support MPLS. That is, these versions can match on MPLS Ethernet types, but they cannot match, push, or pop MPLS labels, nor can they look past MPLS labels into the encapsulated packet.
Open vSwitch versions 1.11, 2.0, and 2.1 have very minimal support for MPLS. With the userspace datapath only, these versions can match, push, or pop a single MPLS label, but they still cannot look past MPLS labels (even after popping them) into the encapsulated packet. Kernel datapath support is unchanged from earlier versions.
Open vSwitch version 2.3 can match, push, or pop a single MPLS label and look past the MPLS label into the encapsulated packet. Both userspace and kernel datapaths will be supported, but MPLS processing always happens in userspace either way, so kernel datapath performance will be disappointing.
Open vSwitch version 2.4 can match, push, or pop up to 3 MPLS labels and look past the MPLS label into the encapsulated packet. It will have kernel support for MPLS, yielding improved performance.
A: This is an Open vSwitch extension to OpenFlow error codes. Open vSwitch uses this extension when it must report an error to an OpenFlow controller but no standard OpenFlow error code is suitable.
Open vSwitch logs the errors that it sends to controllers, so the easiest thing to do is probably to look at the ovs-vswitchd log to find out what the error was.
If you want to dissect the extended error message yourself, the format is documented in include/openflow/nicira-ext.h in the Open vSwitch source distribution. The extended error codes are documented in lib/ofp-errors.h.
Q1: Some of the traffic that I'd expect my OpenFlow controller to see doesn't actually appear through the OpenFlow connection, even though I know that it's going through. Q2: Some of the OpenFlow flows that my controller sets up don't seem to apply to certain traffic, especially traffic between OVS and the controller itself.
A: By default, Open vSwitch assumes that OpenFlow controllers are connected "in-band", that is, that the controllers are actually part of the network that is being controlled. In in-band mode, Open vSwitch sets up special "hidden" flows to make sure that traffic can make it back and forth between OVS and the controllers. These hidden flows are higher priority than any flows that can be set up through OpenFlow, and they are not visible through normal OpenFlow flow table dumps.
Usually, the hidden flows are desirable and helpful, but occasionally they can cause unexpected behavior. You can view the full OpenFlow flow table, including hidden flows, on bridge br0 with the command:
ovs-appctl bridge/dump-flows br0
to help you debug. The hidden flows are those with priorities greater than 65535 (the maximum priority that can be set with OpenFlow).
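To pick out just the hidden flows from that output, the priority threshold can be applied with a small filter. This is a sketch that assumes each flow line in the dump carries a "priority=N" field; hidden_flows is a hypothetical helper name.

```shell
# Hypothetical helper: keep only lines whose priority exceeds 65535,
# the maximum priority that can be set with OpenFlow.
hidden_flows() {
    awk '{
        if (match($0, /priority=[0-9]+/)) {
            # Skip the 9 characters of "priority=" to get the number.
            prio = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (prio > 65535) print
        }
    }'
}

# Intended use on a host with Open vSwitch:
#   ovs-appctl bridge/dump-flows br0 | hidden_flows
```

Lines without a priority field are dropped by the filter, on the assumption that they are ordinary flows at the default priority.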
The DESIGN file at the top level of the Open vSwitch source distribution describes the in-band model in detail.
If your controllers are not actually in-band (e.g. they are on localhost via 127.0.0.1, or on a separate network), then you should configure your controllers in "out-of-band" mode. If you have one controller on bridge br0, then you can configure out-of-band mode on it with:
ovs-vsctl set controller br0 connection-mode=out-of-band
"ovs-appctl bridge/dump-flows" still shows some hidden flows.
A: You probably have a remote manager configured (e.g. with "ovs-vsctl set-manager"). By default, Open vSwitch assumes that managers need in-band rules set up on every bridge. You can disable these rules on bridge br0 with:
ovs-vsctl set bridge br0 other-config:disable-in-band=true
This actually disables in-band control entirely for the bridge, as if all the bridge's controllers were configured for out-of-band control.
A: See answer under "VLANs", above.
but I got a funny message like this:
ofp_util|INFO|normalization changed ofp_match, details:
ofp_util|INFO| pre: nw_dst=192.168.0.1
ofp_util|INFO|post:
and when I ran "ovs-ofctl dump-flows br0" I saw that my nw_dst match had disappeared, so that the flow ends up matching every packet.
A: The term "normalization" in the log message means that a flow cannot match on an L3 field without saying what L3 protocol is in use. The "ovs-ofctl" command above didn't specify an L3 protocol, so the L3 field match was dropped.
In this case, the L3 protocol could be IP or ARP. A correct command for each possibility is, respectively:
ovs-ofctl add-flow br0 ip,nw_dst=192.168.0.1,actions=drop
and
ovs-ofctl add-flow br0 arp,nw_dst=192.168.0.1,actions=drop
Similarly, a flow cannot match on an L4 field without saying what L4 protocol is in use. For example, the flow match "tp_src=1234" is, by itself, meaningless and will be ignored. Instead, to match TCP source port 1234, write "tcp,tp_src=1234", or to match UDP source port 1234, write "udp,tp_src=1234".
A: The OFPT_FEATURES_REQUEST message requests an OpenFlow switch to respond with an OFPT_FEATURES_REPLY that, among other information, includes a mapping between OpenFlow port names and numbers. From a command prompt, "ovs-ofctl show br0" makes such a request and prints the response for switch br0.
The Interface table in the Open vSwitch database also maps OpenFlow port names to numbers. To print the OpenFlow port number associated with interface eth0, run:
ovs-vsctl get Interface eth0 ofport
You can print the entire mapping with:
ovs-vsctl -- --columns=name,ofport list Interface
but the output mixes together interfaces from all bridges in the database, so it may be confusing if more than one bridge exists.
In the Open vSwitch database, ofport value -1 means that the interface could not be created due to an error. (The Open vSwitch log should indicate the reason.) ofport value [] (the empty set) means that the interface hasn't been created yet. The latter is normally an intermittent condition (unless ovs-vswitchd is not running).
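A script that reads the ofport column can distinguish these cases explicitly. The sketch below is illustrative only: describe_ofport is a hypothetical helper that maps a value, as returned by "ovs-vsctl get Interface eth0 ofport", to the meanings described above.

```shell
# Hypothetical helper: explain an ofport value from the Interface table.
describe_ofport() {
    case "$1" in
        -1)   echo "error: interface could not be created; check the ovs-vswitchd log" ;;
        "[]") echo "pending: interface has not been created yet" ;;
        *)    echo "OpenFlow port $1" ;;
    esac
}

# On a host with Open vSwitch:
#   describe_ofport "$(ovs-vsctl get Interface eth0 ofport)"
describe_ofport -1
describe_ofport "[]"
describe_ofport 5
```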
run "ovs-dpctl dump-flows" I don't see them.
A: ovs-dpctl queries a kernel datapath, not an OpenFlow switch. It won't display the information that you want. You want to use "ovs-ofctl dump-flows" instead.
as an individual OpenFlow port. Is that right?
A: Yes, Open vSwitch makes individual bond interfaces visible as OpenFlow ports, rather than the bond as a whole. The interfaces are treated together as a bond for only a few purposes:
-
Sending a packet to the OFPP_NORMAL port. (When an OpenFlow controller is not configured, this happens implicitly to every packet.)
-
Mirrors configured for output to a bonded port.
It would make a lot of sense for Open vSwitch to present a bond as a single OpenFlow port. If you want to contribute an implementation of such a feature, please bring it up on the Open vSwitch development mailing list at [email protected].
multiple hosts, and other components. The behavior isn't what I expect. Help!
A: To debug network behavior problems, trace the path of a packet, hop-by-hop, from its origin in one host to a remote host. If that's correct, then trace the path of the response packet back to the origin.
The open source tool called "plotnetcfg" can help to understand the relationship between the networking devices on a single host.
Usually a simple ICMP echo request and reply ("ping") packet is good enough. Start by initiating an ongoing "ping" from the origin host to a remote host. If you are tracking down a connectivity problem, the "ping" will not display any successful output, but packets are still being sent. (In this case the packets being sent are likely ARP rather than ICMP.)
Tools available for tracing include the following:
- "tcpdump" and "wireshark" for observing hops across network devices, such as Open vSwitch internal devices and physical wires.
- "ovs-appctl dpif/dump-flows" in Open vSwitch 1.10 and later or "ovs-dpctl dump-flows" in earlier versions. These tools allow one to observe the actions being taken on packets in ongoing flows. See ovs-vswitchd(8) for "ovs-appctl dpif/dump-flows" documentation, ovs-dpctl(8) for "ovs-dpctl dump-flows" documentation, and "Why are there so many different ways to dump flows?" above for some background.
- "ovs-appctl ofproto/trace" to observe the logic behind how ovs-vswitchd treats packets. See ovs-vswitchd(8) for documentation. You can find out more details about a given flow that "ovs-dpctl dump-flows" displays by cutting and pasting a flow from the output into an "ovs-appctl ofproto/trace" command.
- SPAN, RSPAN, and ERSPAN features of physical switches, to observe what goes on at these physical hops.
Starting at the origin of a given packet, observe the packet at each hop in turn. For example, in one plausible scenario, you might:
- "tcpdump" the "eth" interface through which an ARP egresses a VM, from inside the VM.
- "tcpdump" the "vif" or "tap" interface through which the ARP ingresses the host machine.
- Use "ovs-dpctl dump-flows" to spot the ARP flow and observe the host interface through which the ARP egresses the physical machine. You may need to use "ovs-dpctl show" to interpret the port numbers. If the output seems surprising, you can use "ovs-appctl ofproto/trace" to observe details of how ovs-vswitchd determined the actions in the "ovs-dpctl dump-flows" output.
- "tcpdump" the "eth" interface through which the ARP egresses the physical machine.
- "tcpdump" the "eth" interface through which the ARP ingresses the physical machine, at the remote host that receives the ARP.
- Use "ovs-dpctl dump-flows" to spot the ARP flow on the remote host that receives the ARP and observe the VM "vif" or "tap" interface to which the flow is directed. Again, "ovs-dpctl show" and "ovs-appctl ofproto/trace" might help.
- "tcpdump" the "vif" or "tap" interface to which the ARP is directed.
- "tcpdump" the "eth" interface through which the ARP ingresses a VM, from inside the VM.
It is likely that during one of these steps you will figure out the problem. If not, then follow the ARP reply back to the origin, in reverse.
A: To drop a packet is to receive it without forwarding it. OpenFlow explicitly specifies forwarding actions. Thus, a flow with an empty set of actions does not forward packets anywhere, causing them to be dropped. You can specify an empty set of actions with "actions=" on the ovs-ofctl command line. For example:
ovs-ofctl add-flow br0 priority=65535,actions=
would cause every packet entering switch br0 to be dropped.
You can write "drop" explicitly if you like. The effect is the same. Thus, the following command also causes every packet entering switch br0 to be dropped:
ovs-ofctl add-flow br0 priority=65535,actions=drop
"drop" is not an action, either in OpenFlow or Open vSwitch. Rather, it is only a way to say that there are no actions.
ovs-ofctl add-flow br0 in_port=2,actions=2
but OVS drops the packets instead.
A: Yes, OpenFlow requires a switch to ignore attempts to send a packet out its ingress port. The rationale is that dropping these packets makes it harder to loop the network. Sometimes this behavior can even be convenient, e.g. it is often the desired behavior in a flow that forwards a packet to several ports ("floods" the packet).
Sometimes one really needs to send a packet out its ingress port ("hairpin"). In this case, output to OFPP_IN_PORT, which in ovs-ofctl syntax is expressed as just "in_port", e.g.:
ovs-ofctl add-flow br0 in_port=2,actions=in_port
This also works in some circumstances where the flow doesn't match on the input port. For example, if you know that your switch has five ports numbered 2 through 6, then the following will send every received packet out every port, even its ingress port:
ovs-ofctl add-flow br0 actions=2,3,4,5,6,in_port
or, equivalently:
ovs-ofctl add-flow br0 actions=all,in_port
Sometimes, in complicated flow tables with multiple levels of "resubmit" actions, a flow needs to output to a particular port that may or may not be the ingress port. It's difficult to take advantage of OFPP_IN_PORT in this situation. To help, Open vSwitch provides, as an OpenFlow extension, the ability to modify the in_port field. Whatever value is currently in the in_port field is the port to which outputs will be dropped, as well as the destination for OFPP_IN_PORT. This means that the following will reliably output to port 2 or to ports 2 through 6, respectively:
ovs-ofctl add-flow br0 "in_port=2,actions=load:0->NXM_OF_IN_PORT[],2"
ovs-ofctl add-flow br0 "actions=load:0->NXM_OF_IN_PORT[],2,3,4,5,6"
If the input port is important, then one may save and restore it on the stack:
ovs-ofctl add-flow br0 "actions=push:NXM_OF_IN_PORT[],\
load:0->NXM_OF_IN_PORT[],\
2,3,4,5,6,\
pop:NXM_OF_IN_PORT[]"
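The rules above can be modeled in a few lines of Python. This is a toy model for illustration only, not Open vSwitch code: the action encoding ("push", "pop", a ("load", n) tuple, and the IN_PORT marker) is invented here, and plain integers stand for output to a numbered port.

```python
IN_PORT = "in_port"  # stand-in for OFPP_IN_PORT in this toy model


def apply_actions(in_port, actions):
    """Return the ports a packet is sent to, in order (toy model, not OVS code)."""
    outputs = []
    stack = []
    for action in actions:
        if action == IN_PORT:
            # Explicit OFPP_IN_PORT: always sent to the current in_port value.
            outputs.append(in_port)
        elif action == "push":
            stack.append(in_port)      # like push:NXM_OF_IN_PORT[]
        elif action == "pop":
            in_port = stack.pop()      # like pop:NXM_OF_IN_PORT[]
        elif isinstance(action, tuple):
            _, in_port = action        # ("load", n), like load:n->NXM_OF_IN_PORT[]
        elif action != in_port:
            outputs.append(action)     # ordinary output to a numbered port
        # Otherwise the output targets the current in_port and is dropped.
    return outputs


print(apply_actions(2, [2]))                          # []
print(apply_actions(2, [IN_PORT]))                    # [2]
print(apply_actions(3, [2, 3, 4, 5, 6, IN_PORT]))     # [2, 4, 5, 6, 3]
print(apply_actions(2, [("load", 0), 2]))             # [2]
print(apply_actions(4, ["push", ("load", 0), 2, 3, 4, 5, 6, "pop"]))  # [2, 3, 4, 5, 6]
```

The last call mirrors the push/load/pop flow: every numbered port receives the packet, and the original input port is back in place for any later actions.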
on port 2. I set up flows to forward only traffic destined to the other host and drop other traffic, like this:
priority=5,in_port=1,ip,nw_dst=192.168.0.2,actions=2
priority=5,in_port=2,ip,nw_dst=192.168.0.1,actions=1
priority=0,actions=drop
But it doesn't work--I don't get any connectivity when I do this. Why?
A: These flows drop the ARP packets that IP hosts use to establish IP connectivity over Ethernet. To solve the problem, add flows to allow ARP to pass between the hosts:
priority=5,in_port=1,arp,actions=2
priority=5,in_port=2,arp,actions=1
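To see why the ARP packets are lost, it can help to model the table lookup. The sketch below is a simplified stand-in for flow matching, with exact-match fields only and dictionary keys chosen for this example; it is not how ovs-vswitchd classifies packets internally.

```python
ETH_IP, ETH_ARP = 0x0800, 0x0806


def lookup(table, packet):
    """Return the actions of the highest-priority flow that matches the packet."""
    matching = [flow for flow in table
                if all(packet.get(k) == v for k, v in flow["match"].items())]
    return max(matching, key=lambda flow: flow["priority"])["actions"]


ip_only = [
    {"priority": 5, "match": {"in_port": 1, "dl_type": ETH_IP, "nw_dst": "192.168.0.2"}, "actions": [2]},
    {"priority": 5, "match": {"in_port": 2, "dl_type": ETH_IP, "nw_dst": "192.168.0.1"}, "actions": [1]},
    {"priority": 0, "match": {}, "actions": []},  # empty action list means drop
]
with_arp = ip_only + [
    {"priority": 5, "match": {"in_port": 1, "dl_type": ETH_ARP}, "actions": [2]},
    {"priority": 5, "match": {"in_port": 2, "dl_type": ETH_ARP}, "actions": [1]},
]

arp_request = {"in_port": 1, "dl_type": ETH_ARP}
ip_packet = {"in_port": 1, "dl_type": ETH_IP, "nw_dst": "192.168.0.2"}

print(lookup(ip_only, ip_packet))     # [2]
print(lookup(ip_only, arp_request))   # []  (falls through to the priority-0 drop)
print(lookup(with_arp, arp_request))  # [2]
```

With only the IP flows installed, the ARP request matches nothing but the catch-all drop, so the hosts never learn each other's Ethernet addresses and no IP packet is ever sent.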
This issue can manifest other ways, too. The following flows that match on Ethernet addresses instead of IP addresses will also drop ARP packets, because ARP requests are broadcast instead of being directed to a specific host:
priority=5,in_port=1,dl_dst=54:00:00:00:00:02,actions=2
priority=5,in_port=2,dl_dst=54:00:00:00:00:01,actions=1
priority=0,actions=drop
The solution already described above will also work in this case. It may be better to add flows to allow all multicast and broadcast traffic:
priority=5,in_port=1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00,actions=2
priority=5,in_port=2,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00,actions=1
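The "value/mask" form in these flows compares only the bits that are set in the mask. Here the mask selects the single group bit of the destination address, which is set in every multicast and broadcast address. A small Python sketch of that comparison (the helper names are made up for this example):

```python
def mac_to_int(mac):
    """Convert "aa:bb:cc:dd:ee:ff" to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def masked_match(addr, value, mask):
    """True if addr equals value on every bit that is set in mask."""
    m = mac_to_int(mask)
    return (mac_to_int(addr) & m) == (mac_to_int(value) & m)


GROUP_BIT = "01:00:00:00:00:00"

print(masked_match("ff:ff:ff:ff:ff:ff", GROUP_BIT, GROUP_BIT))  # True  (broadcast)
print(masked_match("01:00:5e:00:00:01", GROUP_BIT, GROUP_BIT))  # True  (multicast)
print(masked_match("54:00:00:00:00:02", GROUP_BIT, GROUP_BIT))  # False (unicast)
```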
A: Reconfiguring your bridge can change your bridge's datapath-id because Open vSwitch generates datapath-id from the MAC address of one of its ports. In that case, Open vSwitch disconnects from controllers because there's no graceful way to notify controllers about the change of datapath-id.
To avoid this behavior, configure datapath-id manually:
ovs-vsctl set bridge br0 other-config:datapath-id=0123456789abcdef
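If the datapath-id is set from a script, it may be worth checking the value before handing it to ovs-vsctl. The sketch below assumes the value must be exactly 16 hexadecimal digits and not all zeros; check ovs-vswitchd.conf.db(5) for the rules in your release. The function name is hypothetical, and the function only builds the argument list without running anything.

```python
import re


def datapath_id_args(bridge, dpid):
    """Build an ovs-vsctl argument list that pins a bridge's datapath-id."""
    # Assumed rule: exactly 16 hex digits, and not all zeros.
    if not re.fullmatch(r"[0-9a-fA-F]{16}", dpid) or int(dpid, 16) == 0:
        raise ValueError("datapath-id must be 16 hex digits and not all zeros")
    return ["ovs-vsctl", "set", "bridge", bridge,
            "other-config:datapath-id=" + dpid]


print(" ".join(datapath_id_args("br0", "0123456789abcdef")))
```

The returned list can be passed to subprocess.run() on a host where ovs-vsctl is installed.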
A: When a switch sends a packet to an OpenFlow controller using a "packet-in" message, it can also keep a copy of that packet in a "buffer", identified by a 32-bit integer "buffer_id". There are two advantages to buffering. First, when the controller wants to tell the switch to do something with the buffered packet (with a "packet-out" OpenFlow request), it does not need to send another copy of the packet back across the OpenFlow connection, which reduces the bandwidth cost of the connection and improves latency. This enables the second advantage: the switch can optionally send only the first part of the packet to the controller (assuming that the controller only needs to look at the first few bytes of the packet), further reducing bandwidth and improving latency.
However, buffering introduces some issues of its own. First, any switch has limited resources, so if the controller does not use a buffered packet, the switch has to decide how long to keep it buffered. When many packets are sent to a controller and buffered, Open vSwitch can discard buffered packets that the controller has not used after as little as 5 seconds. This means that controllers, if they make use of packet buffering, should use the buffered packets promptly. (This includes sending a "packet-out" with no actions if the controller does not want to do anything with a buffered packet, to clear the packet buffer and effectively "drop" its packet.)
Second, packet buffers are one-time-use, meaning that a controller cannot use a single packet buffer in two or more "packet-out" commands. Open vSwitch will respond with an error to the second and subsequent "packet-out"s in such a case.
Finally, a common error early in controller development is to try to use buffer_id 0 in a "packet-out" message as if 0 represented "no buffered packet". This is incorrect usage: the buffer_id with this meaning is actually 0xffffffff.
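For controller authors, a minimal sketch of building an OpenFlow 1.0 "packet-out" by hand may make the buffer_id convention concrete. It assumes the OpenFlow 1.0 wire layout (8-byte header, then buffer_id, in_port, actions_len, the actions, and optional packet data) and a single output action; the constant name OFP_NO_BUFFER is just a label used in this example for 0xffffffff.

```python
import struct

OFP_VERSION_1_0 = 0x01
OFPT_PACKET_OUT = 13          # message type in OpenFlow 1.0
OFP_NO_BUFFER = 0xffffffff    # "no buffered packet"; 0 is a real buffer ID
OFPP_NONE = 0xffff            # packet did not arrive on any switch port
OFPAT_OUTPUT = 0


def packet_out(xid, out_port, data=b"", buffer_id=OFP_NO_BUFFER):
    """Encode an OpenFlow 1.0 packet-out with one output action."""
    # ofp_action_output: type, len (8), port, max_len
    action = struct.pack("!HHHH", OFPAT_OUTPUT, 8, out_port, 0)
    if buffer_id != OFP_NO_BUFFER:
        data = b""  # the switch already holds the packet, so send no copy
    body = struct.pack("!LHH", buffer_id, OFPP_NONE, len(action)) + action + data
    header = struct.pack("!BBHL", OFP_VERSION_1_0, OFPT_PACKET_OUT,
                         8 + len(body), xid)
    return header + body


unbuffered = packet_out(xid=1, out_port=2, data=b"\x00" * 60)
buffered = packet_out(xid=2, out_port=2, buffer_id=0x100)
print(len(unbuffered), len(buffered))  # 84 24
```

The size difference between the two messages is the bandwidth saving described above: the buffered form carries no packet data at all.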
ovs-vswitchd(8) describes some details of Open vSwitch packet buffering that the OpenFlow specification requires implementations to document.
A: Add your new message to "enum ofpraw" and "enum ofptype" in lib/ofp-msgs.h, following the existing pattern. Then recompile and fix all of the new warnings, implementing new functionality for the new message as needed. (If you configure with --enable-Werror, as described in INSTALL.md, then it is impossible to miss any warnings.)
If you need to add an OpenFlow vendor extension message for a vendor that doesn't yet have any extension messages, then you will also need to edit build-aux/extract-ofp-msgs.
A: Add new members for your field to "struct flow" in lib/flow.h, and add new enumerations for your new field to "enum mf_field_id" in lib/meta-flow.h, following the existing pattern. Also, add support to miniflow_extract() in lib/flow.c for extracting your new field from a packet into struct miniflow. Then recompile and fix all of the new warnings, implementing new functionality for the new field or header as needed. (If you configure with --enable-Werror, as described in INSTALL.md, then it is impossible to miss any warnings.)
If you want kernel datapath support for your new field, you also need to modify the kernel module for the operating systems you are interested in. This isn't mandatory, since fields understood only by userspace work too (with a performance penalty), so it's reasonable to start development without it. If you implement kernel module support for Linux, then the Linux kernel "netdev" mailing list is the place to submit that support first; please read up on the Linux kernel development process separately. The Windows datapath kernel module support, on the other hand, is maintained within the OVS tree, so patches for that can go directly to ovs-dev.
A: Add your new action to "enum ofp_raw_action_type" in lib/ofp-actions.c, following the existing pattern. Then recompile and fix all of the new warnings, implementing new functionality for the new action as needed. (If you configure with --enable-Werror, as described in INSTALL.md, then it is impossible to miss any warnings.)
If you need to add an OpenFlow vendor extension action for a vendor that doesn't yet have any extension actions, then you will also need to edit build-aux/extract-ofp-actions.