This document provides a preliminary overview of the features offered by the SuzieQ analyzer.
Let's jump into what you can do with SuzieQ now.
We focus on the use of the analyzer via the CLI, suzieq-cli. You launch the CLI inside the docker image. You startup the docker image as described before, via the command:
docker run -it -v <parquet-out-local-dir>:/home/suzieq/parquet --name suzieq netenglabs/suzieq:latest
Once inside the Docker container, you can launch the CLI via the command suzieq-cli
.
There is a command prompt and at the bottom of the screen are some indicators of what SuzieQ is up to: The words at the bottom show specific choices that you can make to filter data.
A important concept in SuzieQ is the namespace. we were originally thinking of namespace as datacenter, but it can be whatever grouping of devices that you want. At this point SuzieQ cannot build a hierarchy of namespaces. We will add some way to group namespaces together in the future.
To get started, it's often a good idea to look at the help. Just type a '?' to get started: You see the commands that you can use with SuzieQ. For each command there are a small number of verbs and some filters that you can add. Most commands have at least the 'show', 'summary', and 'unique' verbs; some commands have more.
Let's look at some data. We'll start with the system data to get an overview of the devices that we have data for. This shows that we have data from 14 devices and some information about each of the systems. In this case, we have a single namespace, and it has some Cumulus Linux Routers and some Ubuntu hosts.
device show
Another example of the same command, with Arista EOS devices. This is also a single namespace, with routers and hosts.
Each command has completion to help you understand what you can do with the command. For instance, the 'interface' command shows that it has five verbs that you can use with it.
Let's look at BGP data. We'll use this to demonstrate some of the things you can do with SuzieQ.
We'll start with 'bgp show'. You can see that SuzieQ shows a row per connection of a device and it's peer. This command includes data such as the current state of the connection, asn, peer asn, and important statistics such as the number of prefixes received, how long the connection has been established, and the number of changes in the connection state.
bgp show
There's a lot of data there. One way to break out data for most commands is to use the 'unique' verb. Unique shows the unique items for a column, and a count of each. In this example, we just want to look at the ASNs. Let's look at what the ASNs are and how many there are of each. The 'unique' command requires that you use a 'columns=' filter to pick which of the columns in the data you want to examine.
bgp unique columns=asn
To get an overview of what is going on in a service, SuzieQ has the verb 'summarize'. To get an overview of BGP in your networks, SuzieQ has the command 'bgp summarize'.
bgp summarize
This can be a bit intimidating, as SuzieQ is summarizing a lot of information. As you can see, there is a column per namespace, though in this example we only have one namespace. We list the number of ASNs, peerAsns, etc. It shows the number of rows, which are the number of rows in the table if you had just done a 'bgp show'. For items that have a count of 3 or less, it shows the entries, and how many there are of each. In this example network, there are two VRFs, with 'default' having 32 rows and 'internet-vrf' having 4. If an item has has more than 3 that you'd like to examine, you can use the 'unique' verb as mentioned above. For instance, from summarize there are 9 ASNs. Using 'unique', as shown above, you can see each ASN and how many times it is being used.
Another interesting concept is shown in v4PfxRxStat, v6PfxRxStat, etc, which is if you see three numbers in a list, then SuzieQ is showing you min, max, and median. This is our best way to help you understand the distribution that you have.
Asserts are a very important part of SuzieQ and we have some interesting ones defined, but a lot more to add in the future. This shows our 'bgp assert'. This runs several different checks on the bgp data and shows which ones failed.
bgp assert
You might notice that there is a row entry per device and peer. There is then a lot of information that is needed to run the checks in the assert. Finally, at the end there is an assert column with pass or fail for each row, and if there is a fail, there is a list of reasons that the row failed. At the very end is a message if the Assert passed or Assert failed.
One of the nice things that we can do with SuzieQ is show all the different paths between two endpoints.
path show src='172.16.1.101' dest='172.16.4.104' namespace=dual-bgp
This is a little tricky to understand. In this example, there are eight different paths, represented by their pathid. For each pathid, we then show each hop. As you can see, we also show the mtu, and if it is an overlay.
Path does not yet work in all cases, but it works in most that we know of. Some EVPN cases do not work yet; underlays + overlays are tricky.
One quirk of the libraries that SuzieQ uses is that IP addresses must be put in quotes or they won't work.
You can also get a summary of the path between two IP addresses:
path summarize src='172.16.1.101' dest='172.16.4.104' namespace=dual-bgp
A quick peak at routes shows that there are 239 routes throughout the network.
We'd like to see the number of routes per device. That and other things can be understood from the summarize command. We can see that there are only 39 unique routes in the network.
Another nice attribute of SuzieQ is that you can do an LPM match and see the results from each device.
If you want to look at the database more directly, use the 'table' command. There isn't always a single connection between tables in the database and commands that are available. Also, we are collecting data in some tables, such as 'ifCounters', that do not yet have commands. We assume it's better to start getting data and then we can add useful analysis as we go along.
To see each table in the database and some statistics about each one:
table show
SuzieQ only saves data to the database if there have been changes to the data. So some tables will iterate often, and some will not.
A special table is called 'sqPoller', and we have access to that through the CLI. It records the work that SuzieQ poller is doing. At this point it's probably only useful for developers.
For each table, you can look at what is in the data and what are the columns that are displayed automatically. For instance, with BGP we collect a lot more data than we show by default.
You can always display more columns by adding the columns filter at the end of a show command. You can use 'columns=*' to get all the columns available for a command, but for bgp that is a lot!
You can directly call a SuzieQ command such as bgp summarize
directly from the CLI,
and check the output of the command.
Based on the command you run, SuzieQ will exit with interesting results. For instance, if you run an assert and it fails, you will get an exit code other than 0 (usually 255.)
As shown above, the SuzieQ cli works by commands (also called services), verbs, and filters. Some of the filters we've seen so far are 'columns' and 'namespace'. Different commands and verbs have different filters.
The standard filters that work on most command/verb pairs are:
Most of those are self-explanatory. The default for view is 'latest', which means the most current change. 'view=all' will provide the data across all time. 'view=changes' analyzes just the data that has changed.
'format=' allows you to produce data other than in a pandas dataframe. This is most useful at the command line to then read in the data from suzieq.
In the CLI you can set a specific context for filtering all the commands that you use. For instance, if you want to set a filter for the namespace that you will be investigating then you type
set namespace=dual-bgp
you can add or remove context filters as necessary.
By default when you use the CLI and you use a command, you will be using 'view=latest'. This is usually the most useful and what you expect to see. If you want to see all your data, add 'view=all'. You can also select specific start and end times that you are interested in.
SuzieQ also has the notion of seeing changes during a time window. If you specify a start-time and and end-time and don't specify 'view=all', then you will see only the changes during a time window.
The complicated non-docker way to install SuzieQ is to get the code from GitHub:
- git clone:
git clone [email protected]:netenglabs/suzieq.git
- SuzieQ assumes the use of python3.7 which may not be installed on your computer by default.
Ubuntu 18.04 ships with 3.6 as default, for example. Check your python version with
python3 --version
. If that is different from 3.7, you’ll need to add the python3.7 and 3.7 dev package. But, until we can build the different engines separately, we’re stuck with this requirement. - To install python3.7 on Ubuntu 18.04, please execute the following commands:
sudo add-apt-repository ppa:deadsnakes/ppa sudo apt install python3.7 python3.7-dev
- Install python3-pip if it has not been installed.
sudo apt install python3-pip
- From the suzieq directory, run pip3 on the requirements file:
pip3 install --user --disable-pip-version-check -r /requirements.txt
SuzieQ requires that you have a suzieq config file either in '/.suzieq/suzieq-cfg.yml' or '/.suzieq-cfg.yml'. It looks like:
data-directory: /home/jpiet/parquet-out
service-directory: /home/jpiet/suzieq/config
schema-directory: /home/jpiet/suzieq/config/schema
temp-directory: /tmp/suzieq
logging-level: WARNING
python3 suzieq/poller/sq-poller.py -D ~/dual-bgp
As demonstrated above in the README, SuzieQ has a powerful concept called Asserts. In an assert for a service some number of checks are made to ensure that the network is setup correctly. For each service that has an assert you get an output that shows all the data necessary for the checks, a pass/fail column, and a reason column for any failed checks.
Only Interface, BGP, OSPF and EVPN tables have predefined asserts defined. In each case, the check is performed for consistency across each end of a resource: for example between the two interface ends of a link, or the peers of a routing protocol. These asserts are a big aid in validating the network after using network automation tools such as Ansible to configure the network.
As mentioned before, if you run an assert from your shell and check the return code, a failure of the assert will be a non-zero value, usually 255.
The checks that are run in the BGP assert are:
- The BGP peers on both ends are configured with the expected ASN of the peer (we handle FRR’s remote-as external/internal model as well)
- The BGP peers on both ends are configured with the same set of AFI/SAFI
- For unestablished BGP sessions, we check if the reason is that either a peer is misconfigured, or if the expected peer is not even configured (we identify the owner of the peer IP address, and check if that peer has any failed session or not).
Dynamic BGP session mismatches are not yet handled.
Here an example result of bgp assert
command:
The following checks are performed as part of the evpnVni assert:
- If multicast is used for replication, the multicast group to VNI mapping is consistent throughout the network
- Any VTEP with a locally attached VNI is known to all other VTEPs that have the same VNI
- The local VXLAN interface(s) are up
- For FRR-based network OS, verify that the head-end replication entry is configured correctly
- For FRR-based network OS, verify that the Vxlan interface is part of the bridge
Currently, evpnVni assertion does not support a single Vxlan interface for all VNI model
Here an example result of evpnVni assert
command:
The following checks are performed on each appropriate pair of interfaces in the SuzieQ database:
- Match MTU: Verify that the MTU on both ends of the link are the same
- Match Speed: Verify that the speed on both ends of the link are the same
- Match portmode: Verify that the switchport mode (access, trunk, routed) on both ends of the link are the same
- Match PVID: Verify that the default VLAN on both ends of a link are the same. In case of access links, this is the only allowed VLAN on the link
- Match VLAN set: Verify that the set of active VLANs on both ends of a link are the same
- Match IP address: Verify that either the link is an unnumbered link (/32 or /128) or they both belong the same subnet
Here an example result of interface assert
command:
The following checks are performed as part of the OSPF assert:
- No duplicate router IDs configured across all OSPF routers in a namespace
- The peers of an OSPF interface are either unnumbered or belong to the same subnet
- The peers of an OSPF session have the same MTU
- The peers of an OSPF session have the same timers: hello and dead times
- The peers of an OSPF session have the same area
- The peers of an OSPF session have the same are stubbiness
- The peers of an OSPF session have the same network type
- The peers of an OSPF session belong to the same VRF
- Loopback interfaces are of the OSPF network type loopback
Here an example result of ospf assert
command:
For each table, there is a summarize
command.
This command gives an overview of the state of each namespace.
As the other commands, also on the summarize command it's possible to define a subset of the network to summarize. For example it is possible to get data of a subset of the namespaces, filter only a subset of the hosts etc.
In this example, it's shown the summary of spine nodes inside the dual-bgp
namespace.