
SuzieQ Analyzer

This document provides a preliminary overview of the features offered by the SuzieQ analyzer.

1. Analysis

Let's jump into what you can do with SuzieQ now.

We focus on using the analyzer via the CLI, suzieq-cli. You launch the CLI inside the Docker container. Start the container as described before, via the command:

docker run -it -v <parquet-out-local-dir>:/home/suzieq/parquet --name suzieq netenglabs/suzieq:latest

Once inside the Docker container, you can launch the CLI via the command suzieq-cli.

There is a command prompt, and at the bottom of the screen are some indicators of what SuzieQ is up to:

SuzieQ Start

The words at the bottom show specific choices that you can make to filter data.

An important concept in SuzieQ is the namespace. We originally thought of a namespace as a datacenter, but it can be whatever grouping of devices you want. At this point SuzieQ cannot build a hierarchy of namespaces. We will add some way to group namespaces together in the future.

To get started, it's often a good idea to look at the help. Just type a '?':

SuzieQ Help

You see the commands that you can use with SuzieQ. For each command there are a small number of verbs and some filters that you can add. Most commands have at least the 'show', 'summarize', and 'unique' verbs; some commands have more.

Let's look at some data. We'll start with the system data to get an overview of the devices that we have data for. This shows that we have data from 14 devices, along with some information about each of the systems. In this case, we have a single namespace, and it has some Cumulus Linux routers and some Ubuntu hosts.

device show

SuzieQ device show

Here is another example of the same command, with Arista EOS devices. This is also a single namespace, with routers and hosts.

SuzieQ_device_show_eos

Each command has completion to help you understand what you can do with the command. For instance, the 'interface' command shows that it has five verbs that you can use with it.

SuzieQ interface show help

1.1. BGP Analysis Demo

Let's look at BGP data. We'll use this to demonstrate some of the things you can do with SuzieQ.

We'll start with 'bgp show'. You can see that SuzieQ shows one row per connection between a device and its peer. This command includes data such as the current state of the connection, the ASN, the peer ASN, and important statistics such as the number of prefixes received, how long the connection has been established, and the number of changes in the connection state.

bgp show

SuzieQ bgp

There's a lot of data there. One way to break out data for most commands is to use the 'unique' verb. Unique shows the unique items for a column, and a count of each. In this example, we just want to look at the ASNs. Let's look at what the ASNs are and how many there are of each. The 'unique' command requires that you use a 'columns=' filter to pick which of the columns in the data you want to examine.

bgp unique columns=asn

SuzieQ unique ASNs
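As a rough illustration of what 'unique' computes, the sketch below counts the distinct values in one column of some rows. It is not SuzieQ's implementation: the `rows` data and the `unique_counts` helper are hypothetical stand-ins, written in plain Python.

```python
from collections import Counter

# Hypothetical rows standing in for 'bgp show' output; the values are made up.
rows = [
    {"hostname": "leaf01", "peer": "swp1", "asn": 65011},
    {"hostname": "leaf01", "peer": "swp2", "asn": 65011},
    {"hostname": "spine01", "peer": "swp1", "asn": 65000},
]


def unique_counts(rows, column):
    """Return (value, count) pairs for one column, least common first."""
    counts = Counter(row[column] for row in rows)
    # Sort by count, then by value, so the ordering is deterministic.
    return sorted(counts.items(), key=lambda item: (item[1], str(item[0])))


for value, count in unique_counts(rows, "asn"):
    print(value, count)
```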

To get an overview of what is going on in a service, SuzieQ has the verb 'summarize'. For example, 'bgp summarize' gives an overview of BGP in your networks.

bgp summarize

SuzieQ_BGP_summarize

This can be a bit intimidating, as SuzieQ is summarizing a lot of information. As you can see, there is a column per namespace, though in this example we only have one namespace. We list the number of ASNs, peerAsns, etc. It also shows the number of rows, that is, how many rows you would get from a plain 'bgp show'. For items that have a count of 3 or less, it shows the entries and how many there are of each. In this example network, there are two VRFs, with 'default' having 32 rows and 'internet-vrf' having 4. If an item has more than 3 entries and you'd like to examine them, you can use the 'unique' verb as mentioned above. For instance, summarize reports 9 ASNs. Using 'unique', as shown above, you can see each ASN and how many times it is being used.

Fields such as v4PfxRxStat and v6PfxRxStat illustrate another convention: when you see three numbers in a list, SuzieQ is showing you the min, max, and median. This is our best way to help you understand the distribution of the values.
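To make the three-number convention concrete, here is a minimal sketch that reduces a list of values to min, max, and median. The `distribution` helper and the sample values are assumptions for illustration, not SuzieQ code or real output.

```python
from statistics import median


def distribution(values):
    """Summarize a list of numbers as [min, max, median]."""
    return [min(values), max(values), median(values)]


# Made-up per-session counts of received prefixes.
prefixes_received = [2, 12, 12, 14, 14]
print(distribution(prefixes_received))
```

The median is reported rather than the mean because a few outliers, such as a session that has received almost no prefixes, do not drag it away from the typical value.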

Asserts are a very important part of SuzieQ and we have some interesting ones defined, but a lot more to add in the future. This shows our 'bgp assert'. This runs several different checks on the bgp data and shows which ones failed.

bgp assert

Suzie_bgp_assert

You might notice that there is one row per device and peer. Each row carries the information that is needed to run the checks in the assert. At the end of each row there is an assert column with pass or fail, and if there is a fail, a list of the reasons that the row failed. At the very end is a message stating whether the assert passed or failed.

1.2. Path Demo

One of the nice things that we can do with SuzieQ is show all the different paths between two endpoints.

path show src='172.16.1.101' dest='172.16.4.104' namespace=dual-bgp

SuzieQ_path_show

This is a little tricky to understand. In this example, there are eight different paths, each identified by its pathid. For each pathid, we show each hop. As you can see, we also show the MTU and whether the hop is an overlay.

Path does not yet work in all cases, but it works in most that we know of. Some EVPN cases do not work yet; underlays + overlays are tricky.

One quirk of the libraries that SuzieQ uses is that IP addresses must be put in quotes or they won't work.

You can also get a summary of the path between two IP addresses:

path summarize src='172.16.1.101' dest='172.16.4.104' namespace=dual-bgp

SuzieQ_path_summarize

1.3. Route Demo

A quick peek at routes shows that there are 239 routes throughout the network.

SuzieQ route show

We'd like to see the number of routes per device. That and other things can be understood from the summarize command. We can see that there are only 39 unique routes in the network.

SuzieQ_route_summarize

1.4. LPM Demo

Another nice attribute of SuzieQ is that you can do a longest prefix match (LPM) and see the results from each device.

SuzieQ_route_lpm
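For readers unfamiliar with LPM, the following sketch shows the idea using Python's standard ipaddress module: among all prefixes that contain the address, the one with the longest prefix length wins. The `longest_prefix_match` helper and the `routes_by_device` data are hypothetical and do not reflect how SuzieQ performs the lookup.

```python
import ipaddress


def longest_prefix_match(prefixes, address):
    """Return the most specific prefix containing address, or None."""
    addr = ipaddress.ip_address(address)
    matches = [
        net
        for net in (ipaddress.ip_network(p) for p in prefixes)
        if net.version == addr.version and addr in net
    ]
    if not matches:
        return None
    return str(max(matches, key=lambda net: net.prefixlen))


# Made-up routing tables, one list of prefixes per device.
routes_by_device = {
    "leaf01": ["0.0.0.0/0", "172.16.0.0/16", "172.16.4.0/24"],
    "spine01": ["0.0.0.0/0", "172.16.0.0/16"],
}

for device, prefixes in routes_by_device.items():
    print(device, longest_prefix_match(prefixes, "172.16.4.104"))
```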

1.5. Investigate SuzieQ Tables

If you want to look at the database more directly, use the 'table' command. There isn't always a one-to-one correspondence between the tables in the database and the commands that are available. Also, we are collecting data in some tables, such as 'ifCounters', that do not yet have commands. We assume it's better to start getting data and then add useful analysis as we go along.

To see each table in the database and some statistics about each one:

table show

Suzie Tables show

SuzieQ only saves data to the database if the data has changed. So some tables will be updated often, and some will not.
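The idea of writing only on change can be sketched as follows. This is a simplified, hypothetical model (an in-memory list of `(timestamp, state)` tuples and a `record_if_changed` helper), not a description of how the SuzieQ poller actually stores data.

```python
def record_if_changed(history, timestamp, state):
    """Append (timestamp, state) only if state differs from the last entry.

    Returns True when a new entry was written, False when it was skipped.
    """
    if history and history[-1][1] == state:
        return False
    history.append((timestamp, state))
    return True


# Made-up polling results for one BGP session.
history = []
for timestamp, state in [(1, "Established"), (2, "Established"), (3, "NotEstd")]:
    record_if_changed(history, timestamp, state)

print(history)
```

With this scheme, a stable table gains few entries over time while a flapping one gains many.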

A special table is called 'sqPoller', and we have access to that through the CLI. It records the work that SuzieQ poller is doing. At this point it's probably only useful for developers.

For each table, you can look at what is in the data and what are the columns that are displayed automatically. For instance, with BGP we collect a lot more data than we show by default.

SuzieQ tables describe bgp

You can always display more columns by adding the columns filter at the end of a show command. You can use 'columns=*' to get all the columns available for a command, but for bgp that is a lot!

1.6. Using SuzieQ on the command line

You can call a SuzieQ command such as bgp summarize directly from your shell and check the output of the command.

SuzieQ_bgp_summarize_command_line

Depending on the command you run, SuzieQ exits with a meaningful exit code. For instance, if you run an assert and it fails, you will get an exit code other than 0 (usually 255).

Suzie_bgp_assert_command_line
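A script can act on that exit code. The sketch below uses Python's standard subprocess module. Because the exact shell invocation of the assert is not spelled out here, the example runs a stand-in command that exits with 255, and the `run_check` helper is hypothetical. Substitute the command you actually use.

```python
import subprocess
import sys


def run_check(command):
    """Run a command and return (passed, exit_code)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.returncode


# Stand-in for a failing assert: a child process that exits with 255.
failing_command = [sys.executable, "-c", "import sys; sys.exit(255)"]

passed, code = run_check(failing_command)
if not passed:
    print(f"check failed with exit code {code}")
```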

1.7. Filtering

As shown above, the SuzieQ CLI works by commands (also called services), verbs, and filters. Some of the filters we've seen so far are 'columns' and 'namespace'. Different commands and verbs have different filters.

The standard filters that work on most command/verb pairs are:

  • hostname
  • start-time
  • end-time
  • view
  • namespace
  • format
  • columns

SuzieQ_device_show_completion

Most of those are self-explanatory. The default for view is 'latest', which means the most current change. 'view=all' will provide the data across all time. 'view=changes' analyzes just the data that has changed.

'format=' allows you to produce output in a form other than a pandas dataframe. This is most useful at the command line, when another program will read the data produced by SuzieQ.

1.8. Context Filtering

In the CLI you can set a specific context that filters all the commands that you use. For instance, to set a filter for the namespace that you will be investigating, type:

set namespace=dual-bgp

You can add or remove context filters as necessary.

1.9. How time works in SuzieQ

By default when you use the CLI and you use a command, you will be using 'view=latest'. This is usually the most useful and what you expect to see. If you want to see all your data, add 'view=all'. You can also select specific start and end times that you are interested in.
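The difference between 'view=latest' and 'view=all' can be pictured with a small model. In this hypothetical sketch, each record has a key, a timestamp, and a state; 'latest' keeps the most recent record per key, while 'all' returns every record in time order. The `select` helper and the record layout are assumptions for illustration only, and time-window selection is not modeled.

```python
def select(records, view="latest"):
    """Return records according to a simplified notion of 'view'."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    if view == "all":
        return ordered
    # 'latest': later records overwrite earlier ones with the same key.
    latest = {}
    for record in ordered:
        latest[record["key"]] = record
    return list(latest.values())


# Made-up state changes for two devices.
records = [
    {"key": "leaf01", "timestamp": 1, "state": "up"},
    {"key": "spine01", "timestamp": 2, "state": "up"},
    {"key": "leaf01", "timestamp": 5, "state": "down"},
]

print(select(records))              # one record per key
print(select(records, view="all"))  # the full history
```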

SuzieQ also has the notion of seeing changes during a time window. If you specify a start-time and an end-time and don't specify 'view=all', then you will see only the changes during that time window.

2.2. Installation with Pipenv

The complicated non-docker way to install SuzieQ is to get the code from GitHub:

  1. git clone: git clone git@github.com:netenglabs/suzieq.git
  2. SuzieQ assumes the use of python3.7, which may not be installed on your computer by default. Ubuntu 18.04 ships with 3.6 as default, for example. Check your Python version with python3 --version. If it is not 3.7, you’ll need to install the python3.7 and python3.7-dev packages. Until we can build the different engines separately, we’re stuck with this requirement.
  3. To install python3.7 on Ubuntu 18.04, please execute the following commands:
    sudo add-apt-repository ppa:deadsnakes/ppa
    sudo apt install python3.7 python3.7-dev
    
  4. Install python3-pip if it has not been installed.
    sudo apt install python3-pip
    
  5. From the suzieq directory, run pip3 on the requirements file:
pip3 install --user --disable-pip-version-check -r /requirements.txt

SuzieQ requires that you have a suzieq config file either in '/.suzieq/suzieq-cfg.yml' or '/.suzieq-cfg.yml'. It looks like:

data-directory: /home/jpiet/parquet-out
service-directory: /home/jpiet/suzieq/config
schema-directory: /home/jpiet/suzieq/config/schema
temp-directory: /tmp/suzieq
logging-level: WARNING

python3 suzieq/poller/sq-poller.py -D ~/dual-bgp

2. Asserts

As demonstrated above in the README, SuzieQ has a powerful concept called Asserts. An assert for a service runs a number of checks to ensure that the network is set up correctly. For each service that has an assert, you get an output that shows all the data necessary for the checks, a pass/fail column, and a reason column for any failed checks.

Only the Interface, BGP, OSPF and EVPN tables have predefined asserts. In each case, the check is performed for consistency across each end of a resource: for example, between the two interface ends of a link, or between the peers of a routing protocol. These asserts are a big aid in validating the network after using network automation tools such as Ansible to configure it.

As mentioned before, if you run an assert from your shell and check the return code, a failure of the assert will be a non-zero value, usually 255.

2.1. BGP Assert

The checks that are run in the BGP assert are:

  • The BGP peers on both ends are configured with the expected ASN of the peer (we handle FRR’s remote-as external/internal model as well)
  • The BGP peers on both ends are configured with the same set of AFI/SAFI
  • For unestablished BGP sessions, we check if the reason is that either a peer is misconfigured, or if the expected peer is not even configured (we identify the owner of the peer IP address, and check if that peer has any failed session or not).

Dynamic BGP session mismatches are not yet handled.

Here is an example result of the bgp assert command:

SuzieQ_bgp_assert
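The first check in the list above, that each side is configured with the ASN its peer actually uses, can be sketched like this. It is an illustrative simplification, not SuzieQ's assert: the `peerHostname` field and the `asn_mismatches` helper are hypothetical, the sample data is made up, and FRR's remote-as external/internal model is not handled.

```python
def asn_mismatches(sessions):
    """Return (hostname, peerHostname, reason) for sessions that fail."""
    # Index sessions by (local, remote) so the reverse direction is easy to find.
    index = {(s["hostname"], s["peerHostname"]): s for s in sessions}
    failures = []
    for s in sessions:
        reverse = index.get((s["peerHostname"], s["hostname"]))
        if reverse is None:
            failures.append((s["hostname"], s["peerHostname"], "peer session not found"))
        elif s["peerAsn"] != reverse["asn"]:
            failures.append((s["hostname"], s["peerHostname"], "peer ASN mismatch"))
    return failures


# Made-up sessions: leaf01 expects spine01 to be in AS 65001, but it is in 65000.
sessions = [
    {"hostname": "leaf01", "peerHostname": "spine01", "asn": 65011, "peerAsn": 65001},
    {"hostname": "spine01", "peerHostname": "leaf01", "asn": 65000, "peerAsn": 65011},
]

print(asn_mismatches(sessions))
```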

2.2. EvpnVni Assert

The following checks are performed as part of the evpnVni assert:

  • If multicast is used for replication, the multicast group to VNI mapping is consistent throughout the network
  • Any VTEP with a locally attached VNI is known to all other VTEPs that have the same VNI
  • The local VXLAN interface(s) are up
  • For FRR-based network OS, verify that the head-end replication entry is configured correctly
  • For FRR-based network OS, verify that the Vxlan interface is part of the bridge

Currently, the evpnVni assert does not support the model in which a single VXLAN interface is used for all VNIs.

Here is an example result of the evpnVni assert command:

SuzieQ_evpnvni_assert

2.3. Interface Assert

The following checks are performed on each appropriate pair of interfaces in the SuzieQ database:

  • Match MTU: Verify that the MTU is the same on both ends of the link
  • Match Speed: Verify that the speed is the same on both ends of the link
  • Match portmode: Verify that the switchport mode (access, trunk, routed) is the same on both ends of the link
  • Match PVID: Verify that the default VLAN is the same on both ends of the link. In the case of access links, this is the only allowed VLAN on the link
  • Match VLAN set: Verify that the set of active VLANs is the same on both ends of the link
  • Match IP address: Verify that either the link is an unnumbered link (/32 or /128) or both ends belong to the same subnet

Here is an example result of the interface assert command:

SuzieQ_interface_assert
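Three of the checks listed above (MTU, speed, and IP address) can be sketched with Python's standard ipaddress module. The `check_link` helper, the field names, and the sample values are hypothetical. The sketch only illustrates the comparison logic and is not SuzieQ's implementation.

```python
import ipaddress


def check_link(end_a, end_b):
    """Return a list of mismatch reasons for one link; empty means pass."""
    reasons = []
    if end_a["mtu"] != end_b["mtu"]:
        reasons.append("MTU mismatch")
    if end_a["speed"] != end_b["speed"]:
        reasons.append("Speed mismatch")

    if_a = ipaddress.ip_interface(end_a["ip"])
    if_b = ipaddress.ip_interface(end_b["ip"])
    # A /32 (IPv4) or /128 (IPv6) address on both ends is treated as unnumbered.
    unnumbered = all(
        i.network.prefixlen == i.network.max_prefixlen for i in (if_a, if_b)
    )
    if not unnumbered and if_a.network != if_b.network:
        reasons.append("Subnet mismatch")
    return reasons


# Made-up link ends: same /31 subnet and speed, different MTU.
leaf = {"mtu": 9216, "speed": 10000, "ip": "10.0.0.1/31"}
spine = {"mtu": 1500, "speed": 10000, "ip": "10.0.0.0/31"}
print(check_link(leaf, spine))
```

Returning the list of reasons, rather than a bare boolean, mirrors the pass/fail column plus reason column described for asserts.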

2.4. OSPF Assert

The following checks are performed as part of the OSPF assert:

  • No duplicate router IDs configured across all OSPF routers in a namespace
  • The peers of an OSPF interface are either unnumbered or belong to the same subnet
  • The peers of an OSPF session have the same MTU
  • The peers of an OSPF session have the same timers: hello and dead times
  • The peers of an OSPF session have the same area
  • The peers of an OSPF session have the same area stubbiness
  • The peers of an OSPF session have the same network type
  • The peers of an OSPF session belong to the same VRF
  • Loopback interfaces are of the OSPF network type loopback

Here is an example result of the ospf assert command:

SuzieQ_ospf_assert
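As an illustration of the first check above, the sketch below groups routers by namespace and router ID and reports any ID used by more than one host. The `duplicate_router_ids` helper and the field names are assumptions made for this example, not SuzieQ code.

```python
from collections import defaultdict


def duplicate_router_ids(routers):
    """Map (namespace, routerId) to the hostnames sharing it, duplicates only."""
    hosts_by_id = defaultdict(list)
    for router in routers:
        hosts_by_id[(router["namespace"], router["routerId"])].append(router["hostname"])
    return {key: sorted(hosts) for key, hosts in hosts_by_id.items() if len(hosts) > 1}


# Made-up routers: leaf01 and leaf02 were given the same router ID.
routers = [
    {"namespace": "ospf-single", "hostname": "leaf01", "routerId": "10.0.0.11"},
    {"namespace": "ospf-single", "hostname": "leaf02", "routerId": "10.0.0.11"},
    {"namespace": "ospf-single", "hostname": "spine01", "routerId": "10.0.0.21"},
]

print(duplicate_router_ids(routers))
```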

3. Summarize

For each table, there is a summarize command. This command gives an overview of the state of each namespace.

SuzieQ_interface_summarize

As with the other commands, you can restrict the summarize command to a subset of the network. For example, you can limit it to a subset of the namespaces or to a subset of the hosts.

SuzieQ_interface_filtered_summarize

This example shows the summary of the spine nodes inside the dual-bgp namespace.