
Parallel Libtrace HOWTO: Conceptual Overview

Richard Sanger edited this page Sep 29, 2015 · 8 revisions

This section aims to give the reader a good grasp of the underlying mechanics of parallel libtrace, so that they can understand where their code fits in and how it will be executed.

As a starting point, the diagram below shows how a conventional libtrace program is structured. An input (and optional output) trace is created, configured and then started. The core of the program is the loop that continuously reads and processes packets one at a time. Once the packet is processed, the results are updated and then the program moves on to the next available packet. When there are no more packets or the program is halted, the loop stops and the final results (if any) are reported before the cleanup phase.

Libtrace 3 workflow

By comparison, the workflow of the main function in a parallel libtrace program is shown below. The key difference is that the program is no longer responsible for reading packets; instead, the program defines a set of callback functions that are to be invoked by a thread in response to events that occur while it is running. Most callbacks are optional, but you will need to at least define a callback for when a packet is read by a thread (i.e. the equivalent of the processing function in a serial program). When the trace is started, the new threads are spawned and the main loop simply waits for all of the threads to complete before moving to the cleanup phase.

Libtrace 4 workflow
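The inversion of control described above can be sketched in plain C. This is a minimal model only and does not use libtrace's actual API: every name in it (`cb_packet_t`, `cb_set_t`, `cb_run`, `counter_start`, `counter_packet`) is hypothetical, and `cb_run` stands in for the library, which owns the read loop and calls back into user code. It runs on a single thread to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; these are NOT libtrace's types. */
typedef struct {
    uint32_t length;            /* size of the packet in bytes */
} cb_packet_t;

typedef struct {
    void *(*on_start)(void);                                 /* optional */
    void (*on_packet)(void *state, const cb_packet_t *pkt);  /* required */
    void (*on_stop)(void *state);                            /* optional */
} cb_set_t;

/* Stand-in for the library: it owns the loop and invokes the callbacks.
 * Returns -1 if the mandatory packet callback is missing, 0 otherwise. */
static int cb_run(const cb_set_t *cbs, const cb_packet_t *pkts, size_t n)
{
    if (cbs == NULL || cbs->on_packet == NULL)
        return -1;

    void *state = cbs->on_start ? cbs->on_start() : NULL;
    for (size_t i = 0; i < n; i++)
        cbs->on_packet(state, &pkts[i]);
    if (cbs->on_stop)
        cbs->on_stop(state);
    return 0;
}

/* User code: all the programmer writes is what happens on each event. */
typedef struct {
    uint64_t packets;
    uint64_t bytes;
} counter_t;

static counter_t counter_totals;

static void *counter_start(void)
{
    counter_totals.packets = 0;
    counter_totals.bytes = 0;
    return &counter_totals;
}

static void counter_packet(void *state, const cb_packet_t *pkt)
{
    counter_t *c = state;
    assert(c != NULL);          /* the start callback must have supplied state */
    c->packets++;
    c->bytes += pkt->length;
}
```

A caller would fill in a `cb_set_t` with `counter_start` and `counter_packet`, leave `on_stop` as `NULL`, and hand it to `cb_run` along with the packets. The user code never reads a packet itself, which is the key difference from the serial workflow.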

If the input trace format does not support native parallelism (for example, reading from a file), the behaviour of a parallel libtrace program will resemble the diagram below. The packets are read in serial from the input source and assigned to processing threads by the hasher thread. Each processing thread operates in parallel and any results are published to a combiner thread. The combiner thread merges the results (and orders them, if required) and pushes the results through to a single reporter thread. The reporter thread will examine the results and produce any desired output.

Libtrace 4 behaviour (serial format)
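To make the data flow concrete, here is a rough model of the hasher, processing, combiner and reporter stages in plain C. It is a sketch under several assumptions and not libtrace code: all names (`pipe_packet_t`, `pipe_hash`, `pipe_process`, `pipe_combine_next`, `pipe_run`) are hypothetical, the multiplicative hash is just one possible choice, and the stages run one after another on a single thread so that the result is deterministic. Real processing threads would run concurrently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIPE_THREADS 3   /* number of modelled processing threads */
#define PIPE_MAX 16      /* capacity of each per-thread result queue */

/* Hypothetical types for illustration; these are NOT libtrace's types. */
typedef struct {
    uint32_t flow_id;    /* identifies the flow the packet belongs to */
    uint64_t order;      /* position of the packet in the input */
    uint32_t length;     /* size of the packet in bytes */
} pipe_packet_t;

typedef struct {
    uint64_t key;        /* ordering key used by the combiner */
    uint32_t value;      /* the published result */
} pipe_result_t;

typedef struct {
    pipe_result_t items[PIPE_MAX];
    size_t count;        /* results published so far */
    size_t next;         /* next result the combiner will take */
} pipe_queue_t;

/* Hasher: packets from the same flow always go to the same thread. */
static size_t pipe_hash(const pipe_packet_t *pkt)
{
    return (size_t)((pkt->flow_id * 2654435761u) % PIPE_THREADS);
}

/* Processing thread: examine one packet and publish a result. */
static void pipe_process(pipe_queue_t *q, const pipe_packet_t *pkt)
{
    assert(q->count < PIPE_MAX);
    q->items[q->count].key = pkt->order;
    q->items[q->count].value = pkt->length;
    q->count++;
}

/* Ordered combiner: take the pending result with the smallest key.
 * Returns 1 and fills *out if a result was available, 0 otherwise. */
static int pipe_combine_next(pipe_queue_t queues[], pipe_result_t *out)
{
    pipe_queue_t *best = NULL;
    for (size_t t = 0; t < PIPE_THREADS; t++) {
        pipe_queue_t *q = &queues[t];
        if (q->next == q->count)
            continue;                       /* nothing pending here */
        if (best == NULL ||
            q->items[q->next].key < best->items[best->next].key)
            best = q;
    }
    if (best == NULL)
        return 0;
    *out = best->items[best->next++];
    return 1;
}

/* Whole pipeline; the "reporter" here just copies results into report[].
 * Assumes n <= PIPE_MAX. Returns the number of results reported. */
static size_t pipe_run(const pipe_packet_t *pkts, size_t n,
                       pipe_result_t *report)
{
    pipe_queue_t queues[PIPE_THREADS] = {0};
    for (size_t i = 0; i < n; i++)
        pipe_process(&queues[pipe_hash(&pkts[i])], &pkts[i]);

    size_t reported = 0;
    pipe_result_t r;
    while (pipe_combine_next(queues, &r))
        report[reported++] = r;
    return reported;
}
```

Because the packets are dispatched in input order, each per-thread queue is already sorted by key, so the combiner only ever needs to compare the head of each queue to restore the global order for the reporter.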

If the input trace format does support parallelism (for example, a NIC with multiple processing pipelines), the behaviour changes slightly to match the diagram below. There are now effectively multiple input sources, each served by a single processing thread. The hash function is used to assign packets to the appropriate source. Wherever possible, the hashing is done by the packet source itself (i.e. in hardware) for even better performance. The processing, combining and reporting threads behave the same as in the serial case.

Libtrace 4 behaviour (parallel format)

Implementations of common hasher and combiner functions are included with libtrace, so you will only need to write your own if you need something different. The main task of the libtrace programmer is to define the behaviour of the processing and reporter threads, i.e. how to extract the information you want from the packets and how to describe your results. The remainder of this HOWTO guide covers this in much more detail.

One important thing to realise is that libtrace will handle all of the inter-thread communication for you. You don't have to worry about how packets get from the hasher to the processing thread. You don't have to worry about making sure you have an exclusive lock on a packet or a result before looking at it. You don't have to manage any buffers or queues. Instead, you just focus on the analysis that you want to perform and let libtrace handle the sticky details of keeping everything thread-safe.

Having dealt with all of the conceptual background, let's start thinking about how we might write a parallel libtrace program.

Move on to the example scenario.
