
Parallel Libtrace HOWTO: Writing Main

Shane Alcock edited this page Sep 20, 2015 · 1 revision

OK, our callback functions are all written and ready to go. The final step in creating a working parallel libtrace program is to write our main function that will create, configure and finally start our input trace.

The following is the main function from our example application. For the sake of brevity, I've left out the callback set creation and configuration code from the previous example, but the main function is otherwise complete.

int main(int argc, char *argv[]) {
    libtrace_t *input;
    const char *uri = "ring:eth0";
    libtrace_callback_set_t *processing, *reporter;

    /* XXX Insert callback set creation here */

    /* Create the input trace object */
    input = trace_create(uri);
    if (trace_is_err(input)) {
        trace_perror(input, "Creating trace");
        return 1;
    }

    /* Setting some parallel-specific configuration options. We can
     * also use trace_config here to set options that were present
     * in libtrace3 (BPF filters, snap length etc).
     */

    /* Set the number of processing threads to use. */
    trace_set_perpkt_threads(input, 4);

    /* We don't care about the order of our results, so we can
     * use the unordered combiner.
     */
    trace_set_combiner(input, &combiner_unordered, 
            (libtrace_generic_t){0});

    /* Try to balance our load across all processing threads. If
     * we were doing flow analysis, we should use 
     * HASHER_BIDIRECTIONAL instead to ensure that all packets for
     * a given flow end up on the same processing thread.
     */
    trace_set_hasher(input, HASHER_BALANCE, NULL, NULL);

    /* Start the parallel trace using our callback sets. The NULL 
     * parameter here is where we can provide global data for the
     * input trace -- we don't need any in this example.
     */
    if (trace_pstart(input, NULL, processing, reporter)) {
        trace_perror(input, "Starting parallel trace");
        return 1;
    }

    /* This will wait for all the threads to complete */
    trace_join(input);

    /* Clean up everything that we've created */
    trace_destroy(input);
    trace_destroy_callback_set(processing);
    trace_destroy_callback_set(reporter);
    return 0;
}

Let's go over the key points from this main function in a bit more detail:

  • Developers familiar with libtrace3 should note that we no longer need to call trace_create_packet(). Packet creation is handled within libtrace itself now.
  • The trace_create() step is exactly the same as it is in libtrace3.
  • In this example, I've introduced three of the parallel-specific configuration options. There are several more, but these are likely to be the most commonly used ones. If you wish to set these options, you must do so after creating the input trace but before calling trace_pstart(). All options that were present in libtrace3 (e.g. BPF filters) will still work with parallel libtrace.
  • You can use trace_set_perpkt_threads() to control the number of processing threads used by libtrace. If not set, libtrace will create one thread for each core it detects on your system.
  • trace_set_combiner() tells libtrace which combiner to use to funnel results through to the reporter thread. It is important to choose the right combiner for your application; use an ordered combiner if your results have any sort of inherent ordering that matters to your application (e.g. timestamps). If order does not matter, an unordered combiner will be much faster. You can also write your own custom combiner if none of the built-in combiners is suitable. If you do not choose a combiner, the unordered combiner will be used.
  • trace_set_hasher() tells libtrace which hasher to apply to the packets when assigning them to processing threads. Libtrace includes three hashers: a balanced hasher, which tries to spread packets evenly across the threads; a unidirectional hasher, which ensures all packets for a unidirectional flow are assigned to the same thread; and a bidirectional hasher, which ensures all packets for a bidirectional flow (i.e. both directions of a conversation) are assigned to the same thread. If you are doing any flow-level analysis, you will want to use either the bidirectional or unidirectional hasher, depending on what you consider to be a flow. If you don't care about flows, the balanced hasher will work well and is the default. You can also provide your own hasher by setting the hasher type to HASHER_CUSTOM and passing a pointer to your hasher function and any data it requires as the third and fourth arguments, respectively.
  • The input trace is started by calling trace_pstart(). This function takes four parameters. The first is the trace to be started. The second is a pointer to any global data that should be made available to all of the callback functions as their 'global' parameter; this may be NULL if no global data is required. The third is a pointer to the callback set for the processing threads; this must not be NULL and it must have a packet callback defined. The fourth is a pointer to the callback set for the reporter thread; this can be NULL, in which case no reporter thread will be started. Once this function has returned successfully, the processing and reporter threads will be running, and packets will be read from the trace and dispatched to the processing threads by your hasher.
  • The final key element of our main function is the call to trace_join(). This halts execution of our main function until all of the processing and reporter threads have exited (otherwise, our program would just carry on and exit immediately!). If we are processing a trace file, this will happen when we reach the end of the file and all results have been read by the reporter. If we are reading from a live source, this will only happen in the event of an error or a user interrupt; consider adding a signal handler for SIGINT and SIGTERM so that you can exit gracefully if the user wishes to interrupt your program.
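
To make the difference between the unidirectional and bidirectional hashers more concrete, here is a small standalone sketch. It is not how libtrace implements its built-in hashers; the flow_key struct, the mixing function and all of the function names are hypothetical and exist only for illustration. A real HASHER_CUSTOM function would work on a libtrace packet rather than a pre-built key, so check libtrace_parallel.h for the exact prototype it expects.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key; a real hasher would extract these fields
 * from the packet headers itself. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t proto;
};

/* FNV-style mixing step: fold one value into the running hash. */
static uint64_t mix(uint64_t hash, uint64_t value) {
    return (hash ^ value) * 0x100000001b3ULL;
}

/* Direction-sensitive: the source endpoint is always mixed in first,
 * so A->B and B->A are treated as two separate flows. */
static uint64_t hash_unidirectional(const struct flow_key *k) {
    uint64_t h = 0xcbf29ce484222325ULL;
    h = mix(h, ((uint64_t)k->src_ip << 16) | k->src_port);
    h = mix(h, ((uint64_t)k->dst_ip << 16) | k->dst_port);
    return mix(h, k->proto);
}

/* Direction-agnostic: order the two endpoints before mixing, so both
 * directions of a conversation produce the same hash. */
static uint64_t hash_bidirectional(const struct flow_key *k) {
    uint64_t a = ((uint64_t)k->src_ip << 16) | k->src_port;
    uint64_t b = ((uint64_t)k->dst_ip << 16) | k->dst_port;
    uint64_t lo = a < b ? a : b;
    uint64_t hi = a < b ? b : a;
    uint64_t h = 0xcbf29ce484222325ULL;
    h = mix(h, lo);
    h = mix(h, hi);
    return mix(h, k->proto);
}

/* Map a hash value onto one of nthreads processing threads. */
static unsigned pick_thread(uint64_t hash, unsigned nthreads) {
    assert(nthreads > 0);
    return (unsigned)(hash % nthreads);
}
```

With four processing threads, pick_thread(hash_bidirectional(&key), 4) gives the same thread index for a request packet and its reply, which is the property you need for flow-level analysis. The balanced hasher makes no such promise; it only tries to keep the threads evenly loaded.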

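The signal handling suggested for live sources could be set up roughly as follows. This is a minimal sketch using only POSIX calls; the names stop_requested, handle_stop_signal and install_stop_handlers are my own and are not part of libtrace. In a real program the handler is also where you would ask libtrace to stop the trace so that trace_join() can return. I believe parallel libtrace provides trace_pstop() for this, but treat that as an assumption and check libtrace_parallel.h for your version.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t stop_requested = 0;

/* Runs on SIGINT or SIGTERM. Keep the work in here minimal. */
static void handle_stop_signal(int signum) {
    (void)signum;
    stop_requested = 1;
    /* Assumption: this is where a libtrace program would ask the
     * input trace to stop, e.g. by calling trace_pstop() on it. */
}

/* Install the same handler for SIGINT and SIGTERM.
 * Returns 0 on success, -1 if either sigaction() call fails. */
static int install_stop_handlers(void) {
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_stop_signal;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;
    if (sigaction(SIGTERM, &sa, NULL) == -1)
        return -1;
    return 0;
}
```

You would call install_stop_handlers() in main before trace_pstart(), so that an interrupt arriving while the trace is running is caught rather than killing the program before the clean-up code has run.
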
At this point, we have a working parallel libtrace program. Let's move on to some slightly more advanced topics, starting with using ticks to keep our results synchronised.
