Pattern Matching Engines

A standalone Aho-Corasick-based pattern matching system, including compressed and non-compressed Aho-Corasick implementations and the MCA^2 system. This repository is based on the following papers:

Pattern matching code, Aho-Corasick implementation, DFA compression:

Anat Bremler-Barr, Yotam Harchol, David Hay: Space-time tradeoffs in software-based Deep Packet Inspection. HPSR 2011: 1-8

Smart multicore environment with heavy packet detection and offloading:

Yehuda Afek, Anat Bremler-Barr, Yotam Harchol, David Hay, Yaron Koral: MCA2: multi-core architecture for mitigating complexity attacks. ANCS 2012: 235-246

Usage:

Input files

You need two input files to run the non-compressed AC:

  • Patterns file
  • Trace file

(Unfortunately, both files must be in a custom binary format that was defined for this project back then.)

Patterns

There is a ready-to-use patterns file in the ZIP, named SnortPatternsFull2.bin. It contains about 6K patterns taken from Snort at some point in time. If that is enough for you, use it. Otherwise, use the utils/SnortConverter Java utility, which reads patterns in ASCII format (where non-ASCII binary values are written in |XX| format, as in Snort) and creates a file in the custom binary format.
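To make the |XX| notation concrete, here is a minimal Python sketch that decodes one Snort-style pattern string into raw bytes. It is an illustration only, not the logic of utils/SnortConverter: the function name `decode_snort_pattern` is hypothetical, escape sequences such as `\|` are not handled, and the custom binary output format is not covered because this README does not describe it.

```python
def decode_snort_pattern(text: str) -> bytes:
    """Decode literal ASCII mixed with |XX XX| hex sections into bytes.

    Hypothetical helper for illustration; not part of this repository.
    """
    out = bytearray()
    in_hex = False  # sections alternate: literal, hex, literal, ...
    for chunk in text.split("|"):
        if in_hex:
            # bytes.fromhex skips the spaces between hex byte pairs
            out += bytes.fromhex(chunk)
        else:
            out += chunk.encode("ascii")
        in_hex = not in_hex
    # Balanced pipes give an odd number of chunks, so the flag ends up True
    if not in_hex:
        raise ValueError("unterminated |...| hex section")
    return bytes(out)
```

For example, `decode_snort_pattern("GET |0D 0A|")` yields the four characters `GET ` followed by a carriage return and a line feed.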

Trace

There is an ugly (but correct) way to convert a PCAP file to this format, using the Java code in utils/DumpConverter. Compile DumpConverter.java and run it with no arguments to print the instructions (in short: use tcpdump to write the pcap as hex, then run this utility to convert the hex to binary).
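The hex-to-binary step can be pictured with the following Python sketch, which turns the text printed by `tcpdump -x` back into raw packet bytes. It assumes the usual `-x` layout (an unindented header line per packet, followed by indented lines of the form `0x0000:  4500 003c ...`). The function name is hypothetical, and the sketch stops at raw bytes: it does not produce the trace format that DumpConverter writes, which this README does not document.

```python
import re

# An indented offset such as "0x0010:" followed by groups of 2 or 4 hex digits
HEX_LINE = re.compile(r"^\s+0x[0-9a-fA-F]{4}:\s+((?:[0-9a-fA-F]{2,4} ?)+)$")

def packets_from_tcpdump_hex(lines):
    """Collect the bytes of each packet from `tcpdump -x` style text lines.

    Hypothetical helper for illustration; not part of this repository.
    """
    packets = []
    current = None
    for line in lines:
        match = HEX_LINE.match(line.rstrip())
        if match:
            if current is None:
                current = bytearray()
            current += bytes.fromhex(match.group(1))
        else:
            # Any other line (e.g. a packet header) ends the current packet
            if current:
                packets.append(bytes(current))
            current = None
    if current:
        packets.append(bytes(current))
    return packets
```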

Running Pattern Matching (HPSR'11)

To run the executable (say it is called main), you need to specify some arguments:

-t will time the run and show throughput
-m:X will use X threads for DPI
-a:path will read patterns from the given path and build a non-compressed AC DFA to scan with
-s:path will scan the trace given in the path
-c:path will read patterns and create a compressed automaton from them. It should be used with -l:0 -b:0 -d:prefix_path where prefix_path is a path to some file prefix that will be used to create several files that together represent the compressed automaton.
(you cannot use -c with -a or with -s)
-r:prefix_path will read a compressed automaton to scan with, to use with -s

Example run of non-compressed AC DFA:

./main -t -m:1 -a:SnortPatternsFull2.bin -s:my_trace.bin
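For readers unfamiliar with what a non-compressed AC DFA is, the following Python sketch builds a textbook Aho-Corasick automaton with a full 256-entry transition table per state and scans a buffer with it. It is a generic illustration of the technique, not a port of the C code in this repository; the function names are made up for the example, and patterns are assumed to be non-empty byte strings.

```python
from collections import deque

def build_ac_dfa(patterns):
    """Return (delta, out): delta[s][b] is the next state, out[s] the matches at s."""
    goto, out = [{}], [[]]
    for p in patterns:                      # 1. build the trie
        s = 0
        for b in p:
            if b not in goto[s]:
                goto.append({})
                out.append([])
                goto[s][b] = len(goto) - 1
            s = goto[s][b]
        out[s].append(p)
    delta = [[0] * 256 for _ in goto]       # full table: one row per state
    fail = [0] * len(goto)
    queue = deque()
    for b, t in goto[0].items():
        delta[0][b] = t
        queue.append(t)
    while queue:                            # 2. BFS fills failure links and rows
        s = queue.popleft()
        out[s] = out[s] + out[fail[s]]      # inherit matches ending at the suffix state
        for b in range(256):
            t = goto[s].get(b)
            if t is None:
                delta[s][b] = delta[fail[s]][b]
            else:
                delta[s][b] = t
                fail[t] = delta[fail[s]][b]
                queue.append(t)
    return delta, out

def scan(delta, out, data):
    """Return (start_offset, pattern) for every match in data."""
    s, matches = 0, []
    for i, b in enumerate(data):
        s = delta[s][b]                     # exactly one table lookup per input byte
        for p in out[s]:
            matches.append((i - len(p) + 1, p))
    return matches
```

The full table costs 256 entries per state, which is the memory price paid for one lookup per byte; a compressed automaton trades some of that speed for a smaller footprint.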

Example run of compressed AC automaton:

Step 1: create the compressed automaton and save to disk

./main -c:SnortPatternsFull2.bin -d:snort_compressed

(this creates three files: snort_compressed.lookup, snort_compressed.patterns, snort_compressed.states)

Step 2: run the compressed AC automaton on a trace:

./main -t -m:1 -r:snort_compressed -s:my_trace.bin
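If you script these runs, a small helper can assemble the argument lists. The Python sketch below only builds the command lines described above; the function names are hypothetical, the binary name `./main` is the placeholder used in this README, and `-l:0 -b:0` are added to the compression step because the argument list says `-c` should be used with them (the Step 1 example omits them). It covers the HPSR'11 mode only, where a scan uses either `-a` or `-r`.

```python
def compress_cmd(patterns_path, prefix, binary="./main"):
    """Arguments for step 1: build a compressed automaton and save it to disk."""
    return [binary, f"-c:{patterns_path}", "-l:0", "-b:0", f"-d:{prefix}"]

def compressed_files(prefix):
    """The three files that step 1 is documented to create."""
    return [f"{prefix}.{ext}" for ext in ("lookup", "patterns", "states")]

def scan_cmd(trace_path, patterns_path=None, compressed_prefix=None,
             threads=1, timed=True, binary="./main"):
    """Arguments for a scan with either -a (full DFA) or -r (compressed)."""
    if (patterns_path is None) == (compressed_prefix is None):
        raise ValueError("give exactly one of patterns_path or compressed_prefix")
    args = [binary]
    if timed:
        args.append("-t")
    args.append(f"-m:{threads}")
    if patterns_path is not None:
        args.append(f"-a:{patterns_path}")
    else:
        args.append(f"-r:{compressed_prefix}")
    args.append(f"-s:{trace_path}")
    return args
```

The resulting lists can be passed directly to `subprocess.run`.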

Running MCA^2 (ANCS'12)

To run the system in MCA^2 mode (multithreaded, with heavy packet isolation and transfer), uncomment the HYBRID_SCANNER flag in Common/Flags.h, compile, and run with both -a and -r, along with -s. You should also specify the additional parameters listed in the usage string that is printed when running with no parameters.
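The idea of heavy packet isolation can be pictured with a toy Python sketch. This README does not describe the detection rule, so everything below is an assumption made for illustration: a packet is treated as heavy when the fraction of automaton states it visits outside a given set of common states exceeds a threshold. The function names, the `common_states` set, and the default threshold are all hypothetical; the ANCS'12 paper and the usage string are the authority on what the system actually does.

```python
def uncommon_ratio(visited_states, common_states):
    """Fraction of visited automaton states that fall outside the common set."""
    if not visited_states:
        return 0.0
    misses = sum(1 for s in visited_states if s not in common_states)
    return misses / len(visited_states)

def route_packets(traces, common_states, threshold=0.5):
    """Split packet ids into (regular, heavy) lists.

    traces maps a packet id to the list of states visited while scanning it.
    The threshold value is an arbitrary placeholder, not a tuned parameter.
    """
    regular, heavy = [], []
    for packet_id, visited in traces.items():
        if uncommon_ratio(visited, common_states) > threshold:
            heavy.append(packet_id)    # would be handed to a dedicated thread
        else:
            regular.append(packet_id)  # stays on the regular scanning path
    return regular, heavy
```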

For questions, contact yotamhc
