RAPID is a lightweight framework for implementing dynamic race detection engines.
For the sake of simplicity, RAPID does not support its own logging mehanism but can leverage instrumentation of external tools --- RVPredict or RoadRunner --- for logging program executions (traces). On top of this, RAPID provides functionality for reading these traces. RAPID also supports two more trace formats: CSV and STD, described below.
RAPID supports four trace formats: STD, RoadRunner, RVPredict and CSV. Traces can be read in all the three formats, and can be converted from any of the three formats to either CSV or STD format. CSV format is close to being obsolete and STD serves as a better version than CSV.
Recall that a trace is a sequence of events, each of which can be seen as a record e = <t, op(decor), loc>
, where
t
is the identifier of the thread that performse
op(decor)
represents the operation alongwith the operand.op
can be one ofr
/w
(read from/ write to a memory location),acq
/rel
(acquire/release of a lock object),fork
/join
(fork or join of a thread). The fielddecor
is the corresponding memory location whenop
isr
/w
, lock identifier whenop
isacq
/rel
and thread identifer whenop
isfork
/join
.loc
represents the program location corresponding to this evente
.
The STD file format essentially represents a trace as a sequence of records, where the above three fields are separated by a |
. A sample STD file looks something like:
T0|r(V123)|345
T1|w(V234.23[0])|456
T0|fork(T2)|123
T2|acq(L34)|120
T2|rel(L34)|130
T0|join(T2)|134
which represents a trace containing 6 events, performed by 3 different threads (T0
,T1
,T2
). The 1st event is a read event on memory location V123
performed by thread T0
. The 2nd event is a write on memory location V234.23[0]
by thread T1
.
The 3rd event is performed by thread T0
and forks thread T2
.
The 4th event is an acquire event of lock L34
in thread T2
.
The 5th event release lock L34
in thread T2
.
The 6th event joins thread T2
in the parent thread T0
.
See here.
RVPredict is a commercial product of Runtime Verification Inc, and performs an SMT based analysis for predicting data races dynamically. RVPredict instruments Java bytecode and is able to perform dynamic analysis on the traces, at runtime. Additionally, it also provides the facility to log the trace generated to perform offline analysis on the logged traces. The developers of RVPredict have kindly shared with us (developers of RAPID) code snippets used to decode these compressed binary traces. As a result, RAPID can analyze traces generated by RVPredict’s logging feature. Thanks to the execellent features provided by RVPredict, the development of our dynamic analysis engines could be carried on without worrying about the details of instrumentation, which can be really cumbersome. Besides, this also allowed us to compare against RVPRedict's performance.
Download :
The academic version of RVPredict can be downloaded here here.
The version of RVPredict that allows us to log traces can be found here.
Generating RVPredict Logs : After obtaining a licensed copy of RVPredict, one can execute the following command to generate log files :
$ /path/to/rv-predict --log --base-log-dir /some/directory --log-dirname somename -cp /path/to/jar main.Class
where
/path/to/rv-predict
is the path to the script provided by RVPredict tool, often found in the folderRV-Predict/bin
/some/directory/somename
specifies the path where the log files will be storedmain.Class
specifies the path to the class that implements themain
function-cp /path/to/jar
is an optional parameter to specify the classpath to locate the classmain.Class
Executing the above command produces several log files with extension .bin
in the directory /some/directory/somename
. Each such .bin
file corresponds to the projection of the actual execution onto some specific thread. In addition, the directory also contains, amongst other files, the file metadata.bin
. This file contains information on how to map events in the trace logs to actual program locations and variables in the source code.
Alternatively, RVPredict can also be run as a java-agent in the following manner :
java -cp /path/to/jar -javaagent:/path/to/rv-predict.jar="--log --base-log-dir /some/directory --log-dirname somename" main.Class
Extracting metadata
The contents of the file metadata.bin
can be extracted as follows :
java -cp /path/to/rv-predict.jar com.runtimeverification.rvpredict.util.ReadLogFile /path/to/metadata.bin
Since the source code for RVPredict is proprietary. The code snippets shared by RVPredict developers have been obscured and provided as external jar libraries.
More documentation about RV-Predict is available here.
A CSV file represents a valid trace if
- the number of lines in the file is equal to the number of events in the trace,
- each of the lines has the same number of columns (= number of commas + 1), which is also the number of threads in the trace.
- in each line, exactly one column is populated, and is of the form
op(decor)
, similar to the middle field in the STD format. The index of the populated column corresponds to a unique thread.
The CSV equivalent of the above trace is givn below :
r(V123),,
,w(V234.23[0]),
fork(T2),,
,,acq(L34)
..rel(L34)
join(T2),,
RAPID has been designed to enable researchers to quickly implement and evaluate their dynamic analysis techniques. As a result, the architecture of RAPID has been kept at bare-bones minimum, so that it is simple and easy to understand. So far, the focus has not been on performance optimality.
Engine - The main interesting class that RAPID implements is the Engine
class.
Engine
is an abstract class, and can be inherited to implement different dynamic analysis engines.
The current distribution includes the following race detection engines :
- DJIT+ style HB race detection (see
engine.racedetectionengine.hb
) - HB race detection with FastTrack style epoch optimizations (see
engine.racedetectionengine.hb_epoch
) - HB race detection using Goldilocks algorithm (see
engine.racedetectionengine.goldilocks
) - Eraser's algorithm for violation of lockset principle (see
engine.racedetectionengine.lockset
) - DJIT+ style HB race detection which forces an order between each race detected (see
engine.racedetectionengine.fhb
) - Race Detection using the SHB partial order (see
engine.racedetectionengine.shb
) - Race detection using the SHB partial order with epoch optimization (see
engine.racedetectionengine.shb_epoch
) - Race detection using WCP partial order (
engine.racedetectionengine.wcp
) - Race detection for Sync-Preserving Races (
engine.racedetectionengine.syncpreserving
) - Velodrome algorithm for detecting atomicity violations (see
engine.atomicity.conflictserializability.velodrome
) - Aerodrome algorithm for detecting atomicity violations (see
engine.atomicity.conflictserializability.aerodrome_basic
for the basic algorithm andengine.atomicity.conflictserializability.aerodrome
for Aerodrome with optimizations)
The above packages also give a fair idea on how to write your favorite dynamic analysis engine in RAPID.
Generic Utilities - For implementing dynamic analysis engines, one might use sophisticated data structures, such as vector clocks (VC), adaptive VCs (epoch), etc. We have implemented popular ones in the package util
, and these are being used in some race detection engines.
Parsing Utilities - The parsing functions for reading traces in the three formats RVPredict, CSV, STD are implemented in parse
package. The parsing utilities for RVPredict format have been obfuscated and have been provided only as jar files
RAPID supports reading files specified in RoadRunner, STD, RVPredict or CSV format. RAPID also allows printing a given trace into either STD or CSV format.
Rapid icon made by Freepik from www.flaticon.com