Step by Step Instructions

This document assumes that you have completed the getting-started steps. Here, we walk through those steps in more detail.

Let’s first examine the last five lines of the /home/osboxes/.bashrc file:

export PATH=~/pldi-19/llvm/bin:/$PATH
export HURON_BUILD=~/pldi-19/huron-repair/build
export HURON_RUNTIME=~/pldi-19/huron-repair/runtime
export SHERIFF=/home/osboxes/pldi-19/sheriff-master
export ITERATION=1

The first line adds the llvm/bin directory to the PATH variable so that all the necessary tools (clang, clang++, opt, and others) can be found. That bin directory contains pre-built binaries of llvm-7.
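
To confirm that the PATH change took effect, a quick check along the following lines can help. This is only a sketch; check_tools is a hypothetical helper name and is not part of the Huron artifact.

```shell
# check_tools TOOL...: report every tool that cannot be found on PATH.
# Returns 0 if all tools are found, 1 otherwise.
check_tools() {
    status=0
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not on PATH: $tool"
            status=1
        fi
    done
    return "$status"
}
```

For example, running check_tools clang clang++ opt llc in a fresh terminal should print nothing if the first export line is in effect.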

The second line sets a new HURON_BUILD variable to /home/osboxes/pldi-19/huron-repair/build. This directory contains all the necessary Huron binaries along with our LLVM passes. We recommend using the prebuilt binaries, but if you want to rebuild Huron, you can run the following commands.

$ cd ~/pldi-19/huron-repair/
$ rm -r build/
$ mkdir build
$ cd build
$ cmake ..
$ make

The third line sets a new HURON_RUNTIME variable to /home/osboxes/pldi-19/huron-repair/runtime. This directory contains all the shared libraries that Huron’s in-house runtime needs to capture, log, detect, and repair false sharing. Again, we recommend using the prebuilt binaries, but if you want to rebuild the runtime, you can run the following commands.

$ cd ~/pldi-19/huron-repair/runtime
$ make clean
$ make

The fourth line sets a new SHERIFF variable to /home/osboxes/pldi-19/sheriff-master. This directory contains all the necessary shared libraries of Sheriff. We recommend using the prebuilt binaries here as well, but if you want to rebuild Sheriff, you can run the following commands.

$ cd ~/pldi-19/sheriff-master/
$ make clean
$ make

The last line sets the ITERATION variable to 1. This variable controls how many times each benchmark is executed; the average execution time is logged in the time.csv file. For the results presented in our paper, we used at least 25 iterations for each benchmark. However, we recommend a small number for artifact evaluation, since many iterations can take a very long time (up to several hours). You can vary the number of iterations to observe the program behavior. If you change the value of ITERATION in .bashrc, please re-execute the script by running the source ~/.bashrc command.
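
To illustrate what averaging over ITERATION runs means, here is a minimal sketch of such a timing loop. It is not the script shipped with the artifact: average_ms is a hypothetical name, and the code assumes GNU date (for the %N nanosecond format), which is what a typical Linux VM provides.

```shell
# average_ms N CMD [ARGS...]: run CMD N times and print the mean
# wall-clock time in whole milliseconds. N must be at least 1.
average_ms() {
    runs=$1
    shift
    total=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s%N)        # nanoseconds since the epoch (GNU date)
        "$@" >/dev/null 2>&1       # run the command, discarding its output
        end=$(date +%s%N)
        total=$(( total + (end - start) / 1000000 ))
        i=$(( i + 1 ))
    done
    echo $(( total / runs ))
}
```

With ITERATION exported as above, a call such as average_ms "$ITERATION" ./product.out would time one of the binaries built later in this document.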

Now, let us look at the whole Huron process for a sample benchmark program.

$ cd ~/pldi-19/huron-repair/test_suites/lockless/
$ ls

The args file contains the command-line arguments used when running the benchmark program. The bash file cleans the benchmark program for subsequent builds. The builds the Huron-repaired in-production binary for the benchmark. The builds the Huron-instrumented in-house (profile) binary for the benchmark. The first builds the binary that is instrumented in house, monitors the in-house execution, post-processes the logged file to detect false sharing, generates a memory layout in which false sharing is repaired, and then builds the repaired in-production binary, as well as the Sheriff-repaired and manually repaired binaries. The contains a dependency checking utility function. toy.c is the benchmark program. toy_manual.c is the manually repaired version. All the files with extensions bc, txt, log, and o are intermediate files of Huron. The profile_args file contains the command-line arguments used when running the in-house (profile-run) instrumented binary for Huron. The file runs all the different versions of the program ITERATION times and logs the average in the time.csv file.

Now, let us examine the content of file:

set -x
clang -c -emit-llvm toy.c -o main.bc
opt -load $HURON_BUILD/llvm-passes/Instrumenter/ -instrumenter < main.bc > main.inst.bc 2> inst.log
llc -filetype=obj main.inst.bc -o instrumented.o
clang instrumented.o -Wl,$HURON_RUNTIME/ -pthread -o instrumented.out

The first command, set -x, makes bash print each command before executing it, which helps with debugging. Then, we generate the LLVM bitcode for the benchmark program. Next, we apply our instrumentation pass to this bitcode. Then, llc compiles the instrumented bitcode into a machine-dependent object file. Finally, we link that object file with our custom shared library to produce the instrumented binary, which monitors and logs memory accesses.

The 7th line of (./instrumented.out $args) script runs this instrumented binary with the profile arguments. The next line detects false sharing by reading the logged memory accesses from the record.log file and generates a memory layout that repairs the false sharing, written to the address_translation_table.txt file. Now, let’s look at the details of file:

set -x
clang -c -emit-llvm toy.c -o main.bc
opt -load $HURON_BUILD/llvm-passes/MallocDependent/ -mallocdependent < main.bc > /tmp/main.bc 2> mallocdependent.log
opt -load $HURON_BUILD/llvm-passes/RedirectPtr/ -redirectptr -locfile address_translation_table.txt -depfile Andersen.txt  < main.bc > main.redirected.bc 2> redirect.log
llc -filetype=obj main.redirected.bc -o redirected.o
clang redirected.o -pthread -o product.out

The first two lines are identical to the file.

The next line runs another custom LLVM pass, mallocdependent. Huron repairs false sharing by changing the memory layout of the falsely-shared cache lines. The mallocdependent pass statically finds all the instructions in the module that are affected by the modified memory layout.

This pass uses an Andersen-style alias analysis to find these dependencies. The next line repairs the detected false sharing using the modified memory layout and the alias analysis results. The last two lines generate the in-production binary for the benchmark.
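
Because the repair pass reads address_translation_table.txt, which the detection step derives from record.log, it can be useful to confirm that both files exist and are non-empty before building the in-production binary. The helper below is a hypothetical sketch (require_files is an illustrative name, not part of the Huron scripts).

```shell
# require_files FILE...: report every file that is missing or empty.
# Returns 0 if all files exist with non-zero size, 1 otherwise.
require_files() {
    status=0
    for f in "$@"; do
        # -s is true only for an existing file with size greater than zero.
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

After the profile run, require_files record.log address_translation_table.txt should print nothing if both steps succeeded.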

Please run bash to build all the binaries (you can examine the script for build details). Then to run the binaries, use bash command. Please note that this bash command will take several minutes to complete. Therefore, just to run the programs use bash command instead.

The file will run and produce timing results in time.csv, which contains two columns: the benchmark version and the execution time in milliseconds.
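
To compare the versions at a glance, the rows of time.csv can be sorted by execution time. The sketch below assumes the two columns are comma-separated and that the file has no header row; neither detail is confirmed here, so adjust as needed. sort_by_time is a hypothetical helper name.

```shell
# sort_by_time FILE: print the rows of a two-column CSV
# (version,time_ms), fastest version first.
sort_by_time() {
    # -t, splits fields on commas; -k2,2n sorts numerically on column 2 only.
    sort -t, -k2,2n "$1"
}
```

Running sort_by_time time.csv in the benchmark directory then lists the fastest version on the first line.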

Thank you very much for taking the time to read this document.

