A paramount asset of the methodology is the laboratory notebook (labbook), similar to the ones biologists, chemists and scientists from other fields use on a daily basis to document the progress of their work. This notebook is a single file inside the project repository, shared between all collaborators. The main motivation for keeping a labbook is that anyone, from the original researchers to external reviewers, can later use it to understand all the steps of the study and potentially reproduce and improve it. This single, self-contained file has two main parts. The first aims at carefully documenting the development and use of the researchers' complex source code. The second is concerned with keeping the experimentation journal.
This part serves as a starting point for newcomers, but also as a good reminder for everyday users. The labbook explains the general ideas behind the whole project and methodology, i.e., what the workflow for doing experiments is and how the code and data are organized in folders. It also states the conventions on how the labbook itself should be used. Details about the different programs and scripts, along with their purpose, follow. This information concerns the source code used in the experiments as well as the tools for manipulating data and the analysis code used for producing plots and reports. Additionally, there are a few explanations of the revision control usage and conventions. Moreover, this part of the labbook contains a few examples of how to run scripts, illustrating the most common arguments and formats. Although such information might seem redundant with the previous documentation part, in practice such examples are indispensable even for experienced users, since some scripts have many environment variables, arguments and options. It is also important to keep track of major changes to the source code and the project in general inside a ChangeLog. Since all modifications are already captured and commented in Git commits, the log section offers a much more coarse-grained view of the code development history. There is also a list with a brief description of every Git tag in the repository, as it helps in finding the latest stable or any other specific version of the code.
All experiments should be carefully noted in this part, together with the key input parameters, the motivation for running such an experiment and remarks on the results. For each experimental campaign there should be a new entry that answers the questions “why”, “when”, “where” and “how” the experiments were run, and finally what the observations on the results are. Inside these descriptive conclusions, Org-mode allows for using both plain links and git-links connecting the text to specific revisions of files. These hyperlinks point to crucial data and analysis reports that illustrate a newly discovered phenomenon.
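For instance, a plain Org link from such a conclusion to a report produced during the campaign could look as follows (the path and description are illustrative):

[[file:data/xp1/analysis_report.pdf][Analysis report for the xp1 campaign]]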
This simple example project was hugely inspired by Arnaud Legrand’s Master course on performance evaluation and experimental methodology. The motivation for and explanation of the methodology used for this project are given in more detail in this journal paper and the PhD thesis of Luka Stanisic. To see how this methodology helped in performing a much larger study concerning modeling and simulation of HPC applications running on top of dynamic task-based runtimes, go to the following webpage.
Sometimes the outputs of the experiments can be very large (hundreds of MB), and adding them to a Git repository is problematic: Git was not designed for handling large files, and pulling a repository of several GB takes a lot of time. Thus, it is advisable to use supplementary tools that help with large files inside Git, such as Git-LFS and git-annex. Additionally, Git-LFS is already integrated into GitHub and GitLab, so adopting it is straightforward.
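As a minimal sketch (assuming large result files live under the data folder; the tracking pattern is illustrative), enabling Git-LFS could look like this:

# One-time setup of the Git-LFS hooks in this repository
git lfs install
# Track large result files (the pattern is illustrative)
git lfs track "data/**/*.org"
git add .gitattributes
git commit -m "Track large result files with Git-LFS"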
- Create a new branch for doing experiments
- Make sure everything is committed
- Run the run_xp script with the desired parameters
- Add the newly obtained experiment results to Git
- Do the analysis
- Add to the labbook (this file), inside “* Experiment results”, a new section describing the conditions in which the experiment was performed and the major notes about the results
- Commit/push all files in the experimental branch
- Merge this new branch into the remote “data” branch (see the sketch just below)
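A command-level sketch of this workflow, with illustrative branch name, input size and file paths:

# 1. New branch for the campaign, started from master
git checkout -b xp4 master
# 2.-3. Verify a clean tree, then run the experiment
git status
./run_xp.sh -c -v 2000000
# 4.-6. Store the results and the labbook entry describing them
git add data/xp4/QuicksortData0.org
git commit -am "xp4: results and labbook entry"
# 7. Publish the experimental branch
git push origin xp4
# 8. Revert source-only commits (see the git_revert script below), then merge
./git_revert.sh
git checkout data && git merge xp4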
Scripts for running the experiments and collecting the data are written directly in this file. Later they should be extracted from it using Org-mode’s tangling mechanism. If you are not familiar with Org-mode, do not hesitate to write these scripts as regular bash (or Python) files separate from the laboratory notebook. In that case, ignore the first subsection below and all Org-babel related things in the following sections; inspect just the shell code inside the blocks.
To make writing, tangling the current file and making all scripts executable an easy process, I added the following shortcut to my Emacs configuration.
(global-set-key (kbd "C-c m")
                (lambda ()
                  (interactive)
                  (org-babel-tangle-file buffer-file-name)
                  (shell-command "chmod +x *.sh")))
Section properties are also used to define into which files the code will be tangled.
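As a minimal sketch, a property drawer that tangles all code blocks of a section into one script could look as follows (the heading name is illustrative):

* run_xp script
  :PROPERTIES:
  :header-args: :tangle run_xp.sh
  :END: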
- Written in Org-mode and later tangled into a shell script.
- Used for running the experiment in the right conditions; it also calls the get_info script.
- It also verifies that the source code is committed and that you are in the right branch before running the experiments.
Bash preamble.
#!/bin/bash
# Script for running parallel quicksort comparison
set -e # fail fast
Defining all variables that will be used in the script.
# Script parameters
basename="$PWD"
host="$(hostname | sed 's/[0-9]*//g' | cut -d'.' -f1)"
datafolder=""
dataname="QuicksortData"
srcfolder="$basename/src"
# DoE parameters
# Reading command line arguments
# Choosing which program to run (in case this script is a wrapper for multiple programs)
programname="./src/parallelQuicksort"
# Program options
testing=0
recompile=0
verbose=0
Writing the help output, to help users invoke the script.
help_script()
{
cat << EOF
Usage: $0 [options] [range1 [range2]]
Script for running the parallel quicksort comparison
OPTIONS:
-h Show this message
-t Enable testing
-c Recompile source code
-v Verbose output in the terminal
EOF
}
# Parsing options
while getopts "tcvh" opt; do
case $opt in
t)
testing=1
;;
c)
recompile=1
;;
v)
verbose=1
;;
h)
help_script
exit 4
;;
\?)
echo "Invalid option: -$OPTARG"
help_script
exit 3
;;
esac
done
Getting the input parameters (range of array sizes to sort) from the command line arguments.
shift $((OPTIND - 1))
RANGE1="-1"
RANGE2="-1"
if [[ $# == 1 ]]; then
RANGE1=$1
fi
if [[ $# == 2 ]]; then
RANGE1=$1
RANGE2=$2
fi
if [[ $# -gt 2 ]]; then
echo 'ERROR: more than two input parameters'
help_script
exit 2
fi
Verifying that everything is committed (if this is not a simple test) and that we are in the right branch.
# Doing real experiments, not just testing if the code is valid and can be executed
if [[ $testing == 0 ]]; then
branch=$(basename "$(git symbolic-ref HEAD)")
echo "Now you are in git branch: ${branch}"
# Checking the name of the branch is not master
if [[ "$branch" == *master* ]]; then
echo "ERROR-experiments can be done only in specific xp branch!"
echo "Use -t option for testing"
exit 2
fi
# Checking if everything is commited
if git diff-index --quiet HEAD --; then
echo "Everything is commited"
else
echo "ERROR-need to commit before doing experiment!"
git status
exit 1
fi
fi
For real experiments (not tests), the data folder will take its name from the branch. This can be configured differently, but this way is the easiest and completely automatic.
# Name of data folder where to store results if testing
if [[ $testing == 1 ]]; then
dataname="Test"
datafolder="$basename/data/testing"
else
datafolder="$basename/data/$branch"
fi
mkdir -p $datafolder
Giving the output file a unique name by appending an incremented number.
# Producing the original name for output file
bkup=0
while [ -e $datafolder/$dataname${bkup}.org ] || [ -e $datafolder/$dataname${bkup}.org~ ]; do
bkup=$((bkup + 1))
done
outputfile="$datafolder/$dataname${bkup}.org"
Calling an external script to capture metadata.
# Starting to write output file and capturing all metadata
title="Experiment for parallel quicksort"
./get_info.sh -t "$title" $outputfile
Compiling the source code and capturing the compilation output.
# Compilation
echo "* COMPILATION" >> $outputfile
if [[ $recompile == 1 ]]; then
echo "Cleaning the previously compiled code..."
make clean -C $srcfolder > /dev/null # Decided not to capture the output of clean
# For more complex projects, a ./configure step is needed here
# The options and the output of the ./configure should also be captured
echo "** COMPILATION OUTPUT" >> $outputfile
echo "Compiling..."
make -C $srcfolder >> $outputfile
# make install (when needed)
echo "** SHARED LIBRARIES DEPENDENCIES" >> $outputfile
ldd $programname >> $outputfile
else
echo "# used previous compile" >> $outputfile
fi
Generating randomized input parameters for the experiment using R.
rm -f input_values
Rscript input_generator.R $RANGE1 $RANGE2
Preparing and capturing the line with which the experiment will be executed.
# Prepare running options
running="$programname"
echo "* COMMAND LINE USED FOR RUNNING EXPERIMENT" >> $outputfile
echo $running >> $outputfile
Running the program.
echo "Executing program..."
# Writing results
rm -f stdout.out
rm -f stderr.out
time1=$(date +%s.%N)
set +e # In order to detect and print execution errors
# In fact only the following line is actually executing the experiment
# everything else is a wrapper, but it is very important that it is captured
#############################################
for NUM in $(cat input_values)
do
echo "Array size: $NUM" 1>> stdout.out
eval $running $NUM 1>> stdout.out 2>> stderr.out
done
#############################################
set -e
time2=$(date +%s.%N)
echo "* STDERR OUTPUT" >> $outputfile
cat stderr.out >> $outputfile
if [ ! -s stdout.out ]; then
echo "ERROR DURING THE EXECUTION!!!" >> stdout.out
fi
echo "* STDOUT OUTPUT" >> $outputfile
cat stdout.out >> $outputfile
if [[ $verbose == 1 ]]; then
cat stderr.out
cat stdout.out
fi
End and cleanup.
# Cleanup & End
echo "* ELAPSED TIME FOR RUNNING THE PROGRAM" >> $outputfile
echo "Elapsed: $(echo "$time2 - $time1"|bc ) seconds" >> $outputfile
rm -f stdout.out
rm -f stderr.out
rm -f input_values
cd $basename
- Written in Org-mode and later tangled into a shell script.
- Captures all the necessary metadata.
- Extend it (or change it) according to your specific needs.
Bash preamble.
#!/bin/bash
# Script to get machine information before doing the experiment
set +e # Don't fail fast, since some information may not be available
Defining all variables that will be used in the script.
# Script parameters
title="Experiment results"
host="$(hostname | sed 's/[0-9]*//g' | cut -d'.' -f1)"
Writing the help output, to help users invoke the script.
help_script()
{
cat << EOF
Usage: $0 [options] outputfile.org
Script to get machine information before doing the experiment
OPTIONS:
-h Show this message
-t Title of the output file
EOF
}
# Parsing options
while getopts "t:h" opt; do
case $opt in
t)
title="$OPTARG"
;;
h)
help_script
exit 4
;;
\?)
echo "Invalid option: -$OPTARG"
help_script
exit 3
;;
esac
done
Getting output file name from the command line argument.
shift $((OPTIND - 1))
outputfile=$1
if [[ $# != 1 ]]; then
echo 'ERROR: exactly one output file must be given'
help_script
exit 2
fi
Preamble of the output file.
echo "#+TITLE: $title" >> $outputfile
echo "#+DATE: $(eval date)" >> $outputfile
echo "#+AUTHOR: $(eval whoami)" >> $outputfile
echo "#+MACHINE: $(eval hostname)" >> $outputfile
echo "#+FILENAME: $(eval basename $outputfile)" >> $outputfile
echo " " >> $outputfile
Collecting metadata.
echo "* MACHINE INFO" >> $outputfile
echo "** PEOPLE LOGGED WHEN EXPERIMENT STARTED" >> $outputfile
who >> $outputfile
echo "** ENVIRONMENT VARIABLES" >> $outputfile
env >> $outputfile
echo "** HOSTNAME" >> $outputfile
hostname >> $outputfile
if [[ -n $(command -v lstopo) ]];
then
echo "** MEMORY HIERARCHY" >> $outputfile
lstopo --of console >> $outputfile
fi
if [ -f /proc/cpuinfo ];
then
echo "** CPU INFO" >> $outputfile
cat /proc/cpuinfo >> $outputfile
fi
if [ -f /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor ];
then
echo "** CPU GOVERNOR" >> $outputfile
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor >> $outputfile
fi
if [ -f /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq ];
then
echo "** CPU FREQUENCY" >> $outputfile
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq >> $outputfile
fi
if [[ -n $(command -v nvidia-smi) ]];
then
echo "** GPU INFO FROM NVIDIA-SMI" >> $outputfile
nvidia-smi -q >> $outputfile
fi
if [ -f /proc/version ];
then
echo "** LINUX AND GCC VERSIONS" >> $outputfile
cat /proc/version >> $outputfile
fi
if [[ -n $(command -v module) ]];
then
echo "** MODULES" >> $outputfile
module list 2>> $outputfile
fi
Collecting revisions info.
echo "* CODE REVISIONS" >> $outputfile
git_exists=$(git rev-parse --is-inside-work-tree 2> /dev/null)
if [ "${git_exists}" ]
then
echo "** GIT REVISION OF REPOSITORY" >> $outputfile
git log -1 >> $outputfile
fi
svn_exists=$(svn info . 2> /dev/null)
if [ -n "${svn_exists}" ]
then
echo "** SVN REVISION OF REPOSITORY" >> $outputfile
svn info >> $outputfile
fi
This script is a simple wrapper around git revert. It detects all source code commits inside the xp# branch and reverts them automatically. Thus, try to always keep separate commits for adding data files and for changing source code.
Bash preamble.
#!/bin/bash
# Script for reverting all source modifications before merging an "exp" branch into the main "data" branch
# It shall always be called from the "exp" branch
set -e # fail fast
Doing revert only for experimental branches.
whole_list="list.out"
rm -f $whole_list
branch=$(basename "$(git symbolic-ref HEAD)")
echo "Now you are in git branch: ${branch}"
if [[ "$branch" == master || "$branch" == data ]]; then
echo "ERROR: Revert should not be performed for master or data branch!"
exit 2
fi
Making sure everything is committed.
if git diff-index --quiet HEAD --; then
echo "Everything is commited"
else
echo "Warning: It is better to commit everything before doing revert!"
fi
Finding commits that need reverting.
all_commits=$(git log --format=format:%H master..HEAD)
while IFS= read -r commit
do
type=unknown
f=$(git diff-tree --no-commit-id --name-only $commit)
while IFS= read -r line
do
case "$type,$line" in
"unknown,data") type=Data
;;
"unknown,"*) type=Src
;;
"Src,data") type=error
echo "There is a commit with Src and Data together"
exit 2
;;
"Src,"*)
;;
"Data,"*) type=error
echo "There is a commit with Src and Data together"
exit 3
;;
*) type=internal_error
;;
esac
done <<< "$f"
echo -e "$type $commit" >> $whole_list
done <<< "$all_commits"
Showing all commits.
echo "All commits and their type:"
cat $whole_list
Reverting Src commits.
revert_list=$(grep "^Src" $whole_list | cut -d' ' -f 2)
while IFS= read -r commit
do
if [ -n "$commit" ]; then # Skip the empty line when there is nothing to revert
git revert -n $commit
fi
done <<< "$revert_list"
echo "Revert before merging with data branch"
Committing the revert, creating one big “anti-commit”.
git commit -am "Revert before merging with data branch-done by git_revert.sh"
echo "DONE: Single anti-commit!"
Cleaning up.
rm -f $whole_list
The output file, containing array sizes between the two input arguments, will later be used by run_xp. Note that the seed of the random generator, the number of array sizes to test and the number of repetitions are constants of the R script, but they could also be made input parameters.
Constants of the script.
set.seed(42)
num=4
rep=5
# Disabling scientific notation in printing
options(scipen=999)
Reading the input arguments that represent the range.
args <- commandArgs(trailingOnly = TRUE)
range1<-as.numeric(args[1])
if (range1==-1)
range1<-1000000
range2<-as.numeric(args[2])
Sampling random values from the range, adding repetitions and then shuffling the vector.
if (range2==-1){
x <- rep.int(range1, rep)
} else{
x <- c(range1, range2)
x <- append(x, sample(range1:range2, num, replace=F))
x <- rep.int(x,rep)
x <- x[sample.int(length(x))]
}
cat(x, file="input_values")
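For instance, when called by run_xp with the range 1000000 to 2000000, the generator writes (2 + num) × rep = 30 shuffled sizes into the input_values file:

# Two-argument form: both bounds plus num=4 sampled sizes, each repeated rep=5 times
Rscript input_generator.R 1000000 2000000
cat input_values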
Ideally, the order in which different algorithm implementations are executed should be randomized as well, but at this point I don’t wish to change the source code file.
I copied the code from Arnaud Legrand, who used it in his performance evaluation classes for Master 2 students.
The code is quite simple at the moment and can be run in the following way.
./src/parallelQuicksort [1000000]
When run, the program initializes an array of the size given as argument (1000000 by default) with random integer values and sorts it using:
- a custom sequential implementation;
- a custom parallel implementation;
- the libc qsort function.
Times are reported in seconds.
- Typical Makefile (also taken from Arnaud Legrand) that compiles the C code
Org-mode analysis file using shell and Perl to extract data from the output files, and then R to analyze the data and generate plots with ggplot2.
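As a hedged sketch of that extraction step, assuming the program prints lines like “Sequential quicksort took: 0.231 sec.” (the output format and file path are assumptions), the timings could be pulled out of a result file with:

# Print the implementation name and its measured time (output format assumed)
grep "took:" data/xp1/QuicksortData0.org | awk '{print $1, $(NF-1)}'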
- All the source code, the scripts for running experiments, the analysis code and this labbook with only the “Documentation” part
- Everything that is already in the master branch
- All the data related to a specific experimental campaign
- Also some important (though not all) analysis .pdf report files
- Everything that is already in the master branch
- All the data gathered from previous experimental campaigns and some analyses comparing different experimental sets of results
For running test experiments, just to validate that the script is working as planned (without caring about the actual experimental results).
./run_xp.sh -t -c -v
For executing a real experiment, with an input parameter of 2000000.
./run_xp.sh -c -v 2000000
- Contains important coarse-grain information about the changes to the project.
- This section may be redundant if:
  - The project is relatively small and changes are captured well enough as Git commits
  - Researchers keep information about their progress in private journals, which they use on a daily basis for all their projects
- However, if the project is large and involves multiple collaborators, it is better to use this ChangeLog for noting all major changes
- Started writing this small example project
When the experimental campaign is over, its Git branch can be deleted, as all the data has been merged into the data branch. Still, it can be interesting to leave a trace of that code version that can be spotted more easily than a particular commit, in order to ease a future entry point for reproducing the experiments. For that, we put Git tags on some important branches.
Git tags can also be used as in the standard software development to annotate an important source code version.
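For instance, an annotated tag can be created and shared as follows (the tag name and message are illustrative):

# Tag the current commit and push the tag to the remote
git tag -a stable1.0 -m "Stable version after the xp3 campaign"
git push origin stable1.0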
The list of Git tags can be copied by hand or, even better, generated using Org-babel.
git tag -n1
stable0.9       This code works correctly
xp1             First experiments to test the workflow
xp2             Experiments to try analysis on multiple data
xp3             Comparing 3 algorithm implementations for a range of array sizes