This artifact makes heavy use of Docker and Docker Hub. To reproduce our experiments, we assume the user has the following:
- A machine with a working installation of Docker.
- A good internet connection.
We have tested our Docker images on Linux clients (Ubuntu/Debian and CentOS) as well as on a Windows laptop with Docker installed.
In addition to our end-to-end toolchain demonstrations (described below), we have provided routines to reproduce each of our experiments. These reproductions do NOT include running our toolchain on Linux end-to-end (as this takes about 4 hours on our 64-core workstation). Instead, we have opted to provide pre-baked vectors.
See reproduce/rq1/README.md or just run the following:
docker pull jjhenkel/code-vectors-artifact:rq1
docker run -it --rm jjhenkel/code-vectors-artifact:rq1
WARNING: can take up to 4 hours and 6 GB of space. (Runtime on an 8-core desktop machine is about 3 hours.)
See reproduce/rq2/README.md or just run the following:
docker pull jjhenkel/code-vectors-artifact:rq2
docker run -it --rm jjhenkel/code-vectors-artifact:rq2
NOTE: can take up to 1 hour and 2 GB of space.
See reproduce/rq3/README.md or just run the following:
docker pull jjhenkel/code-vectors-artifact:rq3
docker run -it --rm jjhenkel/code-vectors-artifact:rq3
See reproduce/rq4/README.md or just run the following:
docker pull jjhenkel/code-vectors-artifact:rq4
docker run -it --rm jjhenkel/code-vectors-artifact:rq4
This artifact makes heavy use of Docker and Docker Hub. As a prerequisite, this artifact assumes the user has the following:
- A machine running some Linux distribution. (We've tested on Ubuntu/Debian and CentOS.) We've also done some limited testing of the end-to-end demo on OSX, but support for that platform is still experimental.
- A working installation of Make and Git.
- A working installation of Docker (`docker run hello-world` produces output without errors).
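The prerequisites above can be checked before cloning anything. The following is a minimal sketch, not part of the artifact: the helper name `missing_tools` is hypothetical, and it only checks that `docker`, `make`, and `git` are on `PATH`.

```python
import shutil

# Tools named in the prerequisites list; adjust if your setup differs.
REQUIRED_TOOLS = ("docker", "make", "git")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real system.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("docker, make, and git were all found on PATH.")
```

Finding `docker` on `PATH` does not show that the Docker daemon is running, so `docker run hello-world` remains the real test.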
Given these prerequisites, installing this artifact should be easy. To install, follow these steps:
git clone https://github.com/jjhenkel/code-vectors-artifact
cd ./code-vectors-artifact
make end-to-end-redis
make learn-vectors-redis
NOTE: can take up to 3 hours and 4 GB of space. (Runtime on a 4-core desktop machine is about an hour.)
If you were able to run the first three commands (listed above), you have done the following:
- Used our `spec2image` tool to create a Docker image (locally) containing the `redis` program's source files and steps to build that program from those source files.
- Built that program (in a container) with our custom version of GCC 7.3.0 running our `c2ocaml` plugin.
- Created an OCaml file for each procedure GCC/G++ encountered while building the `redis` application from source.
- Grouped and merged these transformed OCaml files into larger chunks.
- Ran (in parallel) our `lsee` tool on these merged OCaml files to produce abstracted symbolic traces.
- Merged the abstracted symbolic traces into a single trace corpus named `redis.traces.txt`.
That's a lot of cool stuff!
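To make the final merge step concrete, here is a rough sketch of how per-chunk trace files could be concatenated into one corpus. It is an illustration only and not the artifact's own code: the function name `merge_traces`, the input file names, and the assumption that each non-empty line holds one trace are all hypothetical.

```python
from pathlib import Path


def merge_traces(trace_files, corpus_path):
    """Concatenate per-chunk trace files into one corpus file.

    Assumes (hypothetically) one trace per non-empty line.
    Returns the number of trace lines written.
    """
    written = 0
    with open(corpus_path, "w", encoding="utf-8") as corpus:
        # Sort the inputs so the merged corpus is reproducible.
        for path in sorted(Path(p) for p in trace_files):
            with open(path, encoding="utf-8") as chunk:
                for line in chunk:
                    line = line.strip()
                    if line:  # skip blank lines
                        corpus.write(line + "\n")
                        written += 1
    return written
```

A call such as `merge_traces(["chunk-0.traces", "chunk-1.traces"], "redis.traces.txt")` (the input names are made up) would write the merged corpus and return how many traces it contains.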
IMPORTANT: make sure you run `make end-to-end-redis` before attempting to learn vectors.
Here's what `learn-vectors-redis` does:
- Uses GloVe to build a vocabulary / co-occurrence matrix
- Learns word vectors using some default parameters
- Runs a demonstration using some Python and the Gensim library
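For readers unfamiliar with the first step, the sketch below shows in plain Python what building a vocabulary and a co-occurrence table over a trace corpus means. It is not GloVe's implementation and not what the Makefile runs: the function names, the `window` and `min_count` parameters, and the one-trace-per-line, whitespace-separated token format are assumptions. The 1/distance weighting mirrors the scheme described for GloVe.

```python
from collections import Counter, defaultdict


def build_vocab(lines, min_count=1):
    """Count tokens and keep those seen at least `min_count` times."""
    counts = Counter(tok for line in lines for tok in line.split())
    return {tok: c for tok, c in counts.items() if c >= min_count}


def cooccurrences(lines, vocab, window=2):
    """Symmetric co-occurrence counts within `window` tokens.

    A pair of tokens d positions apart contributes 1/d, so
    closer pairs weigh more. Out-of-vocabulary tokens are dropped.
    """
    cooc = defaultdict(float)
    for line in lines:
        toks = [t for t in line.split() if t in vocab]
        for i, center in enumerate(toks):
            # Look back at most `window` tokens; add both directions.
            for j in range(max(0, i - window), i):
                weight = 1.0 / (i - j)
                cooc[(center, toks[j])] += weight
                cooc[(toks[j], center)] += weight
    return dict(cooc)


if __name__ == "__main__":
    corpus = ["a b c"]  # toy stand-in for a trace corpus
    vocab = build_vocab(corpus)
    for pair, weight in sorted(cooccurrences(corpus, vocab).items()):
        print(pair, weight)
```

GloVe then fits word vectors to a table like this one; the sketch stops at the counts and leaves training out.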