adds currently known publications and public use cases to the README #6572

Merged on Nov 27, 2024 (9 commits)
59 changes: 39 additions & 20 deletions README.md
@@ -48,7 +48,7 @@ runtime traces produced from instrumented code,
PolyTracker is controlled via a Python script called `polytracker`. You can
install it by running

```
```shell-script
pip3 install polytracker
```

@@ -57,13 +57,13 @@ users are likely to run it in a containerized environment. Luckily,
`polytracker` makes this easy. All you need to do is have `docker` installed,
then run:

```
```shell-script
polytracker docker pull
```

and

```
```shell-script
polytracker docker run
```

@@ -78,7 +78,7 @@ instrumented program's control flow graph, and even extract a context free
grammar matching the inputs accepted by the program. You can explore these
commands by running

```
```shell-script
polytracker --help
```

@@ -100,7 +100,7 @@ instrumented environment. This will produce a `blight_journal.jsonl` file that
records all commands run during the build. If you have a C/C++ target, you can
instrument it by invoking `polytracker build` and passing your build command:

```bash
```shell-script
polytracker build gcc -g -o my_binary my_source.c
```

@@ -110,14 +110,14 @@ directory to build an instrumented version of your build target. The
instrumented build target will be built using the same flags as the original
build target.

```bash
```shell-script
polytracker instrument-targets my_binary
```
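
The two-step flow above (wrap the normal build in `polytracker build`, then run `polytracker instrument-targets` on the result) can be scripted. The following is a minimal Python sketch, not part of PolyTracker itself: the helper names `build_command`, `instrument_command`, and `build_and_instrument` are hypothetical, and the only PolyTracker behavior it assumes is the two command lines shown in this README.

```python
import subprocess
from typing import Callable, List, Sequence


def build_command(original_build: Sequence[str]) -> List[str]:
    """Prefix the project's usual build command with `polytracker build`."""
    return ["polytracker", "build", *original_build]


def instrument_command(targets: Sequence[str]) -> List[str]:
    """Build the `polytracker instrument-targets` call for the given targets."""
    return ["polytracker", "instrument-targets", *targets]


def build_and_instrument(
    original_build: Sequence[str],
    targets: Sequence[str],
    run: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Run the build step, then the instrumentation step, stopping on failure.

    `run` is injectable (hypothetical convenience) so the command sequence
    can be inspected without PolyTracker installed.
    """
    commands = [build_command(original_build), instrument_command(targets)]
    for cmd in commands:
        # check=True makes subprocess.run raise if a step exits non-zero.
        run(cmd, check=True)
    return commands
```

For the example above, `build_and_instrument(["gcc", "-g", "-o", "my_binary", "my_source.c"], ["my_binary"])` would issue both commands in order.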

`build` also supports more complex programs that use a build system like
autotools or CMake:

```bash
```shell-script
polytracker build cmake .. -DCMAKE_BUILD_TYPE=Release
polytracker build ninja
# or
@@ -127,8 +127,8 @@ polytracker build make

Then run `instrument-targets` on any targets of the build:

```bash
$ polytracker instrument-targets a.bin b.so
```shell-script
polytracker instrument-targets a.bin b.so
```

Then `a.instrumented.bin` and `b.instrumented.so` will be the instrumented
@@ -199,10 +199,10 @@ instrumentation parameters without needing to recompile the binary.
### Environment Variables

PolyTracker accepts configuration parameters in the form of environment
variables to avoid recompiling target programs. The current environment
variables PolyTracker supports is:
variables to avoid recompiling target programs. The current set of environment
variables that PolyTracker supports is:

```
```shell-script
POLYDB: A path to which to save the output database (default is polytracker.tdag)

WLLVM_ARTIFACT_STORE: Provides a path to an existing directory to store artifact/manifest for all build targets
@@ -251,20 +251,20 @@ focuses on ignoring system libraries. The original script can be found in
Check out this Git repository. From the root, either build the base PolyTracker
Docker image:

```commandline
```shell-script
pip3 install -e ".[dev]" && polytracker docker rebuild
```

or pull the latest prebuilt version from DockerHub:

```commandline
```shell-script
docker pull trailofbits/polytracker:latest
```

For a demo of PolyTracker running on the [MuPDF](https://mupdf.com/) parser run
this command:

```commandline
```shell-script
docker build -t trailofbits/polytracker-demo-mupdf -f examples/pdf/Dockerfile-mupdf.demo .
```

@@ -275,16 +275,16 @@ information provided by the taint analysis.
For a demo of PolyTracker running on Poppler utils version 0.84.0 run this
command:

```commandline
```shell-script
docker build -t trailofbits/polytracker-demo-poppler -f examples/pdf/Dockerfile-poppler.demo .
```

All the poppler utils will be located in
`/polytracker/the_klondike/poppler-0.84.0/build/utils`.

```commandline
$ cd /polytracker/the_klondike/poppler-0.84.0/build/utils
$ ./pdfinfo_track some_pdf.pdf
```shell-script
cd /polytracker/the_klondike/poppler-0.84.0/build/utils
./pdfinfo_track some_pdf.pdf
```
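
Because `POLYDB` is read from the environment (see Environment Variables above), a tracked run like this one can choose its output database without rebuilding the binary. Below is a minimal Python sketch, assuming only that the instrumented binary takes the input file as its sole argument, as in the command above. `polytracker_env` and `run_tracked` are hypothetical helper names, not PolyTracker APIs.

```python
import os
import subprocess
from typing import Callable, Dict, List, Mapping, Optional


def polytracker_env(
    db_path: str, base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Copy an environment and set POLYDB, the output database path."""
    env = dict(os.environ if base is None else base)
    env["POLYDB"] = db_path
    return env


def run_tracked(
    binary: str,
    input_path: str,
    db_path: str = "polytracker.tdag",  # default named in this README
    run: Callable[..., object] = subprocess.run,
) -> List[str]:
    """Run an instrumented binary on one input with POLYDB set.

    `run` is injectable (hypothetical convenience) so the call can be
    checked without an instrumented binary on hand.
    """
    cmd = [binary, input_path]
    run(cmd, env=polytracker_env(db_path), check=True)
    return cmd
```

For instance, `run_tracked("./pdfinfo_track", "some_pdf.pdf", db_path="some_pdf.tdag")` would write the trace for that input to `some_pdf.tdag` instead of the default file.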

## Building PolyTracker from Source
@@ -324,6 +324,20 @@ source file. This is most common when instrumenting compression and
cryptographic algorithms that have large block sizes. There are a number of
mitigations for this behavior currently being researched and developed.

## Publications and Current Use Cases

Here are some of the publicly available projects and publications that use PolyTracker. If you know of anything else you'd like to see listed here, please let us know!

- The [Format Analysis Workbench](https://github.com/galoisinc/faw) integrates several PolyTracker features from different versions of the codebase, namely grammar extraction and blind spot detection.
- Harmon, Carson, Bradford Larsen, and Evan A. Sultanik. "[Toward automated grammar extraction via semantic labeling of parser implementations.](https://bradfordlarsen.com/files/publications/semantic-labeling-langsec-2020.pdf)" 2020 IEEE Security and Privacy Workshops (SPW). IEEE, 2020.
- Brodin, Henrik, Marek Surovič, and Evan Sultanik. "[Blind spots: Identifying exploitable program inputs.](https://langsec.org/spw23/papers/Brodin_LangSec23.pdf)"
2023 IEEE Security and Privacy Workshops (SPW). IEEE, 2023.
- Henrik Brodin used PolyTracker's blind spot trace analysis functionality (more precisely, `mapping` and `cavities`) to pinpoint a CVE and [wrote about it on the Trail of Bits blog](https://blog.trailofbits.com/2023/03/30/acropalypse-polytracker-blind-spots/).
- Kaoudis, Kelly, Henrik Brodin, and Evan Sultanik. "[Automatically Detecting Variability Bugs Through Hybrid Control and Data Flow Analysis.](https://langsec.org/spw23/papers/Kaoudis_LangSec23.pdf)"
2023 IEEE Security and Privacy Workshops (SPW). IEEE, 2023.
- Sultanik, Evan, Marek Surovič, Henrik Brodin, Kelly Kaoudis, Facundo Tuesca, Carson Harmon, Lisa Overall, Joseph Sweeney, and Bradford Larsen.
"[PolyTracker: Whole-Input Dynamic Information Flow Tracing.](https://github.com/trailofbits/publications/blob/master/papers/issta24-polytracker.pdf)" Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), 2024.

## License and Acknowledgements

This research was developed by [Trail of Bits](https://www.trailofbits.com/)
@@ -333,8 +347,13 @@ licensed under the [Apache 2.0 license](LICENSE). © 2019, Trail of Bits.

## Maintainers

Please contact us using `[email protected]`.

[Evan Sultanik](https://github.com/ESultanik)<br />
[Henrik Brodin](https://github.com/hbrodin)<br />
[Kelly Kaoudis](https://github.com/kaoudis)<br />

## Past Maintainers

[Marek Surovič](https://github.com/surovic)<br />
[Facundo Tuesca](https://github.com/facutuesca)<br /> <br />
`[email protected]`