The scalability of PolyTracker #6576
Comments
Hi,
There might be additional things needed, but these should be the high-level steps at least.
Hi, thank you very much for your detailed reply!! I will follow your guidance and test it out.
Hi @hbrodin, I have made some attempts recently, but there are two steps I haven't finished.
Could you give me some further instructions, especially on the second point, for example by pointing to the relevant code locations? Thank you very much.
Hey @llooFlashooll, I can speak to the second point, though naturally I defer to Henrik if I'm wrong. Taint tracking can start anywhere the instrumented program reads or otherwise takes in input, but you will need to define what those start points are for a Rust program if they differ from what PolyTracker already implements. The part that should keep working for Rust, C++, or any other language is the code that writes source labels out to the tdag and starts tainting "from there" with respect to the data flow of the instrumented program. What you might need to define for Rust is when those source labels should be created. To do this, I believe you'd need to add whatever you want to taint initially to the taint sources. If we use taint_source_buffer as an example,
In the above function we're setting taint sources in the tdag. Each source label corresponds to a particular input byte. Please keep in mind that taint and provenance are tracked at the byte level in PolyTracker. What each taint-source function does varies a bit, so you might want to read all of them before deciding how to implement your own.
With respect to sinks, there is some naming overlap in the code: we also refer to PolyTracker writing output to the tdag file as writing to a sink. Separately from that, we do need to write the program's sinks to the tdag so that the full "trees" of taint can be reconstructed post hoc. Taint sinks are the functions that consume the tainted bytes and so mark where taint tracking stops.
If you have not done so yet, you may also want to modify the ABI lists to make sure the Rust functionality you want to track taint through gets instrumented, and to ignorelist any functionality you do not want instrumented. My understanding is that any code that is neither part of the ABI lists nor defined as a source or sink will be instrumented as if it were codebase business logic, meaning taint can be tracked through it but cannot originate from it.
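To make the byte-level model described above more concrete, here is a minimal, self-contained Rust sketch. It is a toy illustration only and is not PolyTracker's API or tdag format: `ToyTdag`, `Label`, `taint_buffer`, `union`, `sink`, and `sources_of` are all hypothetical names invented for this example. It assumes one source label per input byte, a union label whenever two tainted values are combined, and a sink record that ties an output offset back to a label.

```rust
use std::collections::BTreeSet;

/// Hypothetical label node in a toy taint graph (not PolyTracker's real layout).
#[derive(Debug, Clone, PartialEq)]
enum Label {
    /// One source label per input byte, remembering the byte's offset.
    Source { offset: usize },
    /// Created when two tainted values are combined.
    Union { left: usize, right: usize },
}

/// Toy stand-in for a taint graph: a label's id is its index in `labels`.
#[derive(Default)]
struct ToyTdag {
    labels: Vec<Label>,
    /// (output offset, label id) pairs recorded when tainted data is written out.
    sinks: Vec<(usize, usize)>,
}

impl ToyTdag {
    /// Create one source label for each byte of `buf`, starting at `start_offset`.
    fn taint_buffer(&mut self, buf: &[u8], start_offset: usize) -> Vec<usize> {
        (0..buf.len())
            .map(|i| {
                self.labels.push(Label::Source { offset: start_offset + i });
                self.labels.len() - 1
            })
            .collect()
    }

    /// Record that a value depends on two existing labels.
    fn union(&mut self, left: usize, right: usize) -> usize {
        self.labels.push(Label::Union { left, right });
        self.labels.len() - 1
    }

    /// Record that the value carrying `label` was written at `out_offset`.
    fn sink(&mut self, out_offset: usize, label: usize) {
        self.sinks.push((out_offset, label));
    }

    /// Walk back from a label to the set of input byte offsets it derives from.
    fn sources_of(&self, label: usize) -> BTreeSet<usize> {
        match &self.labels[label] {
            Label::Source { offset } => {
                let mut set = BTreeSet::new();
                set.insert(*offset);
                set
            }
            Label::Union { left, right } => {
                let mut set = self.sources_of(*left);
                set.extend(self.sources_of(*right));
                set
            }
        }
    }
}

fn main() {
    let mut tdag = ToyTdag::default();
    // Pretend these three bytes were just read from an input file.
    let labels = tdag.taint_buffer(b"AB\n", 0);
    // A value computed from bytes 0 and 1 carries the union of their labels.
    let combined = tdag.union(labels[0], labels[1]);
    // Writing that value out is recorded as a sink.
    tdag.sink(0, combined);
    for (out_offset, label) in &tdag.sinks {
        println!("output byte {} depends on input bytes {:?}", out_offset, tdag.sources_of(*label));
    }
}
```

In this sketch, deciding "when source labels should be created" corresponds to choosing where `taint_buffer` is called. For a Rust target, that would presumably be wherever the program's input-reading calls return data, but which functions those are is something you would need to determine for your own program.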
Thank you very much again! I have solved the first point and am working through your instructions.
Hi folks, I really appreciate your work.
However, I would like to know whether this project can be extended to other languages with an LLVM backend, such as Rust.
For example, I would like to test it with a simple program.