Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import null pointer information from PDG into static analysis #1086

Merged
merged 17 commits into from
Oct 23, 2024

Conversation

ahomescu
Copy link
Contributor

No description provided.

@ahomescu ahomescu self-assigned this Apr 25, 2024
@ahomescu ahomescu force-pushed the ahomescu/non_null branch from ac9abd6 to 0a55190 Compare April 25, 2024 05:21
@ahomescu
Copy link
Contributor Author

I still need to test this on some actual code (maybe lighttpd?)

@kkysen kkysen changed the title Import null pointer information from PDF into static analysis Import null pointer information from PDG into static analysis Apr 25, 2024
c2rust-analyze/src/analyze.rs Outdated Show resolved Hide resolved
c2rust-analyze/src/analyze.rs Outdated Show resolved Hide resolved
pdg/src/builder.rs Outdated Show resolved Hide resolved
@ahomescu ahomescu force-pushed the ahomescu/non_null branch from 0a55190 to 1f4aa09 Compare April 26, 2024 23:05
@ahomescu ahomescu force-pushed the ahomescu/non_null branch from 1f4aa09 to 2b30b48 Compare May 18, 2024 05:51
@spernsteiner
Copy link
Collaborator

Is this waiting on anything, other than needing a cargo fmt to fix the CI?

Copy link
Contributor

@fw-immunant fw-immunant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. The brokenness in visit_place is surprising but I agree that the current behavior is wrong as described; see inline comment.

@ahomescu ahomescu force-pushed the ahomescu/non_null branch 9 times, most recently from f0cadd4 to de844f5 Compare June 13, 2024 07:21
@ahomescu ahomescu force-pushed the ahomescu/non_null branch 4 times, most recently from 3a5ab91 to 710b626 Compare June 15, 2024 05:54
@spernsteiner
Copy link
Collaborator

@ahomescu Don't forget to fix the CI. Currently it's failing on the cargo fmt style check.

Copy link
Collaborator

@spernsteiner spernsteiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial comments. I haven't looked at the PDG code yet.

analysis/runtime/src/events.rs Outdated Show resolved Hide resolved
analysis/runtime/src/events.rs Outdated Show resolved Hide resolved
analysis/runtime/src/events.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@spernsteiner spernsteiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems generally reasonable. This approach of collapsing nested projections down to a single byte offset will limit our ability to extract borrowchecker aliasing information from the PDG, if we end up wanting to do that in the future.

Are there any automated tests related to instrumentation or the PDG? I thought we had one or two, but I could be wrong.

pdg/src/builder.rs Outdated Show resolved Hide resolved
CopyRef => unimplemented!(),
AddrOfLocal(ptr, _, size) => {
// TODO: is this a local from another function?
mapping.size = size.try_into().ok();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this conversion ever fail? If not, it seems like you could make the field size: usize instead of Option<usize> and avoid the None case above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field would still need to be size: Option<usize> because we have other cases where we get a pointer without a known size. That comes from the None on line 268.

Some of this is happening because we get pointers from native libraries (e.g. glibc) which we have no control over. I think the best we can do right now is say "this is a pointer we know nothing about", and maybe add some clever heuristics later.

Re the conversion: I could do Some(size.try_into().unwrap()), that might be better for debugging?

pdg/src/graph.rs Outdated Show resolved Hide resolved
@ahomescu ahomescu force-pushed the ahomescu/non_null branch from 710b626 to 31c703a Compare July 18, 2024 05:45
Copy link
Collaborator

@spernsteiner spernsteiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable overall. Left a few comments.

analysis/runtime/src/metadata.rs Outdated Show resolved Hide resolved
c2rust-analyze/src/analyze.rs Show resolved Hide resolved
dynamic_instrumentation/src/instrument.rs Outdated Show resolved Hide resolved
pdg/src/builder.rs Outdated Show resolved Hide resolved
@ahomescu ahomescu force-pushed the ahomescu/non_null branch 5 times, most recently from 34ddf68 to b0351bf Compare August 21, 2024 22:51
@ahomescu ahomescu force-pushed the ahomescu/non_null branch 3 times, most recently from 39acd9d to f898e86 Compare August 28, 2024 01:35
Record field and index projections for pointers
by recording a Project event with a base and target
pointer pair.
Add an index value to Project events to differentiate between
different projections with the same pointer offset, e.g.,
(*p).x and (*p).x.a if a is the first field of the structure.

This is implemented as a HashMap<u64, Vec<usize>> where each element
is a unique combination of field projections. The Project key
points to an element in this map. The keys are randomly-generated
64-bit keys with very low probability of collision.
The event assignment-based PDG
construction steps were giving incorrect results
for indirect accesses, so remove them completely.
Track the provenance of pointers with finer granularity
by storing the size of every allocation, local, and constant
inside the corresponding events. With this information, we can
keep track of the boundaries of every object and track whether
a projected pointer falls inside the original allocation.
Keep track of the size of every local for the
new provenance algorithm.
Add a new arg_ptr() method that allows passing
a pointer directly to an event handler.
The handler needs to be generic on the pointee type:
fn foo_handler<T: ?Sized>(ptr: *const T) { ... }

This change allows us to get more information about
the pointees from the event handler body, e.g.,
call std::mem::size_of_val().
Add a new AddrOfSized event and use it to keep
track of the sizes of all constant operands for
the new provenance algorithm.
Handle Offset nodes with a base pointer of 0
where the offset is non-zero, potentially resulting
in a brand new pointer.

One such case occurs in mod_cgi from lighttpd:
    const uintptr_t baseptr = (uintptr_t)env->b->ptr;
    for (i = 0; i < env->oused; ++i)
            envp[i] += baseptr;
Mark each PDG graph with a boolean flag that
represents whether that graph corresponds to the
null pointer or not. The PDG construction algorithm
seems to build one unique graph for all null pointers
in the entire program.
Add one test where a function argument can be either
null or non-null in the recur() function of the
analysis/tests/misc example code.
The dynamic instrumentation code inserts additional
locals for its own instrumentation code.
Remove the NON_NULL permission from all nodes in
the null graph from the PDG.
If the dynamic analysis tells us a pointer is never NULL
(i.e. it never shows up in an is_null graph), then
force the NON_NULL permission on it using updates_forbidden.
This is potentially an unsound transformation because the
pointer may actually be NULL on other inputs not covered
by the sample ones.
@ahomescu ahomescu merged commit beb017f into master Oct 23, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants