analyze: borrowck: cache results of polonius runs on disk #1056

spernsteiner · 2024-01-10T18:39:44Z

The Polonius stage of borrow checking takes a long time to run on certain functions, such as lighttpd's li_MD5Transform. Worse, we often run Polonius multiple times on the same function as the interprocedural analysis iterates to reach a fixpoint. This branch speeds up the analysis by caching Polonius results on disk.

The caching logic is fairly simple: the core Polonius analysis is effectively a pure function from input facts to output facts, so we hash the input facts before each call and check whether a file named after that hash is present in the cache directory. There's no need to factor in any details of the crate, MIR, permissions, etc. If the current Polonius query has the same input facts as a previous query, it will necessarily produce the same output facts, regardless of how those input facts were computed.
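As a rough illustration of the scheme described above, here is a minimal std-only sketch, not the code from this branch. The names `InputFacts`, `OutputFacts`, `cache_path`, and `run_cached` are hypothetical, and the output is stored as plain text rather than in whatever format the real cache uses. It also assumes `DefaultHasher` is acceptable for the example; its algorithm is not guaranteed to stay the same across Rust releases, so a persistent cache would likely want a hash with a stable definition.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the Polonius input facts: any hashable value works.
type InputFacts = Vec<(u32, u32)>;
/// Hypothetical stand-in for the output facts, kept as text for simplicity.
type OutputFacts = String;

/// Name the cache file after the hash of the input facts only.
/// Nothing about the crate, MIR, or permissions enters the key.
fn cache_path(cache_dir: &Path, facts: &InputFacts) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    facts.hash(&mut hasher);
    cache_dir.join(format!("{:016x}.output", hasher.finish()))
}

/// Return the cached output if a file for these facts exists; otherwise run
/// the (pure) analysis and store its result. The `bool` reports a cache hit.
fn run_cached(
    cache_dir: &Path,
    facts: &InputFacts,
    analyze: impl FnOnce(&InputFacts) -> OutputFacts,
) -> io::Result<(OutputFacts, bool)> {
    let path = cache_path(cache_dir, facts);
    if let Ok(cached) = fs::read_to_string(&path) {
        return Ok((cached, true));
    }
    let output = analyze(facts);
    fs::create_dir_all(cache_dir)?;
    fs::write(&path, &output)?;
    Ok((output, false))
}
```

Because the key depends only on the input facts, a second query with identical facts skips the analysis entirely, which is what makes repeated fixpoint iterations cheap.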

Computing the input facts still has a nontrivial cost for some functions, but this branch provides significant speedups on algo_md5 and lighttpd_rust_amalgamated after an initial c2rust-analyze run has populated the cache.

aneksteind · 2024-01-18T19:58:23Z

c2rust-analyze/src/borrowck/mod.rs

+
+    // Tuples only implement `Serialize` up to length 12, so split up this 17-element tuple into
+    // several pieces.  Note this must match the tuple format in `try_load_cached_output`.
+    let raw = (


Why is a tuple needed for the serialization, as opposed to serializing the struct directly?

spernsteiner · 2024-01-18

The struct is defined by the polonius crate and doesn't implement Serialize. Serde has some support for deriving impls for types in "remote crates", but that requires duplicating the struct definition and seems like more trouble than it's worth. Since we don't use the struct as a field of any other Serialize type, we don't need a proper Serialize impl for it, and the tuple trick is sufficient.
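To make the tuple trick concrete without pulling in serde, here is a small self-contained analogue. Everything in it is hypothetical: `Encode` stands in for a serialization trait such as `Serialize`, `remote::Output` stands in for the struct from the polonius crate (with three fields instead of 17), and the tuple impl is deliberately limited to arity 2 to mimic a library-imposed limit.

```rust
/// Hypothetical stand-in for a serialization trait such as serde's `Serialize`.
trait Encode {
    fn encode(&self, out: &mut Vec<u8>);
}

impl Encode for u32 {
    fn encode(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.to_le_bytes());
    }
}

impl<T: Encode> Encode for Vec<T> {
    fn encode(&self, out: &mut Vec<u8>) {
        // Length prefix, then each element in order.
        (self.len() as u32).encode(out);
        for item in self {
            item.encode(out);
        }
    }
}

impl<'a, T: Encode + ?Sized> Encode for &'a T {
    fn encode(&self, out: &mut Vec<u8>) {
        (**self).encode(out);
    }
}

// Tuples are only supported up to a small arity (2 here), mimicking a library limit.
impl<A: Encode, B: Encode> Encode for (A, B) {
    fn encode(&self, out: &mut Vec<u8>) {
        self.0.encode(out);
        self.1.encode(out);
    }
}

/// Pretend this module is another crate: its struct has no `Encode` impl.
mod remote {
    pub struct Output {
        pub errors: Vec<u32>,
        pub loans: Vec<u32>,
        pub moves: Vec<u32>,
    }
}

/// Encode the foreign struct by borrowing its fields into nested tuples.
/// Three fields exceed the arity-2 limit, so they are split into pieces;
/// a matching loader would have to use the same nesting.
fn encode_output(o: &remote::Output) -> Vec<u8> {
    let raw = ((&o.errors, &o.loans), &o.moves);
    let mut buf = Vec::new();
    raw.encode(&mut buf);
    buf
}
```

The struct itself never gains a trait impl; only the temporary tuple of references is encoded, which is enough when the struct is not embedded in other serializable types.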

analyze: borrowck: cache results of polonius runs on disk

4dd444c

spernsteiner requested a review from aneksteind January 10, 2024 18:39

spernsteiner force-pushed the analyze-polonius-cache branch from cc51425 to 4dd444c January 10, 2024 19:11

aneksteind reviewed Jan 18, 2024

aneksteind approved these changes Jan 18, 2024

analyze: borrowck: clarify comments around polonius output serialization

09cb658

spernsteiner merged commit d818e4d into master Jan 19, 2024
