Unbounded proofs of memory safety #146

feliperodri · 2023-05-12T19:10:50Z

Description

We modified existing CBMC proofs to remove any assumptions over the size of the input (a.k.a. bound). We need to instrument all loops with Hoare-style loop invariants and assigns clauses (define the modified memory within a loop), which CBMC uses to prove correctness. We also annotate all loops using decreases clauses, which CBMC uses to prove termination.

We include a new proof for multiSearch function. That allow us to dramatically improve performance in CI. All CBMC proofs in CI now run under 3 minutes using less than 3GB of RAM (before they took up to 40 minutes and 40GB of RAM).

Finally, we clean up all Makefiles.

Test Steps

See proof results.

Checklist:

I have tested my changes. No regression in existing tests.
I have modified and/or added unit-tests to cover the code changes in this Pull Request.

Related Issue

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

source/core_json.c

jimgrundy · 2023-05-26T14:53:36Z

source/core_json.c

@@ -102,6 +120,13 @@ static size_t countHighBits( uint8_t c )
    size_t i = 0;

    while( ( n & 0x80U ) != 0U )
+    assigns( i, n )


I'm not sure what/why we are proving here. It's not memory safety because there are no memory accesses. It could be that we are proving that the loop computes the correct value, but then I'd expect to see a post condition after the loop or a function contract.

It's unclear because both the contract and assertions with more context for this invariant are in the caller function skipUTF8MultiByte. We have a contract for skipUTF8, which calls skipUTF8MultiByte, which finally calls countHighBits. The computational result from countHighBits, if not constrained properly, may lead to spurious arithmetic overflows. As I refactor and clean up these invariants, I'll add more documentation for clarification.

source/core_json.c

jimgrundy · 2023-05-26T15:06:14Z

source/core_json.c

@@ -1070,7 +1135,7 @@ static JSONStatus_t skipCollection( const char * buf,
            case '[':
                depth++;

-                if( depth == JSON_MAX_DEPTH )
+                if( depth >= JSON_MAX_DEPTH )


This, and the other local changes, is interesting. Was the code wrong before, or just harder to prove correct?

Here, it was harder to prove correct. As you can see, depth is part of our assigns clause. Thus, after havocing this variable and performing the inductive step, the code wasn't considering the case where depth could be beyond JSON_MAX_DEPTH (which wouldn't happen if we just iterate over the loop). We might want to change the invariant or make the code more cohesive. In this case, I opt for the latter.

Perhaps I should move all code alterations to a separate commit with proper documentation.

jimgrundy · 2023-05-26T15:11:14Z

source/core_json.c

@@ -1466,6 +1536,9 @@ static bool skipQueryPart( const char * buf,
    while( ( i < max ) &&
           !isSeparator_( buf[ i ] ) &&
           !isSquareOpen_( buf[ i ] ) )
+    assigns( i )
+    loopInvariant( i  <= max )


I was about to ask what we are proving here again since there is no memory access or post-condition ... but I think I get it now. We are just trying to get rid of all bounded analysis, so even though this loop is not interesting from a memory safety point of view we need it abstracted to we can claim we are doing unbounded memory safety. The invariant "true" would have worked equally well for that I guess.

And maybe we needed the invariant to prove the decreases clause?

You're right both times. We just to cross the line of unbounded analysis and we just want to prove termination :)

jimgrundy · 2023-05-26T15:21:39Z

test/cbmc/sources/core_json_contracts.c

@@ -39,7 +39,7 @@
 bool isValidBoundedBuffer( char * buf,
                           size_t max )
 {
-    return ( 0 < max && max < CBMC_MAX_BUFSIZE )
+    return ( 0U < max )


I realize this wasn't your change, but that's a curious use of bit-wise and on the next line. What's that about?

We added a comment on top of the file about it.

/** * When to use `&` vs `&&`: * Prefer `cond1 & cond2` when `cond2` can be evaluated without errors without knowing that `cond1` holds: e.g. `( 0 < i ) & ( i < j )`. * * Use `cond1 && cond2` when `cond1` must be established first to ensure that `cond2` can be evaluated without error: e.g. * `( allocated( p, size ) & ( 0 < i & i < size ) ) && p[i] > 0`. */

It mainly help us avoid existing performance issues during instrumentation when dealing with memory predicates and operands.

So I know this is already all here, and not expecting you to go and change all of these
But I really dislike this coding style. It's rather hard for a human to follow, which is clear since there needs to be a comment on the top of the file explaining it.

Then I feel like a compiler is going to optimize this

return ( start >= old_start ) && ( ( old_start < max ) ? ( start <= max ) : ( start == old_start ) );

To be, for all intents and purposes, the same as

bool retVal = false; if ( ( start >= old start ) && ( old_start < max) ) { retVal = ( start <= max ); } else { retVal = ( start == old_start ); } return ( retVal );

Unfortunately, we use this style because the style you suggested leads to drastic performance drawbacks when CBMC's back-end is solving the mathematical formula generated from the code. We can focus on this PR to include the loop invariants and I can certainly open a GitHub issue so we can properly discuss what would be a more readable C-style for these specifications.

test/cbmc/sources/core_json_contracts.c

Signed-off-by: Felipe R. Monteiro <[email protected]>

In order to turn all bounded CBMC proofs into unbounded proofs, we must make some small changes to coreJSON. Basically, we want to avoid uninitialized variables (CBMC sees them as non-deterministic values) and make sure the code in loops is resilient enough when working with inputs of arbitrary size. The latter is important because we must annotate all loops with loop invariants. The implementation should work together with these annotations in order to prove memory safety for inputs of arbitrary size. Signed-off-by: Felipe R. Monteiro <[email protected]>

Modify existing CBMC proofs to remove any assumptions over the size of the input (a.k.a. bound). Instrument all loops with Hoare-style loop invariants and assigns clauses (define the modified memory within a loop), which CBMC uses to prove correctness. Annotate all loops using decreases clauses, which CBMC uses to prove termination. Signed-off-by: Felipe R. Monteiro <[email protected]>

Skptak · 2023-10-12T17:08:00Z

source/core_json.c

@@ -73,7 +73,7 @@ static void skipSpace( const char * buf,
                       size_t * start,
                       size_t max )
 {
-    size_t i;
+    size_t i = 0U;


I'm fairly certain you don't need to mark 0 as 0U?
I believe it's unsigned by default in C

Misra 2012 requires U. signed is default.

Literal 0 is an integer constant of type "signed int" (ISO:C90 6.1.3.2 - Semantics).

Signed-off-by: Felipe R. Monteiro <[email protected]>

feliperodri · 2023-10-14T04:41:42Z

@n9wxu could I get one more review?

test/cbmc/include/core_json_contracts.h

feliperodri force-pushed the unbounded-cbmc-proofs branch 3 times, most recently from 9a63610 to 2b0e454 Compare May 12, 2023 19:31

jimgrundy approved these changes May 26, 2023

View reviewed changes

Add a CBMC proof harness for multiSearch function

50bbaed

Signed-off-by: Felipe R. Monteiro <[email protected]>

feliperodri force-pushed the unbounded-cbmc-proofs branch 2 times, most recently from af58865 to 2b71ebe Compare October 12, 2023 16:12

feliperodri added 3 commits October 12, 2023 16:56

Make all CBMC proofs unbounded

e504a6b

Signed-off-by: Felipe R. Monteiro <[email protected]>

feliperodri force-pushed the unbounded-cbmc-proofs branch from 2b71ebe to 69b6d11 Compare October 12, 2023 16:56

feliperodri marked this pull request as ready for review October 12, 2023 16:56

Skptak reviewed Oct 12, 2023

View reviewed changes

feliperodri requested a review from Skptak October 12, 2023 21:42

n9wxu previously approved these changes Oct 12, 2023

View reviewed changes

feliperodri dismissed n9wxu’s stale review via daa79b9 October 13, 2023 14:54

feliperodri force-pushed the unbounded-cbmc-proofs branch from 79b2953 to daa79b9 Compare October 13, 2023 14:54

n9wxu previously approved these changes Oct 13, 2023

View reviewed changes

feliperodri dismissed n9wxu’s stale review via 0f43e5c October 13, 2023 16:43

feliperodri force-pushed the unbounded-cbmc-proofs branch 2 times, most recently from 0f43e5c to c7f1b4f Compare October 13, 2023 17:21

Add patch for loop invariants in CI

52918f1

Signed-off-by: Felipe R. Monteiro <[email protected]>

feliperodri force-pushed the unbounded-cbmc-proofs branch from c7f1b4f to 52918f1 Compare October 13, 2023 17:33

AniruddhaKanhere approved these changes Oct 13, 2023

View reviewed changes

feliperodri requested a review from n9wxu October 16, 2023 17:59

archigup reviewed Oct 17, 2023

View reviewed changes

test/cbmc/include/core_json_contracts.h Show resolved Hide resolved

archigup approved these changes Oct 17, 2023

View reviewed changes

AniruddhaKanhere merged commit 8c7ba86 into FreeRTOS:main Oct 17, 2023
11 checks passed

feliperodri deleted the unbounded-cbmc-proofs branch October 18, 2023 15:32

kar-rahul-aws mentioned this pull request May 13, 2024

Remove changes to achieve 100% coverage #161

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unbounded proofs of memory safety #146

Unbounded proofs of memory safety #146

feliperodri commented May 12, 2023 •

edited

Loading

jimgrundy May 26, 2023

feliperodri May 31, 2023

jimgrundy May 26, 2023

feliperodri May 31, 2023

feliperodri May 31, 2023

jimgrundy May 26, 2023

jimgrundy May 26, 2023

feliperodri May 31, 2023

jimgrundy May 26, 2023

feliperodri May 31, 2023

Skptak Oct 12, 2023 •

edited

Loading

feliperodri Oct 12, 2023

Skptak Oct 12, 2023

archigup Oct 12, 2023

feliperodri Oct 12, 2023

feliperodri commented Oct 14, 2023

Unbounded proofs of memory safety #146

Unbounded proofs of memory safety #146

Conversation

feliperodri commented May 12, 2023 • edited Loading

Description

Test Steps

Checklist:

Related Issue

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Skptak Oct 12, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

feliperodri commented Oct 14, 2023

feliperodri commented May 12, 2023 •

edited

Loading

Skptak Oct 12, 2023 •

edited

Loading