Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unbounded proofs of memory safety #146

Merged
merged 5 commits into from
Oct 17, 2023

Conversation

feliperodri
Copy link
Contributor

@feliperodri feliperodri commented May 12, 2023

Description

We modified existing CBMC proofs to remove any assumptions over the size of the input (a.k.a. bound). We need to instrument all loops with Hoare-style loop invariants and assigns clauses (define the modified memory within a loop), which CBMC uses to prove correctness. We also annotate all loops using decreases clauses, which CBMC uses to prove termination.

We include a new proof for multiSearch function. That allow us to dramatically improve performance in CI. All CBMC proofs in CI now run under 3 minutes using less than 3GB of RAM (before they took up to 40 minutes and 40GB of RAM).

Finally, we clean up all Makefiles.

Test Steps

See proof results.

Checklist:

  • I have tested my changes. No regression in existing tests.
  • I have modified and/or added unit-tests to cover the code changes in this Pull Request.

Related Issue

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@feliperodri feliperodri force-pushed the unbounded-cbmc-proofs branch 3 times, most recently from 9a63610 to 2b0e454 Compare May 12, 2023 19:31
source/core_json.c Outdated Show resolved Hide resolved
@@ -102,6 +120,13 @@ static size_t countHighBits( uint8_t c )
size_t i = 0;

while( ( n & 0x80U ) != 0U )
assigns( i, n )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what/why we are proving here. It's not memory safety because there are no memory accesses. It could be that we are proving that the loop computes the correct value, but then I'd expect to see a post condition after the loop or a function contract.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear because both the contract and assertions with more context for this invariant are in the caller function skipUTF8MultiByte. We have a contract for skipUTF8, which calls skipUTF8MultiByte, which finally calls countHighBits. The computational result from countHighBits, if not constrained properly, may lead to spurious arithmetic overflows. As I refactor and clean up these invariants, I'll add more documentation for clarification.

source/core_json.c Outdated Show resolved Hide resolved
@@ -1070,7 +1135,7 @@ static JSONStatus_t skipCollection( const char * buf,
case '[':
depth++;

if( depth == JSON_MAX_DEPTH )
if( depth >= JSON_MAX_DEPTH )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, and the other local changes, is interesting. Was the code wrong before, or just harder to prove correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, it was harder to prove correct. As you can see, depth is part of our assigns clause. Thus, after havocing this variable and performing the inductive step, the code wasn't considering the case where depth could be beyond JSON_MAX_DEPTH (which wouldn't happen if we just iterate over the loop). We might want to change the invariant or make the code more cohesive. In this case, I opt for the latter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I should move all code alterations to a separate commit with proper documentation.

@@ -1466,6 +1536,9 @@ static bool skipQueryPart( const char * buf,
while( ( i < max ) &&
!isSeparator_( buf[ i ] ) &&
!isSquareOpen_( buf[ i ] ) )
assigns( i )
loopInvariant( i <= max )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to ask what we are proving here again since there is no memory access or post-condition ... but I think I get it now. We are just trying to get rid of all bounded analysis, so even though this loop is not interesting from a memory safety point of view we need it abstracted to we can claim we are doing unbounded memory safety. The invariant "true" would have worked equally well for that I guess.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And maybe we needed the invariant to prove the decreases clause?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right both times. We just to cross the line of unbounded analysis and we just want to prove termination :)

@@ -39,7 +39,7 @@
bool isValidBoundedBuffer( char * buf,
size_t max )
{
return ( 0 < max && max < CBMC_MAX_BUFSIZE )
return ( 0U < max )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this wasn't your change, but that's a curious use of bit-wise and on the next line. What's that about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We added a comment on top of the file about it.

/**
 * When to use `&` vs `&&`:
 * Prefer `cond1 & cond2` when `cond2` can be evaluated without errors without knowing that `cond1` holds: e.g.  `( 0 < i ) & ( i < j )`.
 *
 * Use `cond1 && cond2` when `cond1` must be established first to ensure that `cond2` can be evaluated without error: e.g.
 * `( allocated( p, size ) & ( 0 < i & i < size ) ) && p[i] > 0`.
 */

It mainly help us avoid existing performance issues during instrumentation when dealing with memory predicates and operands.

Copy link
Member

@Skptak Skptak Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I know this is already all here, and not expecting you to go and change all of these
But I really dislike this coding style. It's rather hard for a human to follow, which is clear since there needs to be a comment on the top of the file explaining it.

Then I feel like a compiler is going to optimize this

    return ( start >= old_start ) &&
           ( ( old_start < max ) ? ( start <= max ) : ( start == old_start ) );

To be, for all intents and purposes, the same as

bool retVal = false;
if ( ( start >= old start ) && ( old_start < max) )
{
    retVal = ( start <= max );
}
else
{
    retVal = ( start == old_start );
}
return ( retVal );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, we use this style because the style you suggested leads to drastic performance drawbacks when CBMC's back-end is solving the mathematical formula generated from the code. We can focus on this PR to include the loop invariants and I can certainly open a GitHub issue so we can properly discuss what would be a more readable C-style for these specifications.

test/cbmc/sources/core_json_contracts.c Show resolved Hide resolved
test/cbmc/sources/core_json_contracts.c Show resolved Hide resolved
test/cbmc/sources/core_json_contracts.c Outdated Show resolved Hide resolved
test/cbmc/sources/core_json_contracts.c Show resolved Hide resolved
@feliperodri feliperodri force-pushed the unbounded-cbmc-proofs branch 2 times, most recently from af58865 to 2b71ebe Compare October 12, 2023 16:12
Signed-off-by: Felipe R. Monteiro <[email protected]>
In order to turn all bounded CBMC proofs into unbounded proofs,
we must make some small changes to coreJSON. Basically, we want
to avoid uninitialized variables (CBMC sees them as non-deterministic
values) and make sure the code in loops is resilient enough when
working with inputs of arbitrary size. The latter is important
because we must annotate all loops with loop invariants.
The implementation should work together with these annotations
in order to prove memory safety for inputs of arbitrary size.

Signed-off-by: Felipe R. Monteiro <[email protected]>
Modify existing CBMC proofs to remove any assumptions over the size
of the input (a.k.a. bound). Instrument all loops with Hoare-style
loop invariants and assigns clauses (define the modified memory
within a loop), which CBMC uses to prove correctness. Annotate all
loops using decreases clauses, which CBMC uses to prove termination.

Signed-off-by: Felipe R. Monteiro <[email protected]>
@feliperodri feliperodri marked this pull request as ready for review October 12, 2023 16:56
@@ -73,7 +73,7 @@ static void skipSpace( const char * buf,
size_t * start,
size_t max )
{
size_t i;
size_t i = 0U;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly certain you don't need to mark 0 as 0U?
I believe it's unsigned by default in C

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misra 2012 requires U. signed is default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Literal 0 is an integer constant of type "signed int" (ISO:C90 6.1.3.2 - Semantics).

n9wxu
n9wxu previously approved these changes Oct 12, 2023
n9wxu
n9wxu previously approved these changes Oct 13, 2023
Signed-off-by: Felipe R. Monteiro <[email protected]>
@feliperodri
Copy link
Contributor Author

@n9wxu could I get one more review?

@AniruddhaKanhere AniruddhaKanhere merged commit 8c7ba86 into FreeRTOS:main Oct 17, 2023
11 checks passed
@feliperodri feliperodri deleted the unbounded-cbmc-proofs branch October 18, 2023 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants