-
Notifications
You must be signed in to change notification settings - Fork 21
Function contracts
- Writing function contracts
- Checking function contracts
- Using function contracts
- Warning: Check before use!
A function contract is a function specification. It consists of
- a precondition that must be true when the function is called,
- a postcondition that will be true when the function returns, and
- a write set giving the set of all memory locations that the function might change (nothing outside the write set will be changed).
CBMC can check that a function satisfies its function contract by proving that for all memory states and for all function arguments, if the memory and arguments satisfy the precondition when the function is invoked, then when the function returns, only memory locations in the write set will have been changed, and the memory and return arguments will satisfy the postcondition.
CBMC can replace a function with its function contract by checking that the memory and function arguments satisfy the precondition when the function is invoked, and then replacing the values of memory locations in the write set with arbitrary values satisfying the postcondition.
The value of function contracts is proof abstraction:
- API specifications: If we can write function contracts for every function in an interface, then we can ignore the details of the interface implementation when we validate code using the interface. The proof becomes valid for any implementation of the API, and what might otherwise be a large and complicated problem for CBMC to solve may become much smaller and more tractible.
- Proof scaling: Applying CBMC directly to a code base can be intractible. A function with a complicated call graph simply may not fit within the capacity of CBMC on any computer of reasonable size. But suppose we think of the function call graph as a computation tree, and suppose we can find a small set of functions such that if we prune the computation tree at the invocations of these functions, then we get a collection of subtrees of reasonable size. If we can write function contracts for these functions, and if we can prove that these functions satisfy their contracts, then we can use these function contracts in the proofs of other function contracts for other functions that depend on them. We can essentially replace a subtree with the function contract for the function at the root of the subtree. In this way, we can use function contracts to scale CBMC proof to code bases larger than CBMC could otherwise validate.
We write a function contract by annotating the function definition with
- requires clauses that describe the precondition,
- ensures clauses that describe the postcondition, and
- assigns clauses that describe the write set.
Recall our string buffer example strbuf2.c from the preceding section with function
typedef struct { size_t length; char* buffer; } strbuf;
char cache;
void cache_fifth_char(strbuf* str) { cache = str->buffer[4]; }
and proof harness
main() {
size_t len;
strbuf* str = strbuf_allocate(len);
__CPROVER_assume(strbuf_is_valid(str));
__CPROVER_assume(str->length > 4);
cache_fifth_char(str);
__CPROVER_assert(strbuf_is_valid(str), "String buffer remains valid");
__CPROVER_assert(str->length > 4, "String buffer remains length >4");
}
This proof harness is --- as explicitly as possible --- describing the
preconditions and postconditions for the function. The assume clauses
describe the preconditions and the assert clauses describe the postconditions.
One thing not described here is the fact that the function modifies the
value of cache
.
Now let's write the function contract for this function. All we have to do is annotate the function definition with requires, ensures, and assigns clauses.
void cache_fifth_char(strbuf* str)
__CPROVER_requires(strbuf_is_valid(str))
__CPROVER_requires(str->length > 4)
__CPROVER_ensures(strbuf_is_valid(str))
__CPROVER_ensures(str->length > 4)
__CPROVER_assigns(cache)
{
cache = str->buffer[4];
}
Having done this, we can now simplify the proof harness to doing nothing but allocating data
main() {
size_t len;
strbuf* str = strbuf_allocate(len);
cache_fifth_char(str);
}
You can see the full example in contract1.c.
Notice by the way, the strength of the function contract. In particular,
the assigns clause says that cache
may be modified, but the assigns
clause also says that only cache
is modified.
In particular, left unchanged is
- the pointer
str
, - the length
str->length
, - the pointer
str->buffer
, and - the contents of the buffer pointed to by
str->buffer
.
This means that our ensures clauses are actually unnecessary, because they follow immediately from the preconditions and the assigns clause. Let's leave them alone for now and treat them as valuable documentation.
But let's modify this function just a bit to illustrate things we
can say with function contracts that are harder to say
(or at least clumsier) with a proof harness alone.
Let's modify the function cache_fifth_char
to increment a global counter
and return the character written to the cache.
char cache;
int counter;
char cache_fifth_char(strbuf* str)
{
counter = counter + 1;
cache = str->buffer[4];
return str->buffer[4];
}
Now we can add to the function contract the lines
__CPROVER_ensures(cache == __CPROVER_return_value)
__CPROVER_ensures(counter == __CPROVER_old(counter) + 1)
__CPROVER_assigns(counter)
The first line illustrates how the symbol __CPROVER_return_value
can be used to refer to the function return value in the postcondition
of a function contract. It says that the return value and the cached
value are the same character.
The second line illustrates how the symbol __CPROVER_old(v)
applied
to a variable v
can be used to refer to the value of v
before
function invocation. It says that the value of counter
after function
return is one greater than the value of counter
before function
invocation.
The third line says that the function now modifies counter
in additon
to cache
.
You can see the full example in contract2.c
To prove that cache_fifth_char
satisfies its function contract,
we run CBMC as before,
but first we have to instrument the program to do the contract checking.
We use the goto-cc
compiler to produce the goto program as usual,
but now we use goto-instrument --enforce-contract cache_fifth_char
to instrument the program with contract checking.
goto-cc -o contract2.goto contract2.c
goto-instrument --enforce-contract cache_fifth_char contract2.goto proof2.goto
cbmc --pointer-check --bounds-check --malloc-may-fail --malloc-fail-null proof2.goto
We have proved that cache_fifth_char
satisfies its function contract.
Now let's use our function contract to verify another function that calls it. (Drum roll, please.)
Let's forget about old proof harnesses and let's rewrite main to
use cache_fifth_char
in some way.
#define LEN 5
main() {
strbuf* str = strbuf_allocate(LEN+1);
if (!str || !str->buffer) return;
strcpy(str->buffer, "abcde");
counter = 0;
cache = 'a';
cache_fifth_char(str);
assert(counter == 1);
assert(cache == 'e');
}
The function fills the buffer with a string whose fifth character is 'e' and
sets counter
and cache
to initial values.
Then it invokes cache_fifth_char
and checks that counter
has been incremented
and cache
has been set to the fifth character 'e'.
We use function contracts just as we checked function contracts, except that
this time we call goto-instrument
with --replace-call-with-contract
to
instrument the program to use the function contact in place of the function call.
We run the following, and...
goto-cc -o contract3.goto contract3.c
goto-instrument --replace-call-with-contract cache_fifth_char contract3.goto proof3.goto
cbmc --pointer-check --bounds-check --malloc-may-fail --malloc-fail-null proof3.goto
It fails! What? The assertion cache == 'e'
is failing! How is this possible?
The function contract ensures that cache == __CPROVER_return_value
, and it is obvious
from the code that the return value is the fifth character which is 'e'.
But the function contract doesn't actually say that the return value is the
fifth character. And, in fact, if we run CBMC with --trace
to produce the error
trace demonstrating the assertion failure, we see that cache
is being set to 'd'
and the return value is 'd', and all of that is perfectly consistent with the
function contract that we actually wrote.
Let's fix this by adding the missing postcondition
__CPROVER_ensures(cache == str->buffer[4])
to produce contract3a.c. Now running
goto-cc -o contract3a.goto contract3a.c
goto-instrument --replace-call-with-contract cache_fifth_char contract3a.goto proof3a.goto
cbmc --pointer-check --bounds-check --malloc-may-fail --malloc-fail-null proof3a.goto
successfully validates the code, and we are done.
Except...
We have just made a horrible mistake in our conclusion that we have fully validated our code: We never went back to check that function satisfied the new function contract.
We first proved that cache_fifth_char
satisfied the original function without the missing postcondition
cache == str->buffer[4]
using the proof harness contract2.c.
We then tried to validate our code using our original function contract contract3.c and failed,
so we created a new function contract by adding the missing postcondition.
We then successfully validated our code using the new function contract contract3a.c.
But we never proved that that cache_fifth_char
satisfies the new function contract.
If we go back to our origintal proof harness for the function contract contract2.c
and add the missing postcondition contract2a.c, we can prove that
cache_fifth_char
satisfies the new function contract with
goto-cc -o contract2a.goto contract2a.c
goto-instrument --enforce-contract cache_fifth_char contract2a.goto proof2a.goto
cbmc --pointer-check --bounds-check --malloc-may-fail --malloc-fail-null proof2a.goto
Now we really are done, and we have learned an important lesson:
- It is essential to check a function contract before using it in another proof.
It is fine to work out the correct function contract by using it in different contexts to learn what is required of a function contract, but in the end, the final version of the function contract needs to be checked before it is used. To forget to do this is a serious mistake.