Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory Region and VSA bug fixes #270

Merged
merged 25 commits into from
Jan 16, 2025
Merged

Memory Region and VSA bug fixes #270

merged 25 commits into from
Jan 16, 2025

Conversation

Comment on lines 590 to 597
modified = transforms.VSAIndirectCallResolution(
ctx.program,
result.vsaResult,
result.mmmResults
).resolveIndirectCalls()
if analysisResult.size < 2 then
modified = true
else
modified = transforms.VSAIndirectCallResolution(
ctx.program,
result.vsaResult,
result.mmmResults
).resolveIndirectCalls()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for this, why do we not try to resolve indirect calls until at least two iterations have been completed?

while (modified || (analysisResult.size < 2 && config.memoryRegions)) {
while (modified || (analysisResult.size < 2)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to run the analyses multiple times unless it's necessary for the memory regions

@l-kent
Copy link
Contributor

l-kent commented Nov 12, 2024

This fixes #266 and #264, which is great, but doesn't fix #265 since the heap regions are still not showing up in the output anywhere.

I'm not sure why you say this addresses #100 since that's not really related to this and was fixed a long time ago, do you mean #263? With these changes, that is still an issue for indirect_calls/functionpointer/clang:BAP and indirect_calls/functionpointer/clang_pic:BAP (it seems to vary a bit which variations are affected which is why I think it's potentially some sort of non-deterministic iteration-order issue).

For instance, for indirect_calls/functionpointer/clang:BAP I get the following output:

    R29, Gamma_R29 := bvadd64(R31, 32bv64), Gamma_R31;
    stack_13, Gamma_stack_13 := memory_store32_le(stack_13, bvadd64(R29, 18446744073709551612bv64), 0bv32), gamma_store32(Gamma_stack_13, bvadd64(R29, 18446744073709551612bv64), true);
    assume {:captureState "%000003ee"} true;
    stack_7, Gamma_stack_7 := memory_store32_le(stack_7, bvadd64(R29, 18446744073709551608bv64), R0[32:0]), gamma_store32(Gamma_stack_7, bvadd64(R29, 18446744073709551608bv64), Gamma_R0);
    assume {:captureState "%000003f6"} true;
    R8, Gamma_R8 := zero_extend32_32(memory_load32_le(stack_18, bvadd64(R29, 18446744073709551608bv64))), gamma_load32(Gamma_stack_18, bvadd64(R29, 18446744073709551608bv64));

You can see the last two accesses clearly have the same index but they are incorrectly given different stack regions.

There is still an issue issue with the VSA (same as the one here which has only been partially fixed: #264 (comment)) for the following cases:

  • correct/basic_lock_read/gcc_pic:BAP
  • correct/initialisation/clang_pic:BAP
  • correct/initialisation/gcc_pic:BAP

For correct/basic_lock_read/gcc_pic:BAP, the relevant parts of the .bir are below.

000002fe: R0 := 0x10000
00000305: R0 := mem[R0 + 0xFD8, el]:u64
0000030c: R0 := pad:64[mem[R0, el]:u32]
00000312: #4 := 31:0[R0] - 1

The address 0x10FD8 points to the address of z (0x11018), so after %00000305, R0 == 0x11018 and after %0000030c R0 is equal to the value stored at z. The issue is on the second iteration of the analyses( which uses the VSA results) at %00000312, the VSA incorrectly has that R0 == 0x11018 which is not correct.

} else {
// this is a constant but we need to check if it is a data region
checkIfDefined(evalMemLoadToGlobal(assign.rhs, 1, assign), n)
return checkIfDefined(tryCoerceIntoData(assign.rhs, assign, 1), n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this use an access size of 1? This seems to produce a lot of regions with size 1 in the GRA results that are not actually real regions.

@yousifpatti
Copy link
Contributor Author

This is a Draft PR. I am working on pushing fixes for the other two issues. The #100 was tagged by mistake

@l-kent
Copy link
Contributor

l-kent commented Nov 12, 2024

All good, it's very close to being all fixed!

@yousifpatti
Copy link
Contributor Author

yousifpatti commented Nov 22, 2024

@l-kent I think the latest commits should address most of the issues. I will work on merging main into this branch

@l-kent
Copy link
Contributor

l-kent commented Nov 22, 2024

I can handle merging it, I'll look at that now.

@l-kent
Copy link
Contributor

l-kent commented Nov 22, 2024

This most recent set of changes has introduced a lot of regressions, so it's not suitable for merging.

@yousifpatti
Copy link
Contributor Author

Unit tests?

@l-kent
Copy link
Contributor

l-kent commented Nov 22, 2024

You can run MemoryRegionSystemTestsBAP and MemoryRegionSystemTestsGTIRB to see. Before yesterday's changes, only these tests were failing in MemoryRegionSystemTestsBAP, since you had fixed most of the outstanding issues by early last week:

  • correct/basic_lock_read/gcc_pic:BAP
  • correct/initialisation/clang_pic:BAP
  • correct/initialisation/gcc_pic:BAP

With the most recent changes, now all of the following tests fail:

  • correct/basic_arrays_write/gcc:BAP
  • correct/basic_arrays_write/gcc_pic:BAP
  • correct/basic_function_call_reader/clang_pic:BAP
  • correct/basic_lock_read/gcc:BAP
  • correct/basic_lock_read/gcc_pic:BAP
  • correct/basic_lock_security_read/gcc_pic:BAP
  • correct/basic_lock_security_write/clang:BAP
  • correct/basic_lock_security_write/clang_pic:BAP
  • correct/basic_lock_security_write/gcc:BAP
  • correct/basic_lock_security_write/gcc_pic:BAP
  • correct/basic_sec_policy_write/clang:BAP
  • correct/basic_sec_policy_write/clang_pic:BAP
  • correct/basic_sec_policy_write/gcc:BAP
  • correct/basic_sec_policy_write/gcc_pic:BAP
  • correct/ifbranches/gcc:BAP
  • correct/initialisation/clang_pic:BAP
  • correct/initialisation/gcc_pic:BAP
  • correct/using_gamma_conditional/clang:BAP
  • correct/using_gamma_conditional/clang_pic:BAP

I'll have a look at why these are failing.

}
)
}
print("Done")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't include print calls like this

@l-kent
Copy link
Contributor

l-kent commented Nov 22, 2024

tryCoerceIntoData evaluates R8 at %00000371 incorrectly because that is the result the VSA gives.

@l-kent
Copy link
Contributor

l-kent commented Nov 22, 2024

The heap identification doesn't seem to be working in all cases yet, but it's a start at least.

For malloc_with_local/clang:BAP, these are the relevant allocations and heap accesses in the .bir file:

0000038f: call @malloc with return %00000391

000003a3: call @malloc with return %000003a5

000003cc: mem := mem with [R9] <- 7:0[R8] // heap access to location allocated at malloc 0000038f

000003e0: mem := mem with [R9, el]:u32 <- 31:0[R8] // heap access to location allocated at malloc 000003a3

000003ee: R1 := pad:64[mem[R8]] // heap access to location allocated at malloc 0000038f

0000040f: R1 := pad:64[mem[R8, el]:u32] // heap access to location allocated at malloc 000003a3

Currently, %000003cc and %000003e0 are correctly identified as heap accesses (to separate heap locations), but %000003ee and %0000040f are incorrectly identified as stack accesses.

@yousifpatti
Copy link
Contributor Author

For correct/initialisation/clang:BAP, the main problem happens when that line is called for the line %00000371. The relevant parts of the .bir are below:

00000305: R9 := 0x10000
0000030c: R9 := mem[R9 + 0xFC8, el]:u64

0000036b: R8 := pad:64[mem[R9, el]:u32]
00000371: R8 := pad:64[31:0[R8] + 3]

0x10FC8 contains the value 0x11040, which is the address of the global variable a. At %00000371, R8 is equal to the contents of a, but tryCoerceIntoData evaluates it as 0x11040 instead, so the GRA incorrectly creates a new, non-existent region of size 1 with address 0x11043 (69699 in decimal) which overrides the correct region with 0x11040 as they overlap.

Hi @l-kent I am exploring correct/initialisation/clang_pic:BAP, can you please give me some clues as to why this example does not verify even after the latest changes that do not create that non-existent region?

@l-kent
Copy link
Contributor

l-kent commented Nov 25, 2024

There is still an inconsistency between the GRA result and the MMM dataMap - the dataMap now has no region with address 69696 (there should be one of size 4) at all, but the GRA result has a region with address 69696 and size 8.

@l-kent
Copy link
Contributor

l-kent commented Nov 25, 2024

I haven't figured out why that region is missing from the MMM dataMap yet. The size issue in the GRA result is probably related to something being loaded from the symbol table data somewhere. The GRA result is also inconsistent as it (correctly) has a region with address 69700 with size 4, which overlaps the region at address 69696 with (incorrect) size 8.

@l-kent
Copy link
Contributor

l-kent commented Nov 25, 2024

All the other issues seem to be solved though, which is great.

@yousifpatti
Copy link
Contributor Author

There is still an inconsistency between the GRA result and the MMM dataMap - the dataMap now has no region with address 69696 (there should be one of size 4) at all, but the GRA result has a region with address 69696 and size 8.

In globals list we have
69696 -> a
and global sizes
a -> 64

which means region 69696 should be of size 8 not 4?

@l-kent
Copy link
Contributor

l-kent commented Dec 2, 2024

a is an array containing two 32-bit values. We want our regions to be based on the accesses that actually happen in the program.

@l-kent
Copy link
Contributor

l-kent commented Dec 4, 2024

I tried to merge this in but I can't merge this in without breaking things so I will leave it for another time.

@yousifpatti
Copy link
Contributor Author

That is okay. This branch doesn't have the change of MemoryLoad as a statement so it is a fair bit of changes. Thanks for trying I will help merge after the demo

l-kent added 2 commits December 5, 2024 16:40
# Conflicts:
#	src/main/scala/analysis/ANR.scala
#	src/main/scala/analysis/Analysis.scala
#	src/main/scala/analysis/GlobalRegionAnalysis.scala
#	src/main/scala/analysis/InterprocSteensgaardAnalysis.scala
#	src/main/scala/analysis/MemoryRegionAnalysis.scala
#	src/main/scala/analysis/RNA.scala
#	src/main/scala/analysis/UtilMethods.scala
#	src/main/scala/analysis/VSA.scala
#	src/main/scala/util/RunUtils.scala
#	src/test/scala/SystemTests.scala
@l-kent
Copy link
Contributor

l-kent commented Dec 5, 2024

I've merged main into this branch and fixed one problem that was my fault but there are still further issues. I've made it throw an exception when the RegionInjector can't find a region in the MMM that should be there.

…in GRA results, throw exception if MMM has incorrect size
case assign: LocalAssign =>
// this is a constant but we need to check if it is a data region
checkIfDefined(evalMemLoadToGlobal(assign.rhs, 1, assign), n)
checkIfDefined(tryCoerceIntoData(assign.rhs, assign, 1, noLoad = true), n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current issues seem to be in significant part due to this line. Creating regions with a size of 1 here creates a lot of regions with incorrect sizes that cause problems.

@l-kent
Copy link
Contributor

l-kent commented Dec 6, 2024

These are what currently fail:

  • correct/basic_arrays_write/clang:BAP *** FAILED *** MMM returned region for 69684 with size 1 bytes which does not match requested size 4 bytes
  • correct/basic_arrays_write/clang_O2:BAP *** FAILED *** failed to find region with address 69684 of size 4 bytes
  • correct/basic_arrays_write/gcc:BAP *** FAILED *** MMM returned region for 69656 with size 1 bytes which does not match requested size 4 bytes
  • correct/basic_arrays_write/gcc_O2:BAP *** FAILED *** failed to find region with address 69656 of size 4 bytes
  • correct/no_interference_update_x/clang:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/no_interference_update_x/clang_pic:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/no_interference_update_x/gcc:BAP *** FAILED *** failed to find region with address 69656 of size 4 bytes
  • correct/no_interference_update_x/gcc_O2:BAP *** FAILED *** failed to find region with address 69656 of size 4 bytes
  • correct/no_interference_update_x/gcc_pic:BAP *** FAILED *** failed to find region with address 69656 of size 4 bytes
  • correct/no_interference_update_y/clang:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/no_interference_update_y/clang_pic:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/no_interference_update_y/gcc:BAP *** FAILED *** failed to find region with address 69652 of size 4 bytes
  • correct/no_interference_update_y/gcc_O2:BAP *** FAILED *** failed to find region with address 69656 of size 4 bytes
  • correct/no_interference_update_y/gcc_pic:BAP *** FAILED *** failed to find region with address 69652 of size 4 bytes
  • correct/secret_write/clang_O2:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/secret_write/gcc_O2:BAP *** FAILED *** failed to find region with address 69660 of size 4 bytes
  • correct/using_gamma_conditional/clang:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/using_gamma_conditional/clang_O2:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/using_gamma_conditional/clang_pic:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/using_gamma_conditional/gcc:BAP *** FAILED *** failed to find region with address 69656 of size 4 bytes
  • correct/using_gamma_conditional/gcc_pic:BAP *** FAILED *** failed to find region with address 69656 of size 4 bytes
  • correct/using_gamma_write_z/clang:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/using_gamma_write_z/clang_pic:BAP *** FAILED *** failed to find region with address 69688 of size 4 bytes
  • correct/using_gamma_write_z/gcc:BAP *** FAILED *** failed to find region with address 69652 of size 4 bytes
  • correct/using_gamma_write_z/gcc_O2:BAP *** FAILED *** failed to find region with address 69656 of size 4 bytes
  • correct/using_gamma_write_z/gcc_pic:BAP *** FAILED *** failed to find region with address 69652 of size 4 bytes

The issue seems to mostly be with regions loaded from the symbol table being incorrectly deleted or overwritten.

@l-kent
Copy link
Contributor

l-kent commented Dec 6, 2024

I don't really understand why this is only happening now, but convertMemoryRegions in MemoryModelMap seems to be causing the other problems - if a region loaded from the symbol table is not in the GRA result, it is completely removed. We want to keep non-conflicting regions from the symbol table around, and may just need to adjust the size of ones that do conflict?

@l-kent
Copy link
Contributor

l-kent commented Dec 6, 2024

There are also some issues with stack & heap identification as seen in MemoryRegionTestsMRA but we don't need to fix those right away.

@l-kent
Copy link
Contributor

l-kent commented Dec 20, 2024

This still fails for the following two cases:

  • correct/basic_arrays_write/clang:BAP *** FAILED *** MMM returned region for 69684 with size 1 bytes which does not match requested size 4 bytes
  • correct/basic_arrays_write/gcc:BAP *** FAILED *** MMM returned region for 69656 with size 1 bytes which does not match requested size 4 bytes

The other bug which was causing most of the failures has been fixed though, great!

@yousifpatti
Copy link
Contributor Author

Thanks for checking Liam. I have got it to pass everything except initialisation/clang_O2 which I will chase

Command: List(boogie, /printVerifiedProceduresCount:0, /timeLimit:10, /useArrayAxioms, ./src/test/correct/initialisation/clang_O2/initialisation_bap.bpl)
./src/test/correct/initialisation/clang_O2/initialisation_bap.bpl(200,5): Error: a postcondition could not be proved on this return path
./src/test/correct/initialisation/clang_O2/initialisation_bap.bpl(145,3): Related location: this is the postcondition that could not be proved
Execution trace:
./src/test/correct/initialisation/clang_O2/initialisation_bap.bpl(157,3): lmain
./src/test/correct/initialisation/clang_O2/initialisation_bap.bpl(200,5): Error: a postcondition could not be proved on this return path
./src/test/correct/initialisation/clang_O2/initialisation_bap.bpl(144,3): Related location: this is the postcondition that could not be proved
Execution trace:
./src/test/correct/initialisation/clang_O2/initialisation_bap.bpl(157,3): lmain

@l-kent l-kent merged commit 730a06c into main Jan 16, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants