Underapproximate Reasoning #288
Replies: 1 comment 1 reply
-
Yes, there are some hidden dragons. First off, not having any false positives would restrict the number of recognizable true positives too much: The used method of computation (fixpoints over the control flow graph) is not path-sensitive. Bugs that occur on every execution path are too trivial to be of interest, these bugs are very rare in real-world software. So we would need to implement selective path-sensitivity in our analysis (which is possible) and make it so exact that it can guarantee that a particular bug can be triggered. This gets complicated quite fast and you tend to wish to have started with symbolic execution instead, because there the path-sensitivity is a given. The second reason is more a matter of defining false positives. We are always missing certain knowledge about the execution environment of a program. For example, the execution environment may guarantee that the program is only called with a small number of possible inputs. But our analysis cannot know the input set, since it is not encoded in the binary itself. So a bug that can only be triggered by an input that is not possible in the execution environment is a false positive, although it would be a true positive in another execution environment. In practice there are a lot of small assumptions about the execution environment that are sometimes true and sometimes false. So even if our analysis would be theoretically false-positive-free for some set of assumptions about the execution environment, it is still not false-positive-free for some other set of assumptions. In practice you tend to just accept false positives for this type of approach and use other analysis methods (like symbolic execution) to prune false positives later on if you really need to guarantee that no false positives are contained in your analysis results. |
Beta Was this translation helpful? Give feedback.
-
Hi there!
A speculative question born out of curiosity rather than any practical need! I've been looking at some of Peter O'Hearn's work on incorrectness logic and under-approximate reasoning.
cwe_checker
explicitly comes with warnings that both false positives and false negatives may be present and it seems the focus in on a high performing analysis as a quick and easy step early in the VR process. However I was wondering how complex it would be to implement an under-approximate (and thus false positive free) variant of the analyses incwe_checker
? From my basic exposure to the internals of the system it seems like one would just have to tweak the behaviour of the relevant domains (inverting the lattice) to achieve this. Are there any likely hidden dragons/complexity I'm missing here?Thanks,
James
Beta Was this translation helpful? Give feedback.
All reactions