-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix & improve Propagate Control Flow normalization pass #462
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for some very small nitpicks, looks good to me.
By the way: The reason that it performs so bad on MIPS is probably because we still have a known error when parsing conditional assignment instructions in MIPS. Since the whole control flow path propagation exists mainly for improving the control flow in the presence of conditional assignments, it cannot do much if the instructions are not parsed correctly.
Edit: The bug fix might not be complete, see my other comment. You can probably check with some elaborated unit test whether I am right or not.
src/cwe_checker_lib/src/intermediate_representation/project/propagate_control_flow.rs
Outdated
Show resolved
Hide resolved
src/cwe_checker_lib/src/intermediate_representation/project/propagate_control_flow.rs
Outdated
Show resolved
Hide resolved
I might have an idea why the old code did not catch the case: We use the CFG to check for incoming edges when removing blocks, right? But if a callee has no return instruction, then the CFG may not contain an edge to the return site. But the TID of the return site is nevertheless referenced in the call instruction. Your code solves that by retargeting the return site. In theory it might be possible that the return site cannot be retargeted, while all other edges to the return site are retargeted nevertheless. So I think the bug still persists, it is just less likely now... |
Good catch! It is indeed not too difficult to construct this situation. Essentially a case where a call to a non-returning function is not recognized as such and "returns" to a conditional block that can be optimized away. See below for a concrete example. I see a couple of options how to proceed: a.) Add another, preceding normalization pass that recognizes calls that have b.) and c.) are both unsatisfactory. For a.) it would be interesting to know if you have a feeling whether there are any bad surprises waiting down the line. On the first sight it seems like the return target information is worthless if the artificial return nodes are not generated so we might as well throw it away entirely (Indirect calls and call-other are handled differently and are not affected, right?). The question is rather why Ghidra generated it in the first place.
In this example the first This unit test uses the above example to show that we can still trigger this bug under those conditions. #[test]
fn call_return_to_cond_jump_removed() {
let sub_1 = Sub {
name: "sub_1".to_string(),
calling_convention: None,
blocks: vec![
mock_condition_block("cond_blk_1", "cond_blk_2", "end_blk_1"),
mock_block_with_defs_and_call("call_blk", "sub_2", "cond_blk_2"),
mock_condition_block("cond_blk_2", "end_blk_2", "end_blk_1"),
mock_block_with_defs("end_blk_1", "end_blk_1"),
mock_block_with_defs("end_blk_2", "end_blk_2"),
],
};
let sub_1 = Term {
tid: Tid::new("sub_1"),
term: sub_1,
};
let sub_2 = Sub {
name: "sub_2".to_string(),
calling_convention: None,
blocks: vec![mock_block_with_defs("loop_block", "loop_block")],
};
let sub_2 = Term {
tid: Tid::new("sub_2"),
term: sub_2,
};
let mut project = Project::mock_arm32();
project.program.term.subs =
BTreeMap::from([(Tid::new("sub_1"), sub_1), (Tid::new("sub_2"), sub_2)]);
let cfg_before_normalization = graph::get_program_cfg(&project.program);
cfg_before_normalization.print_compact_json();
propagate_control_flow(&mut project);
// construction of CFG would panic now
//graph::get_program_cfg(&project.program);
let expected_blocks = vec![
// `cond_blk_1` can be retarget.
mock_condition_block("cond_blk_1", "end_blk_2", "end_blk_1"),
// `call_blk` can not be re-targetd since no condition is known.
mock_block_with_defs_and_call("call_blk", "sub_2", "cond_blk_2"),
// `cond_blk_2` was removed since cond_blk_1 was re-targeted.
// Note: `call_blk` did not contribute an incoming edge since the
// callee does not return.
mock_block_with_defs("end_blk_1", "end_blk_1"),
mock_block_with_defs("end_blk_2", "end_blk_2"),
];
assert_eq!(
&project.program.term.subs[&Tid::new("sub_1")].term.blocks[..],
&expected_blocks[..]
);
} PS: How do you like the updated
|
Your updated Right now, your option a) would probably result in the best CFG that we can generate. Just make sure to mention it prominently in the doc-comment of the normalization pass. Maybe also add a INFO-level log message for each function (or call?) deemed non-returning. Just to make it easier to spot real-world binaries where suspiciously many functions are marked as non-returning. |
Nice, I included that in the first new commit.
I decided to split up the existing
Here are some stats about how often this new pass strikes. All in all they suggest that it is a pretty rare thing. I included some reasons for false positives in the doc comment.
However, these stats may also be a bit biased. On the IoT
Maybe something with the compiler+settings. In general it does no harm to retarget those calls as they can not be analyzed anyway - however - maybe we can address some of the root causes an another point. |
The original fix for Issue fkie-cad#461 in Commit ("lib/ir/project: propagate control flow for call returns") was incomplete. The original problem was due to a call to a function without a return instruction "returning" to a block that could be optimized away in the propagate control flow pass. Retargeting the call return can only solve the issue when the return block can be retargeted (and the retarget is not optimized away), which is not the case for condition blocks. Thus, always retarget returns from calls to functions without a ret to the artificial sink. Link: fkie-cad#462 (comment) Signed-off-by: Valentin Obst <[email protected]>
…tion Add a test to verify that retargeting returns from calls to non-returning functions is indeed solving the problem this pass has with "dangling" references to return sites. Link: fkie-cad#462 (comment) Signed-off-by: Valentin Obst <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the issue you mentioned yourself, everything looks good to me.
Thanks for the review! Then I'll now clean up the commit log and resolve the merge conflicts with the benchmarking PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few small nits.
src/cwe_checker_lib/src/intermediate_representation/project/propagate_control_flow.rs
Outdated
Show resolved
Hide resolved
src/cwe_checker_lib/src/intermediate_representation/project/propagate_control_flow.rs
Outdated
Show resolved
Hide resolved
This patch does two things: 1. It allows the re-targeting of jumps for which no known true condition is available. Without a known condition, only blocks that consist of a single, unconditional jump can be skipped. 2. It allows the re-targeting of call returns in the same way that we already do it for unconditional jumps. For calls we never have a known condition as side-effects may invalidate any knowledge we have after the execution of all DEFs in the block. Example: Before the optimization we might have code like this: BLK [blk_0040a9c4] DEF [instr_0040a9c4_0] ra:4 = 0x40a9cc:4 JMP [instr_0040a9c4_1] call sub_00403f80 ret blk_0040a9cc BLK [blk_0040a9cc] JMP [instr_0040a9cc_1] Jump to blk_0040a9d0 BLK [blk_0040a9d0] DEF [instr_0040a9d0_0] a0:4 = ((0x43:4 << 0x10:4) + 0xffffb730:4) JMP [instr_0040a9d0_1] Jump to blk_0040a9d4 whereas after the optimization it becomes: BLK [blk_0040a9c4] DEF [instr_0040a9c4_0] ra:4 = 0x40a9cc:4 JMP [instr_0040a9c4_1] call sub_00403f80 ret blk_0040a9d0 BLK [blk_0040a9d0] DEF [instr_0040a9d0_0] a0:4 = ((0x43:4 << 0x10:4) + 0xffffb730:4) JMP [instr_0040a9d0_1] Jump to blk_0040a9d4 Fixes: 2487aac ("remove dead code originating from control flow propagation (fkie-cad#384)") Closes: fkie-cad#461 Reported-by: https://github.com/ElDavoo Signed-off-by: Valentin Obst <[email protected]>
…g edges If a basic block has multiple incoming edges that are all conditioned on the same condition, use this condition when retargeting the control flow transfer at the end of the block. Signed-off-by: Valentin Obst <[email protected]>
Signed-off-by: Valentin Obst <[email protected]>
Remember precondition and branch condition when retrageting a block that ends with a conditional jump. Signed-off-by: Valentin Obst <[email protected]>
Signed-off-by: Valentin Obst <[email protected]>
No functional changes. Hopefully. Signed-off-by: Valentin Obst <[email protected]>
No functional changes. Signed-off-by: Valentin Obst <[email protected]>
The original fix for Issue fkie-cad#461 in Commit ("lib/ir/project: propagate control flow for call returns") was incomplete. The original problem was due to a call to a function without a return instruction "returning" to a block that could be optimized away in the propagate control flow pass. Retargeting the call return can only solve the issue when the return block can be retargeted (and the retarget is not optimized away), which is not the case for condition blocks. Thus, always retarget returns from calls to functions without a ret to the artificial sink. Link: fkie-cad#462 (comment) Signed-off-by: Valentin Obst <[email protected]>
…tion Add a test to verify that retargeting returns from calls to non-returning functions is indeed solving the problem this pass has with "dangling" references to return sites. Link: fkie-cad#462 (comment) Signed-off-by: Valentin Obst <[email protected]>
The pass that retargets "returns" from non-returning functions runs after the block-to-sub mapping has been made unique. This invariant is relied upon by later analyses. Currently, the pass does not uphold this invariant since it always retargets to the same global artificial sink block. Modify the pass s.t. it preserves a unique block-to-sub mapping by retargeting returns directly to the Sub's artificial sink. Signed-off-by: Valentin Obst <[email protected]>
Link: fkie-cad#462 Signed-off-by: Valentin Obst <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like force push did not break anything.
Issue #461 discovered a problem with the Propagate Control Flow normalization pass: In rare cases it could happen that a basic block would be removed without re-targeting calls that return to it. link
This PR fixes the issue by allowing the re-targeting of call returns. Besides that, it includes a number of improvements to the optimization pass at large. In particular:
Testing
There is a unit test for each of the proposed changes that make the optimization more aggressive. Furthermore, the optimizations of the new pass were manually verified (at random) for three programs for three different architectures (MIPS, ARM32, AMD64).
Measurements
The following table includes a comparison of this PR and the current optimization pass. It show the number of basic block nodes and jump edges in the CFG of the unoptimized
readelf
IR program for different host architectures. Next to that are the fractions of basic blocks and jump edges that can be removed by the current and proposed optimization passes respectively. In short, the improved pass can remove ~1.8x more stuff from the CFG than the old one.One interesting bit is that the optimization performs performs worst for MIPS, which was the architecture for which it was originally introduced.