-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PPC: Display data values on hover for pools as well #140
Conversation
Floats and doubles will now always be displayed with a decimal point and one digit after it, even if they are whole numbers. Floats will also have the f suffix. This is so you can tell the data type just by glancing at the value.
39498d9
to
9051f48
Compare
This fixes some false positives.
Because we no longer have access to the actual symbol name via sections, guess_data_type can no longer detect the String data type for pooled references.
objdiff-core/src/obj/mod.rs
Outdated
@@ -106,6 +106,7 @@ pub struct ObjIns { | |||
pub mnemonic: Cow<'static, str>, | |||
pub args: Vec<ObjInsArg>, | |||
pub reloc: Option<ObjReloc>, | |||
pub fake_pool_reloc: Option<ObjReloc>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we can't store this in reloc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, there's no scenario where reloc
and fake_pool_reloc
need to both be Some
, so it might work. The reason I didn't do that initially is because reloc
is accessed in many more places in the code than just the hover tooltip and I wasn't sure if I should be messing with all those without understanding them.
I just checked all the references to reloc
now:
-
The JSON output of
objdiff-cli --output
(inInstruction::new
, might also be used by wasm?):
-
The context menu when right-clicking on an instruction (in
ins_context_menu
):
There are other places besides those 2 that reference the field, but I think they rely on the relocation's type being a supported one, not R_PPC_NONE
, so they don't actually show up in a user-visible place in practice.
Notice that if we go with this route there's no name, size, etc for the symbol being referenced here because of the hack where I replaced the target symbol with an empty placeholder.
Without the placeholder hack, the actual symbol is correctly displayed:
If you want I can merge the two fields and add checks to Instruction::new
and ins_context_menu
to skip showing the symbol if its name is empty. If this is done, any more references to ObjIns.reloc
added in the future would also need similar checks to prevent the placeholder from being used.
Alternative, maybe less hacky approach: instead of the hack where I resolve the placeholder to the real symbol in the GUI code for showing the tooltip every frame, I could do it just once diff/code.rs process_code_symbol
and mutate the result before returning it.
I didn't think of this before, but unlike ObjArch::process_code
, process_code_symbol
already has access to obj.sections
without changing the function signature.
This would allow the actual symbol name to be displayed everywhere more consistently. It would also avoid needing to add repetitive checks on !target.name.is_empty()
every time ins.reloc
is used anywhere in the future to avoid bugs.
Basically, anything that calls process_code_symbol
would see reloc
as having the correct pooled reference target symbol. And it seems like objdiff only calls ObjArch::process_code
via process_code_symbol
.
Do other projects that rely on objdiff like dtk use this reloc
field for anything? Would it be bad for it to refer to an indirect pooled reference there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed a couple commits with the changes I described above if you want to see for yourself if they're okay. From my testing it seems to work fine for both objdiff-gui and objdiff-cli and causes the fake relocation symbol to display everywhere the real one would:
I didn't test with other things like wasm/dtk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the continued work on this. It's shaping up quite nicely.
.and_then(|ty| obj.arch.display_data_type(ty, &reloc.target.bytes)) | ||
{ | ||
ui.colored_label(appearance.highlight_color, s); | ||
if reloc.addend >= 0 && reloc.target.bytes.len() > reloc.addend as usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the logic behind this exactly? I'm confused by the check against target.bytes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two lines down, I slice into target.bytes
with reloc.addend
being the start index in order to support displaying values that are in the middle of the symbol.
reloc.addend should always be 0 <= reloc.addend < symbol.bytes.len()
to avoid a panic on slice. I'm pretty sure the fake addends I create in this PR always are.
But for real addends read from object files, I think theoretically the addend could be garbage if there's something wrong with the file? Or a bug in the compiler that produced the object file, or something? I figured displaying nothing would be better than objdiff crashing entirely if something funky ever did happen, but I haven't encountered an example of this in reality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait I forgot there actually is one situation it crashes without this check, .bss symbols. They have a size and I give them an addend if a field in the middle of the symbol is accessed, but target.bytes
is always an empty vec, because .bss variables don't have bytes stored in the object file.
This crashes if I remove the reloc.target.bytes.len() > reloc.addend
check:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! I missed that it was pulling the data from the symbol's contents.
🚀 |
This expands on #108 in order to allow the data display in the tooltip to also work for references to symbols that have been pooled in the current function, instead of just direct references where the relocation is on the same line as the load instruction itself.
Here are some examples of what this supports:
It allows viewing pooled float literal loads so that you can more easily see if the value is the same or different, regardless of whether the offset specified by the instruction is different or not (these screenshots are from the target and base of a fully decompiled function, but other functions in the TU haven't been started yet, so the .rodata section is unfinished):
data:image/s3,"s3://crabby-images/67e42/67e422473f5e0998f39025dd20e9922bd3efc613" alt="2024-12-01_17_16_35_805_objdiff"
data:image/s3,"s3://crabby-images/0c4f0/0c4f0be852ecbb4528e4a987b2816b4bfd9ad173" alt="2024-12-01_17_16_44_904_objdiff"
It allows viewing any string that gets loaded from
data:image/s3,"s3://crabby-images/bf33f/bf33f50436809c625ce58e04f73af52a7990da0e" alt="2024-12-01_17_39_35_772_objdiff"
@stringBase
, not just the first string like in the current version of objdiff:It allows you to see the array symbol name that gets loaded into a particular register when it's done via an
data:image/s3,"s3://crabby-images/45bad/45bad051fa1139a78f4bfc8b1acafa1e1407a582" alt="2024-12-01_16_56_29_163_objdiff"
addi
from a source register that has the pool start address in it (example is at the start of a loop):And then, when one of the elements of that array symbol gets loaded with a lwzx/lbzx/etc instruction, it shows the value of the first element of the array (same function as above screenshot, inside the body of the loop):
data:image/s3,"s3://crabby-images/35ab8/35ab8896ef82024d46ffa3f96ffa367e673a85e1" alt="2024-12-01_16_57_10_452_objdiff"
Note that it does not support showing all the other elements of the array besides the first. So #124 is related but still not fully resolved.
Fixing that issue would be tricky because arrays might contains structs and not just single primitive data type, so you would need to at least know the array's stride.
When a value in the middle of a symbol is accessed, it now shows the offset from the start of the symbol that is being accessed in the tooltip:
data:image/s3,"s3://crabby-images/253ce/253ce437b7e8c40e29b13d221bd20ee65e1354e9" alt="2024-12-01_18_08_23_818_objdiff"
The above example is showing a load of the
v_offset
field at offset 0x4 in a PTMF struct. The PTMF is located at offset 0x200 from the start of the .data section. So it displays200+4
rather than204
, and@1373+4
instead of@1373
.I explain the exact implementation details in the comments so I won't repeat that here.
But I will mention what I didn't implement: Following branches.
I determine what registers contain what pools just by looking through each instruction in the function from top to bottom, ignoring control flow. So there could easily be false positives or false negatives.
I was actually originally planning to implement following branches, but was surprised that the naive branchless implementation I wrote as a temporary placeholder just seemed to... work on all the functions I tested it on. So I didn't implement branches because I didn't want to add a bunch of complexity without even one example of where it would help.
But I only checked maybe a dozen functions in total, so it's definitely possible for weird edge cases to exist that I didn't catch. Let me know if functions like that are found.