[Debugger]: Current Instruction: CALL instead of Return Offset #6700

chezwicker · 2021-03-27T08:22:54Z

Please feel free to reclassify as feature request (making this configurable) if you don't agree with my classification!

Describe the bug

The instruction selected as current in the dynamic listing seems to be the one at the return offset. I would prefer to have the preceding CALL instruction selected; that would be more consistent with what most IDEs' debuggers do, and much more easily understandable imho.

To Reproduce

I've only looked at this in Windows using the "IN-VM" connector to dbgeng so far, but I assume the behavior is the same using GDB.

Steps to reproduce the behavior:

1. Load a Windows program
2. "Step into" once
3. Have a look at the call stack and (double) click on some stack frames
4. See which instruction is highlighted in "Dynamic Listing" (RIP)

Expected behavior

The CALL instruction preceding the instruction at the return offset should be highlighted.

Environment (please complete the following information):

OS: Windows 10 10.0 amd64 (VM)
Java Version: AdoptOpenJDK 11.0.10
Ghidra Version: 9.3 DEV, commit b69c3d6
Ghidra Origin: locally built

d-millar · 2021-03-29T22:53:45Z

Collaborator

Not opposed to this in principle, but I'd prefer not to have to do this calculation in Listing, i.e. it'd be nice if that were a property in the object that we can highlight off of. Doing it on the GUI side raises portability issues across targets. For instance, I'm not quite sure what Windows ARM should look like off the top of my head.

nsadeveloper789 · 2021-03-30T19:53:15Z

Maintainer

In principle, we didn't want to modify the connected debugger's interpretation of machine state. When you ask WinDbg to provide a backtrace, IIRC, it is also going to give the return offsets. Part of that deals with how x86 works. I don't believe the call offset is recorded by the machine, only the return address. Put simply, we want the information displayed by Ghidra to reflect exactly the information displayed in the connected debugger. We can add static analysis to that context, but we do not intend to change the interpretation, even for commonly-known quirks. FWIW, there is support for client-side / user-scripting annotation of the dynamic listing (trace database), it's just not well integrated into the UI, yet.

chezwicker · 2021-03-30T20:23:55Z

Author

@nsadeveloper789: maybe my expectation is somewhat skewed, then. I assumed one of the main goals was to offer a unified experience in Ghidra, abstracting from the individual underlying debuggers connected.

I'm also not quite sure marking the CALL as "current line" would be changing the interpretation, would it? The debugger rightly provides the return address for the call executed, but shouldn't the UI show the "current instruction" the program has been interrupted at instead of the one it would eventually return to once all later stack frames have been cleared?

The main reason I'm keen to have this is because the current behavior is one of the more confusing things to people getting acquainted with Ghidra; but maybe that's because they typically come from the IDE world, not from OS specific debuggers.

Thanks anyway for taking the time to look into this!

d-millar · 2021-03-30T21:33:57Z

Collaborator

@chezwicker I have to ask - and don't take this the wrong way, I'm actually quite curious - which IDE are you using? @nsadeveloper789 and I were discussing the issue, and neither gdb nor windbg highlight the call instruction. Both highlight the return address. I would like to see an IDE that is highlighting the call, and maybe dig into its code. There are a number of cases I can think of where this would be very tricky and if someone has a solution.... Cheers!

chezwicker · 2021-03-31T22:05:30Z

Author

@d-millar: well, every IDE I've used so far (e.g. VSCode, Eclipse) highlights the function call, not the following statement. I guess I was unclear, though: I'm not referring to debugging binaries, but to working from source.

But I think I'm missing something in general: are there cases when it wouldn't be a CALL instruction preceding the return offset, or where the opcodes would be ambiguous? If not, would it not be relatively simple to disassemble the preceding bytes to find the matching instruction? Given that we know the forward disassembly, the type of CALL could be determined, no?

d-millar · 2021-04-01T01:06:19Z

Collaborator

@chezwicker Ahh, sure, I get it - for source that's definitely the norm. For disassembly, though, I think you'd be hard pressed to find an example where it's not the return address that's highlighted. Gdb IDEs, windbg (and the other kd variants), VisualStudio, VSCode, XCode, and even Eclipse, I believe, all highlight the return address for disassembly. The reason for this is, I think, two-fold.

First, backwards disassembly for variable-length instruction sets is inherently ill-defined. For instance, it's quite possible to have the offset in a far call actually be interpretable as a valid near call. While forward disassembly is well-determined, as it has to be, reverse disassembly is at best an educated guess.

Second, there's actually no inherent requirement in most assembly languages that the return address be preceded immediately by the calling instruction. Most compilers observe this rule, but I can think of a bunch of examples where it's violated, among them code with embedded data, hand-optimized assembly, stacks built using push operations only, certain exception handlers, and various instances of kernel code, especially interrupt handlers.

Admittedly, your proposal would probably work and be correct 99% of the time, but would be deeply misleading and incorrect for the remaining 1%. Also, as you've pointed out, we'd like to be consistent with other tools with the same functionality, but our closest relatives really aren't source IDEs. For better or worse, we're a bit more primitive than that. :)

chezwicker · 2021-04-07T20:52:58Z

Author

@d-millar I'm afraid I have two follow-up questions:

1. Since we do have the forward disassembly from the static analysis, would we not be able to use that to avoid "educated guessing" (in "normal" cases)?
   - When you say that forward disassembly is well-determined, do you mean "reproducible"? Or does Ghidra follow all branches of an execution to deal with anti-disassembly techniques?
   - Regarding clever approaches, how does Ghidra handle "overlapping" instructions (like backward jumps to bytes previously recognized as part of another instruction)?
2. I'm a bit confused here, I must admit: aren't the CALL instructions standardized in what they do for a given chipset (as opposed to a given compiler / assembler)? Wouldn't the return address originally written always be the instruction right after? I must admit I might be missing some stuff because I'm mainly dealing with Intel.
   - I'm aware that the address can be (and in "optimized" cases will be) overwritten
   - I guess then the question is: what exactly does Ghidra show us:
     - the original address written by the caller
     - the address current when interrupted
     - or the latest value?

Thanks for your time!

d-millar · 2021-04-08T01:50:21Z

Collaborator

@chezwicker No problem re questions - you're doing us a huge favor by providing feedback! The very least I can do is answer your questions, insofar as I am able. So....

First question, first. By "reproducible", I actually meant, I guess, well-determined. If you start at a particular point, there is really only one interpretation of that instruction (assuming it is an instruction). The instruction has a fixed length, so you know where to start the next instruction, and so on. When you hit a direct branch, you can decode the instruction that follows, but you also have a relative offset that will get you to the other path, so you're still in good shape. If you get to an indirect branch like "jmp [rax]", well, things get more complicated. The static disassembly engine in Ghidra basically follows this technique to explore the code. It does a lot of other stuff too, but in its simplest form that is what it's doing.
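To make that exploration strategy concrete, here is a minimal sketch of recursive-descent disassembly. It runs over a made-up five-opcode instruction set, not x86, and it is not Ghidra's engine; the opcode values, the `explore` function and its output format are all assumptions chosen for illustration.

```python
# Toy recursive-descent disassembler over a hypothetical ISA (not x86, not Ghidra).
# Assumed opcodes: 0x01 NOP (1 byte), 0x02 JMP rel8 (2 bytes), 0x03 JCC rel8
# (2 bytes, conditional), 0x04 JMP [reg] (1 byte, indirect), 0x05 RET (1 byte).

def explore(mem: bytes, entry: int) -> dict:
    """Decode forward from entry, following direct branches; return addr -> text."""
    seen = {}
    work = [entry]  # addresses still to be explored
    while work:
        pc = work.pop()
        while 0 <= pc < len(mem) and pc not in seen:
            op = mem[pc]
            if op == 0x01:
                seen[pc] = "nop"
                pc += 1
            elif op in (0x02, 0x03) and pc + 1 < len(mem):
                rel = int.from_bytes(mem[pc + 1:pc + 2], "little", signed=True)
                target = pc + 2 + rel  # relative to the next instruction
                seen[pc] = f"{'jmp' if op == 0x02 else 'jcc'} {target:#x}"
                work.append(target)  # the branch target is a second path
                if op == 0x02:
                    break  # unconditional jump: no fall-through
                pc += 2
            elif op == 0x04:
                seen[pc] = "jmp [reg]"  # indirect: target unknown statically
                break
            elif op == 0x05:
                seen[pc] = "ret"
                break
            else:
                seen[pc] = "??"  # undecodable byte, stop this path
                break
    return seen


program = bytes([0x03, 0x02, 0x01, 0x05, 0x04])
for addr, text in sorted(explore(program, 0).items()):
    print(f"{addr:#x}: {text}")
```

Each decoded instruction fixes where the next one starts, so the operand byte at offset 1 is never treated as an instruction start. The indirect jump at offset 4 is where the walk has to give up on that path.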

For dynamic analysis, i.e. the debugger, we do none of that really or at best a little sliver of that. We really only attempt to recover a short sequence of instructions going forward in the dynamic view. And, for things like "overlapping" instructions, neither the static nor the dynamic engine does anything truly spectacular because...well, what can you do? At best, you put a marker on the code indicating that something odd is going on that the analyst should be aware of, which is essentially what the static engine does. The dynamic engine has the luxury of recomputing the disassembly as you go, but it's hard to say whether that's more or less clever.

Going backwards is a whole different deal, and perhaps an example here would be useful. Let's say we have the following bytes at offset 0x45dc8e: e8 00 00 ff d0 48 49 ff. On the stack, we have the return address, 0x45dc93. What is the address of the calling instruction? Well, maybe it's 0x45dc8e or maybe it's 0x45dc91. If I had to guess, I'd go with the latter, as the former is calling a subroutine that's a fair ways away, but I really have no way of knowing, especially if my context is just the stack and the list of bytes preceding it. Unfortunately, even in "normal" code, there are plenty of examples like this. We can use heuristics to make an educated guess, but (a) it's a guess, and (b) we're spending resources to make this guess in an environment that we want to be responsive.
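The ambiguity in that example can be reproduced with a small sketch. The helper `candidate_calls` below is hypothetical and checks only two x86-64 CALL encodings (`E8 rel32` and the register form of `FF /2`); a real backward scan would have to consider many more forms, prefixes included.

```python
import struct

# Hypothetical helper: list the CALL encodings that would end exactly at ret_addr.
# Only two encodings are checked, purely as an illustration.

def candidate_calls(mem: bytes, base: int, ret_addr: int) -> list:
    """Return (address, text) for each plausible CALL ending at ret_addr."""
    end = ret_addr - base  # offset in mem just past the candidate instruction
    found = []
    # E8 rel32: near relative call, 5 bytes long.
    if end >= 5 and mem[end - 5] == 0xE8:
        (rel,) = struct.unpack_from("<i", mem, end - 4)
        target = (ret_addr + rel) & 0xFFFFFFFFFFFFFFFF
        found.append((ret_addr - 5, f"call {target:#x}"))
    # FF /2 with mod == 3: indirect call through a register, 2 bytes long.
    if end >= 2 and mem[end - 2] == 0xFF:
        modrm = mem[end - 1]
        if modrm >> 6 == 0b11 and (modrm >> 3) & 0b111 == 2:
            regs = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
            found.append((ret_addr - 2, f"call {regs[modrm & 0b111]}"))
    return found


code = bytes.fromhex("e80000ffd04849ff")  # the bytes at 0x45dc8e from the example
for addr, text in candidate_calls(code, 0x45DC8E, 0x45DC93):
    print(f"{addr:#x}: {text}")
```

Both 0x45dc8e (a relative call) and 0x45dc91 (`call rax`) come back as candidates, and nothing in the bytes alone says which one actually executed.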

OK, with regard to your second question, yes, for chipsets like Intel's, call will presumably push the address immediately following the call instruction in a uniform and consistent way. However, the stack may be created any number of different ways. The stack frame may not have been generated by a call at all. That doesn't make it an invalid stack frame, just a stack that's hard to walk.

So, what is Ghidra showing us? And on this point, I think I might concede we may not have made the best choice. In the Stack View in Ghidra's debugger, the address shown is the return address pushed on the stack, i.e. it's the value the program counter will have after the frame is popped. Calling it the PC maybe was a bad idea. We might want to say "return address". The one problem is the value in frame 0 IS the program counter (there is no return address), so calling it "return address" for that frame might be confusing.

If any of this isn't clear or is questionable or wrong (or you'd like more, really horrible examples), let me know! Happy to discuss some more!

D

chezwicker · 2021-04-09T21:43:30Z

Author

@d-millar thanks for the detailed reply!

The main thing I don't get is why Ghidra solely relies on the dynamic (backwards) disassembly instead of taking into account an existing static disassembly. Of course in the case of overlapping instructions (an idea for that below), the dynamic view would have to prevail, but the default could be to use the static disassembly if the PC/IP aligns with an instruction found there?
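A rough sketch of what such a lookup could look like, assuming the static listing is available as a plain mapping from instruction start to length and mnemonic. The `STATIC` table and the `call_before` helper are hypothetical stand-ins, not Ghidra's API.

```python
from typing import Optional

# Hypothetical static listing: instruction start -> (length in bytes, mnemonic).
STATIC = {
    0x401000: (5, "call 0x401100"),
    0x401005: (3, "mov rbx, rax"),
    0x401008: (1, "ret"),
}

MAX_X86_INSN_LEN = 15  # architectural upper bound on x86 instruction length


def call_before(static: dict, ret_addr: int) -> Optional[int]:
    """Return the address of a statically known CALL that ends at ret_addr."""
    for length in range(1, MAX_X86_INSN_LEN + 1):
        entry = static.get(ret_addr - length)
        if entry is None:
            continue
        insn_len, mnemonic = entry
        # Accept only an instruction that ends exactly at the return address
        # and that the static analysis already identified as a call.
        if insn_len == length and mnemonic.startswith("call"):
            return ret_addr - length
    return None  # no static match: keep highlighting the return address


print(call_before(STATIC, 0x401005))
print(call_before(STATIC, 0x401008))
```

Returning `None` when there is no static match leaves the current behavior as the fallback, so the lookup never has to guess from raw bytes. It still assumes the static listing matches what is actually in memory at that moment.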

As for interweaving instructions and dynamic flow, my idea (for the static analysis) would be to always follow all paths, and where there's a conflict (interwoven instructions or different results depending on which branch is taken), I'd offer both alternatives, leading to something that I assume would look a bit like a graphical representation of a git revision history (branches going out and possibly being "merged" back). When executing dynamically, the path taken could be highlighted (for the sections which are recorded, especially when stepping) and possibly corrected, improving the static analysis.

With regards to the example you gave, I think we do have additional information: we know where we currently are (or more generically, where the PC/IP stood for the frame following the call), no? Of course, that still doesn't give us all the answers (especially for indirect calls as for the second option in your example (if we had the register info for previous frames - #2868 - we would get even further ;-)), but combined with the static disassembly, how often would this likely go wrong? Would it be off in more cases than now (when the return address can also be changed by manipulating the stack)?

Thanks!

P.S. I was wondering about the example, what is 48 49 FF? Can't seem to make sense of that...

d-millar · 2021-04-09T22:21:34Z

Collaborator

@chezwicker Your point is well-made, and I think you may have convinced @nsadeveloper789 to at least put in hooks for the kind of thing you're asking for. From a design standpoint, I think our argument boils down to two points, which we are admittedly somewhat attached to. First, we're trying (trying!) to make the debugger as efficient as possible. This is particularly important for remote targets over slow connections, and maybe less relevant here where the computation could be done on the machine where Ghidra proper is running, but we really don't like adding features that slow the execution down. For instructions where we had a static match, you're probably right - the expense might not be horrible, but even checking whether you have a static match is an expense. Second and, I think, more importantly, we really hate including features that are, even just occasionally, wrong. I cannot tell you how many times I've gone down a rabbit hole trying to figure out what's going on in my target only to find out the debugger kind of/sort of/didn't really reflect what was going on under the hood. I like truth in advertising and GUIs.

That said, we're considering this one. :)

D

P.S. Re 48 49 FF, you know - I really don't know. I think they might just have been bytes left in a chunk of memory I was overwriting to test out examples. Sorry about that!

nsadeveloper789 · 2021-04-12T13:51:16Z

Maintainer

To clarify my opinion on this: I like the idea from an abstract and pedagogical standpoint, because in some sense, you're "in the CALL", but the reality is the CALL has already executed and the machine has moved on to the next instruction. That being said, Ghidra is meant to be an extensible framework. At the moment, the LocationTrackingSpec interface, which is where all of the PC/SP-highlighting logic is implemented, is not marked as an ExtensionPoint. That is something I'd like to rectify, giving you the tools to try implementing this yourself, but it's no guarantee, and it may be a little bit before I can get it in.

nsadeveloper789 · 2021-04-13T16:45:19Z

Maintainer

We just pushed a lot of pending changes, including making LocationTrackingSpec an extensible interface.
