Handling of CONCAT22 in syscall wrappers #1

Open
matthewturk opened this issue Nov 30, 2022 · 3 comments

@matthewturk

Hi! This is a really awesome and very helpful toolbox, and has made my janky python script to insert 0x21 handlers totally unnecessary!

I was wondering if there was a way to automatically handle stack pointers to syscalls. It's entirely possible I'm doing something incorrect, but when looking at a syscall-wrapping function (which I assume comes from a stdlib of sorts for my executable) I am getting Ghidra disassembly that looks like:


/* WARNING: Unable to track spacebase fully for stack */

word __stdcall FUN_1c6f_3e99(void)

{
  word wVar1;
  byte in_CL;
  undefined2 unaff_BP;
  undefined2 in_SS;
  undefined2 in_DS;
  undefined in_CF;
  
  *(undefined2 *)((short)&stack0x00000000 + -2) = unaff_BP;
  swi(0x21);
  wVar1 = DosOpenFile((char *)CONCAT22(in_DS,*(undefined2 *)((short)&stack0x00000000 + 4)),0,in_CL);
  if ((bool)in_CF) {
    wVar1 = 0xffff;
  }
  return wVar1;
}

It's correctly getting the DosOpenFile reference, but it's unable to see that the second half of the pointer ((short)&stack0x00000000 + 4) is part of the function call. All of the storage for DosOpenFile seems correct, and it's definitely working with that, but somehow the function call isn't matching up and it's trying to turn it into a full char* rather than the custom storage one. (I think?)

I recognize it might be out of the purview of GhidraDosToolbox to address this, but is this an issue you've encountered before and might have an idea of how to solve? Thanks again for your hard work on this, and any tips you might have.

@Gravelbones
Owner

It is a bit hard to tell without seeing the actual assembler code.
Most likely there are a number of problems, which together cause these "errors".

This looks like an internal function, which takes all its "parameters" in registers, but Ghidra has detected none of those.
E.g. the calling function may be "fopen", which creates a proper stack frame by setting BP and then calls this function as part of its processing; this function relies on the BP set up by its caller.
This is not something Ghidra handles.

So the function expects that BP[+4] contains a pointer to the name of the file, but BP is not set inside the function, nor is BP defined as a parameter to this function. This "confuses" the disassembler.
So the disassembler tries to handle all this missing information by creating stack0x00000000, unaff_BP, in_DS, in_SS, in_CF.
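
To make the frame dependency concrete, here is a toy Python model of a 16-bit stack. It is only a sketch under stated assumptions: a near call (2-byte return address), the usual `push bp; mov bp, sp` prologue in the caller, and made-up values for the return addresses and the saved BP. The names `Stack16` and `caller_frame_demo` are hypothetical and not taken from the program or from Ghidra.

```python
class Stack16:
    """Toy model of a 16-bit stack; keys are byte offsets in the stack segment."""

    def __init__(self, sp=0x1000):
        self.mem = {}
        self.sp = sp

    def push(self, value):
        # A 16-bit push decrements SP by 2, then stores the word.
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = value & 0xFFFF

    def read(self, offset):
        return self.mem[offset & 0xFFFF]


def caller_frame_demo(filename_offset):
    """Return what an internal helper sees at [BP+4] when it reuses its caller's BP."""
    stack = Stack16()
    stack.push(filename_offset)  # whoever calls fopen pushes the file name offset
    stack.push(0x0123)           # near CALL to fopen pushes a return IP (made-up value)
    stack.push(0xBEEF)           # fopen prologue: push bp (made-up saved BP)
    bp = stack.sp                # fopen prologue: mov bp, sp
    stack.push(0x0456)           # near CALL to the helper pushes another return IP
    # The helper never sets BP, so [BP+4] still indexes fopen's frame.
    return stack.read(bp + 4)


print(hex(caller_frame_demo(0x6502)))
```

Under these assumptions the helper reads the file name offset that was pushed for the outer function, which is information a decompiler cannot recover when it analyses the helper in isolation.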

Another problem is that Ghidra has problems with Segment:Offset addresses.
And I have defined that DosOpenFile expects the name at DS:BX.
So to join the two values that make up this one parameter, the decompiler uses CONCAT22, which concatenates two 2-byte values into one 4-byte value.
You would get a different result if you remove DS: from the parameter list.
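
As a rough illustration of what CONCAT22 expresses, the following Python sketch joins two 16-bit values with the first argument as the high word, matching `CONCAT22(in_DS, ...)` in the listing above, where the segment ends up in the upper half. The function names `concat22` and `split22` are made up for this example and are not part of any Ghidra API.

```python
def concat22(high, low):
    """Join two 2-byte values into one 4-byte value, first argument in the high word."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)


def split22(value):
    """Inverse of concat22: return (high word, low word)."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


far_ptr = concat22(0x2000, 0x6502)  # segment 0x2000, offset 0x6502 (example values)
print(hex(far_ptr))
print([hex(part) for part in split22(far_ptr)])
```

The 4-byte result is only a packed segment and offset pair, not a flat address, which is part of why such a value does not turn into a usable memory reference on its own.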

Another problem is that internal library functions never conform to any calling convention or to the standard parameter passing that Ghidra tries to apply. So __stdcall is all wrong. This should be a custom calling convention.
Part of what I try to do is craft a custom calling convention that matches the registers used by each DOS function.

So if you want to "fix" this, you need to create the missing register parameters. The inputs are DS, BP and CL.
Most likely you should have AX (wVar1) as output. But doing all this may cause other problems.
Removing DS from the parameter list, as described above, removes the need to pass DS as a parameter.

What you really would like (need) is a FID library to identify the higher-level functions like fopen, fwrite ... and simply ignore these internal functions. But of course you need to identify them first. :/

What the function does is open the file whose name was passed to the calling function as its first parameter and return the handle number, or -1 in case of error, in AX.
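
The behaviour described here can be sketched in Python as follows. This is a model rather than the program's code: `dos_open` is a hypothetical stand-in for the INT 21h open call and is assumed to return the AX value together with the carry flag (DOS reports failure by setting CF and placing an error code in AX). The names `open_wrapper` and `to_signed16` are likewise made up.

```python
def open_wrapper(dos_open, name, mode=0):
    """Model of the wrapper: return the handle in AX, or 0xFFFF if the carry flag is set."""
    ax, carry = dos_open(name, mode)
    if carry:
        return 0xFFFF  # the 16-bit pattern that reads as -1 when treated as signed
    return ax & 0xFFFF


def to_signed16(value):
    """Interpret a 16-bit word as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def fake_dos_open(name, mode):
    """Stand-in for DOS: succeeds with handle 5 for one known name, else error code 2."""
    if name == "README.TXT":
        return 5, False
    return 2, True


print(open_wrapper(fake_dos_open, "README.TXT"))
print(to_signed16(open_wrapper(fake_dos_open, "MISSING.TXT")))
```

This mirrors the `if ((bool)in_CF) { wVar1 = 0xffff; }` branch in the decompiled listing: the DOS error code is discarded and replaced by the -1 sentinel.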

@matthewturk
Author

@Gravelbones Thank you so much -- I really appreciate you taking the time to write all this! It has given me a lot of options, and also helped me understand why I'm seeing the results I'm seeing while using this project.

I'm going to use this as a way forward, and see if I can make it work internally to this file, and then evaluate whether or not that's worth moving on to a FID. I'm reasonably sure the application was programmed with Borland Turbo C 2.0, but the assembly in the .OBJ and .LIB files distributed with that version of Turbo C doesn't match, so I suspect they may have implemented their own wrappers to the system calls.

I'll investigate other avenues for addressing the Segment:Offset issues. I've noticed that Ghidra will happily reference the correct locations in the names it gives to variables (e.g., DAT_2000_6502 or whatever) but does not correctly insert memory references. This seems to be related to the way that pointers in the file I'm working on are defined as explicitly 16 bits, even though they're really a combination of DS:DX and whatnot, which makes it a little tricky to assign char * types to arguments.
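
For reference, a label such as DAT_2000_6502 can be decoded back into its segment and offset, and from there into a real-mode linear address (segment times 16 plus offset). The Python sketch below assumes the label shape seen in this thread (a prefix, then four hex digits of segment, then four hex digits of offset); the helper name `parse_segmented_label` is hypothetical, and other label shapes are simply rejected.

```python
import re

# Assumed label shape: PREFIX_SSSS_OOOO, e.g. DAT_2000_6502 or FUN_1c6f_3e99.
_LABEL = re.compile(r"^[A-Za-z]+_([0-9a-fA-F]{4})_([0-9a-fA-F]{4})$")


def parse_segmented_label(label):
    """Return (segment, offset, linear address) for a PREFIX_SSSS_OOOO label."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError("not a segment:offset style label: %r" % label)
    segment = int(match.group(1), 16)
    offset = int(match.group(2), 16)
    # Real-mode address translation: the segment is shifted left by 4 bits.
    return segment, offset, (segment << 4) + offset


for name in ("DAT_2000_6502", "FUN_1c6f_3e99"):
    seg, off, linear = parse_segmented_label(name)
    print(name, hex(seg), hex(off), hex(linear))
```

A bare 16-bit pointer only carries the offset part, so the segment has to come from somewhere else (here, the label) before a full address can be formed.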

While I really do appreciate you taking the time to reply to me, please go ahead and close this issue if you feel that any further discussion is out of scope! Thanks for your lengthy and helpful reply.

@Gravelbones
Owner

You are welcome.
I must admit that I haven't been looking at this for some time, as I found far too many things that need addressing to truly improve on this.
I may try to restart my work at Christmas, where I hope to have some time again.
Do you have access to the Borland C 2.0 object files?
I have access to version 3.0.
Remember that DOS had 6 different memory models. So you have to match the correct memory model of the object file to your program.
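
As a quick reference, the six models usually listed for Borland's 16-bit compilers differ in whether code and data pointers are near (2 bytes) or far (4 bytes). The Python table below is a sketch of that summary; the helper `first_arg_bp_offset` is a made-up name and assumes a conventional `push bp; mov bp, sp` prologue with no other registers saved first.

```python
# Pointer sizes in bytes per memory model: (code pointer, data pointer).
MEMORY_MODELS = {
    "tiny":    (2, 2),
    "small":   (2, 2),
    "medium":  (4, 2),
    "compact": (2, 4),
    "large":   (4, 4),
    "huge":    (4, 4),
}


def first_arg_bp_offset(model):
    """Offset of the first stack argument from BP: saved BP plus the return address."""
    code_ptr_size, _data_ptr_size = MEMORY_MODELS[model]
    return 2 + code_ptr_size


def pointer_arg_size(model):
    """Size in bytes of a data pointer argument such as a file name."""
    return MEMORY_MODELS[model][1]


for model in MEMORY_MODELS:
    print(model, first_arg_bp_offset(model), pointer_arg_size(model))
```

Comparing these offsets and sizes against the disassembly is one way to narrow down which model's object files are worth matching against.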
