-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NCSDIS: Analyse control flow for recursive functions, and functions with incompatible fork merging #57
Comments
As discussed privately with @DrMcCoy and on DeadlyStream, I have a potential solution to the recursion issue. Copy/pasting my algorithm below: *** BEGIN QUOTE *** In any case, what you can do is the following (after constructing a call graph):
One issue with this is the problem of vector/struct arguments. This method will tell you the size of the stack space which is popped from the stack after calling each subroutine; however, it does not tell you the structure of said space. However, this is easy to calculate once you know the number of arguments - when you call the function in another subroutine, you can use the state of the stack at that call to determine the types of the sub's arguments (including how much space they take up). *** END QUOTE *** To be clear, even though the linear system might be overdetermined, it should still have only one solution. One easy way to implement this is to use a linear algebra solver for overdetermined systems, and check the residual is zero (within a tolerance). I've implemented this algorithm in my ncs2nss repo if you're interested. It doesn't work all the time, but I suspect this is due to either bugs in the rest of my code (there are still many) or compiler errors. I feel the latter can be resolved with a modification to the algorithm, but we can discuss that later. |
Hi :)
My ncs2nss decompiler is nearly complete, but I've run into a bit of a roadblock. I use the output from ncsdis's analysis to create function signatures, which greatly simplifies the data flow analysis (because I can do the analysis in a single pass). However, the control flow fails in the following cases:
From looking at the decompiled code (attached to the issue - don't take it as gospel though, as there are certainly mistakes in it) for the file "cp_end_trasksp_d.ncs", which causes the block fork issue (as discussed on the DeadlyStream forums), I have a suspicion that the problem is due to some compiler issue with returning from a subroutine, where it doesn't clean up the stack properly. If I had to guess, I'd say that this issue was never found because it is caught and handled by the interpreter at runtime, but I could be completely wrong. In any case, hopefully the attached decompiled code will help out.
The recursion issue is more difficult - however, I think I have an idea for how it can be solved. When a recursive function is detected, you can parse the function and do the following in a single pass:
After this analysis, the final step is to find the position on the stack of the lowest "below-stack assignment". If this position is not contained by any of the arguments, it's a return value. If the "below-stack assignment" set is empty, or all of the assignments were to positions known to be arguments, there is no return type and it's a void subroutine.
The above algorithm works on non-recursive functions with no issues. If the function calls itself directly (direct recursion), a simple heuristic algorithm I can suggest is to run the previous algorithm multiple times, with the number of arguments hard-set to 0, 1, 2, 3, ..., until you find a number for which the stack is consistent throughout and at the end of the analysis (basically, guessing). Once you have the number of arguments, the return value can also be calculated as above.
If the function does not call itself, but is rather recursively called by another function, I think you would have to run the "guessing" algorithm but with guesses for each of the different subroutines in the recursion graph, which expands the number of possibilities exponentially with the number of functions in the loop. Although the analysis might be relatively expensive, I think it should still be quick enough for any scripts requiring recursion (I'd imagine that most scripts would have fairly small chains).
This should allow you to infer the arguments and return value of a subroutine without having to rely on other subroutines calling it. You could also implement this as a separate check and then compare the result of this to the analysis already implemented, and ensure they are the same as a sanity check.
I might implement this in my ncs2nss code as well, but my preference would be to rely on ncsdis (because there's no point in duplicating code and ncsdis is already most of the way there).
Please let me know if you have any thoughts on this. Thanks :)
cp_end_trasksp_d.txt
The text was updated successfully, but these errors were encountered: