Reworked BlueprintHookManager to better-prevent hooks from corrupting the code or each other #320

Merged


@Epp-code Epp-code commented Nov 7, 2024

When testing blueprint hooks, I learned that hooking a function return after hooking a function start would corrupt the start hook and crash upon blueprint function call. I also realized that, because the existing approach sometimes needs to move original function instructions, there's a chance it could move an instruction that an unrelated part of the function tries to jump to, which would also crash. There's a high-level description of the issue and thoughts on a fix in Discord here:

https://discord.com/channels/555424930502541343/862002356626128907/1303244570708152360

This PR implements what I hinted at in Discord - it's a fix that should support arbitrary amounts of arbitrary offset hooking (though the offsets must still be aligned with instructions, of course). Every new hook entry point now moves all instructions after it just enough to make room for the hook jump. It then scans the entire function and updates any affected jump instructions to point to their updated locations. It then installs the hook jump and the new hook code after the function's existing code.
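To make the steps above concrete, here is a minimal sketch on a toy instruction list rather than real Kismet bytecode. Everything in it is an assumption made for illustration: the `Instr` struct, the `Op` enum, the 5-byte jump size, and the `InsertHookJump` helper are hypothetical and are not the PR's actual code. It also assumes that a jump aimed exactly at the hooked offset keeps pointing there, so that it lands on the hook jump; the real `ModifyJumpTargetsForNewHookOffset` may decide differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy instruction; real Kismet bytecode is a raw byte stream.
enum class Op : std::uint8_t { Nop, Jump, Return };

struct Instr {
    Op op;
    std::size_t size;    // encoded size in bytes
    std::size_t target;  // absolute byte offset; only meaningful for Op::Jump
};

// Assumed jump encoding: 1 opcode byte + 4-byte offset.
constexpr std::size_t kJumpSize = 5;

std::size_t TotalSize(const std::vector<Instr>& code) {
    std::size_t total = 0;
    for (const Instr& instr : code) total += instr.size;
    return total;
}

// Inserts a hook jump at hookOffset and appends a placeholder hook body after
// the existing code. Returns the byte offset at which the hook body starts.
std::size_t InsertHookJump(std::vector<Instr>& code, std::size_t hookOffset) {
    // Find the instruction that starts at hookOffset.
    std::size_t offset = 0;
    std::size_t index = 0;
    while (index < code.size() && offset < hookOffset) {
        offset += code[index].size;
        ++index;
    }
    assert(offset == hookOffset && "hook offset must be instruction-aligned");

    // Everything after the hook point moves by kJumpSize, so every jump that
    // targets a later offset has to move with it (assumption: strictly later).
    for (Instr& instr : code) {
        if (instr.op == Op::Jump && instr.target > hookOffset) {
            instr.target += kJumpSize;
        }
    }

    // The hook body lives after the existing code, which is about to grow.
    const std::size_t hookCodeOffset = TotalSize(code) + kJumpSize;
    code.insert(code.begin() + static_cast<std::ptrdiff_t>(index),
                Instr{Op::Jump, kJumpSize, hookCodeOffset});

    // Placeholder for the hook invocation, then a jump back to the
    // instruction that originally sat at hookOffset.
    code.push_back(Instr{Op::Nop, 1, 0});
    code.push_back(Instr{Op::Jump, kJumpSize, hookOffset + kJumpSize});
    return hookCodeOffset;
}
```

Because the appended hook body is made of ordinary instructions, installing a second hook reuses the same retargeting loop to fix up the first hook's jumps as well.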

Thoughts/explanations for the reviewer:

  1. I tried to mimic the existing code style but please let me know if anything looks out of place or needs more comments/etc. I am fairly code-style-agnostic and am happy to clarify things.
  2. The most complex part of this PR is ModifyJumpTargetsForNewHookOffset, which attempts to find jump targets for every applicable opcode and needs careful review.
  3. Because we cannot know how to update the jump target expression of an EX_ComputedJump, I simply assert if the function contains an EX_ComputedJump. As far as I can tell, there is no way to make hooking such a function safe and it was not safe before this PR.
  4. I removed KismetBytecodeDisassembler::GetStatementLength because each statement's JSON now includes its size and the function no longer seems to be used; theoretically, some mod author might be using it, but I'd rather err on the side of removing dead code.
  5. The checkf() macros in the affected classes are currently being compiled to no-ops when built for shipping by Alpakit. According to Archengius, compiling out checkf's is not intended and looks like the result of a partially-reverted patch. I don't know what the long-term solution is in the engine code but, for now, I changed them to fgcheckf macros, which are currently not ever compiled out. This could result in crashes for anybody who was unwittingly violating these checks in a way that didn't already crash.
  6. The hook manager stores hooks by what their offset would have been in the original, unmodified function, regardless of how the function has been edited by existing hooks. This ensures users have a consistent experience and never have to know how the function is modified.
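
As an illustration of point 6, the sketch below shows one way an original-function offset could be translated into an offset in the already-modified function. It is an assumption-laden model, not the PR's code: `ResolveHookOffset`, `kHookJumpSize`, and the use of a `std::set` of installed offsets are hypothetical, and it assumes that each installed hook shifts later instructions by exactly one fixed-size jump.

```cpp
#include <cstddef>
#include <iterator>
#include <set>

// Assumed size of one hook jump: 1 opcode byte + 4-byte offset.
constexpr std::size_t kHookJumpSize = 5;

// Hypothetical helper: maps an offset in the original, unmodified function to
// the offset where a hook jump for that location sits (or would be inserted)
// in the current, already-hooked function.
std::size_t ResolveHookOffset(const std::set<std::size_t>& installedOriginalOffsets,
                              std::size_t originalOffset) {
    // Each hook installed strictly before this offset pushed it forward by
    // one jump; hooks at the same or a later offset do not move it.
    const auto firstNotEarlier = installedOriginalOffsets.lower_bound(originalOffset);
    const auto earlierHooks = static_cast<std::size_t>(
        std::distance(installedOriginalOffsets.begin(), firstNotEarlier));
    return originalOffset + earlierHooks * kHookJumpSize;
}
```

Under this model, callers only ever pass offsets taken from the unmodified function, and the order in which hooks were installed does not change the result.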

More thoughts

An alternate approach could have been to simply insert the hook calls directly into the function and update affected jump targets in the function. The main downside would be that the amount to update the jump targets would depend on the length of the code needed to invoke the hook, which would make changing that code fragile and error-prone. An unconditional jump will always be the same size even if the hook invocation code changes or needs to vary based on the hook itself for some reason.

…redefinedHookOffset::Return would result in two separate hooks in the map - though they would be called correctly, if a function from the first hook set a return jump, it would skip the functions from the second hook, which would violate the contract that all hook functions at a location are called when a return jump is set.
Contributor

@Th3Fanbus Th3Fanbus left a comment


Looks fine to me but I would like @mircearoata to take another look

Member

@mircearoata mircearoata left a comment


These below aren't things that you should implement in this PR, but I'll write them down here so they aren't lost if anyone ever wants to tackle them.

While looking into the uses of the missing EX_SkipOffsetConst fixup, I remembered that events have function stubs that call into ExecuteUbergraph at an offset, which led to discovering that EX_ComputedJump is only ever used in ExecuteUbergraph as the first statement, since the stub passes an int (not SkipOffsetCode, which is why I was looking into where EX_SkipOffsetConst is used in the first place). Now, this does mean that the argument to EX_ComputedJump can only be an int property, and so it can be adjusted if event hooks follow the function call to ExecuteUbergraph to hook the event at a specific offset.
This whole thing sounds rather convoluted, since the provided offset would be relative to the event offset in the Ubergraph (iirc all of an event's bytecode is a contiguous section of the Ubergraph), and it would likely need a different API to make the distinction clear, but I think it could have use cases.

Going through JSON when disassembling the bytecode feels a bit unnecessary, and I'm not sure why Arch went for JSON initially. To me it seems like using UObjects or structs to represent the bytecode data would be much easier to consume, with the drawback of potentially creating a lot of UObjects (one per expression, not sure how large functions can get).
Regarding the public API of the disassembler and the JSON output: as far as I know, no mod uses the disassembler directly, since there's not much reason to; it's only useful for hooking functions. The Asset Dumper does not use it to dump the function bytecode either. It has its own implementation, which is the same as SML's (and is why the SML version has SML in the struct name), so that it can work on any game.

@Epp-code Epp-code force-pushed the epp-rework-blueprinthookmanager branch from fcff71c to c0631d4 Compare November 22, 2024 23:52
…ctions and fixed sanity check on ResolvedHookOffset when inserting a hook. Also cleaned up some additional usages of GetNumberField that should have been GetIntegerField.
@mircearoata mircearoata merged commit dd64e7c into satisfactorymodding:dev Nov 30, 2024