This repository has been archived by the owner on Dec 30, 2021. It is now read-only.

optimize how we patch blocking syscalls #23

Open
wangbj opened this issue Mar 5, 2019 · 6 comments
Labels
enhancement New feature or request

Comments

@wangbj
Collaborator

wangbj commented Mar 5, 2019

With the current design, if a syscall blocks, systrace doesn't patch it until it returns. The reason is that if we did patch while the original syscall was blocked, then after it resumes it would see invalid instructions after the two-byte syscall instruction. The best case is that we get SIGILL or SIGSEGV; the worst case is that the trailing three bytes form a valid instruction sequence, which leads to undefined behavior.
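
To make the hazard concrete, here is a minimal sketch in Rust. It assumes the patch overwrites the two-byte `syscall` plus the three bytes after it with a five-byte `call rel32`; the thread only mentions "two bytes" and "three trailing bytes", so the exact patch encoding is an assumption. The byte buffer is a toy stand-in for the tracee's code, and all function names are hypothetical, not systrace's.

```rust
use std::convert::TryFrom;

/// x86-64 encoding of the `syscall` instruction (two bytes).
const SYSCALL_INSN: [u8; 2] = [0x0f, 0x05];

/// Size of `call rel32`: one opcode byte plus a 32-bit displacement.
/// Assumption: the patch is a call of this form.
const CALL_REL32_LEN: usize = 5;

/// Overwrite the `syscall` at `site` (and the three bytes after it) with a
/// `call rel32` to `target`. Offsets index into `code`, a toy model of the
/// tracee's text segment.
fn patch_syscall_site(code: &mut [u8], site: usize, target: usize) -> Result<(), String> {
    if code.len() < site + CALL_REL32_LEN {
        return Err("not enough bytes after the syscall site".into());
    }
    if code[site..site + 2] != SYSCALL_INSN[..] {
        return Err("no syscall instruction at this site".into());
    }
    // rel32 is relative to the address of the instruction following the call.
    let rel = target as i64 - (site + CALL_REL32_LEN) as i64;
    let rel = i32::try_from(rel).map_err(|_| "target out of rel32 range".to_string())?;
    code[site] = 0xe8; // opcode of `call rel32`
    code[site + 1..site + CALL_REL32_LEN].copy_from_slice(&rel.to_le_bytes());
    Ok(())
}

/// Where a syscall that was already blocked in the kernel resumes:
/// right after the original two-byte instruction.
fn resume_offset(site: usize) -> usize {
    site + SYSCALL_INSN.len()
}

/// True if `pc` falls strictly inside the patched instruction, i.e. the CPU
/// would start decoding in the middle of the call's displacement bytes.
fn lands_inside_patch(site: usize, pc: usize) -> bool {
    pc > site && pc < site + CALL_REL32_LEN
}
```

For any patched site, `lands_inside_patch(site, resume_offset(site))` is true: a syscall that was blocked at patch time resumes at `site + 2`, which is in the middle of the new instruction.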

Though we still cannot patch while a syscall is blocked, we can make the blocking window a lot shorter, for example by modifying the syscall parameters to make the call non-blocking. Another approach is to patch certain syscalls beforehand, so that we don't have to worry about them later.
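
As a rough illustration of the "modify the parameters" idea, the sketch below rewrites the flags argument of a few x86-64 Linux syscalls that accept a per-call non-blocking flag. The function name and the choice of syscalls are assumptions for illustration only; the thread does not say which syscalls systrace would rewrite, and a real tracer would also have to handle the resulting EAGAIN (or a zero return from `wait4`) and retry.

```rust
/// x86-64 Linux syscall numbers used in this sketch.
const SYS_SENDTO: u64 = 44;
const SYS_RECVFROM: u64 = 45;
const SYS_WAIT4: u64 = 61;

/// Flag values from the Linux headers.
const MSG_DONTWAIT: u64 = 0x40;
const WNOHANG: u64 = 0x1;

/// Hypothetical helper: return a non-blocking variant of the six syscall
/// arguments, or `None` when the syscall has no per-call flag this sketch
/// knows about.
fn make_nonblocking(nr: u64, args: [u64; 6]) -> Option<[u64; 6]> {
    let mut out = args;
    match nr {
        // sendto/recvfrom(sockfd, buf, len, flags, ...): flags is argument 3.
        SYS_SENDTO | SYS_RECVFROM => out[3] |= MSG_DONTWAIT,
        // wait4(pid, wstatus, options, rusage): options is argument 2.
        SYS_WAIT4 => out[2] |= WNOHANG,
        _ => return None,
    }
    Some(out)
}
```

Plain `read` and `write` take no flags argument, so this sketch returns `None` for them; blocking pipe I/O would need a different mechanism than a per-call flag.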

Building glibc can easily expose this issue: the build process seems to create tons of pipes, which causes lots of blocking reads/writes.

@wangbj wangbj added the enhancement New feature or request label Mar 5, 2019
@rrnewton
Collaborator

rrnewton commented Mar 5, 2019

@wangbj - I need you to unpack this for me a bit further, because I don't understand why we need to ever allow the code to return to the instruction after the original syscall (PC = orig_syscall + 2).

If we turn the very first attempt to execute the syscall into a trap, then the handler runs before the syscall ever gets to execute -- effectively a prehook. If we do ultimately execute a blocking syscall, it should be via the untraced_syscall function, right? We should always call the captured_syscall function, irrespective of how the event was intercepted (trap or patched code site), right? There's not some way that individual syscall invocations slip through the cracks and don't get intercepted, is there? (Which would mean they can genuinely block at the syscall PC.)

In fact, I think the following theorem should hold in general:

  • Theorem: No syscall in the original app should ever be executed from its original address in the code (except the single instruction inside the body of untraced_syscall)

If this theorem is false for our design (and worse, cannot be made true), then I want to understand why.

@wangbj
Collaborator Author

wangbj commented Mar 5, 2019

You're right, there's a bug when handling ptrace_event_exec: the patched_syscalls field should be zeroed, because exec* replaces the entire program's code/data. The issue you mentioned should be fixed by commit 80e47d6. But we still need to patch the same syscalls repeatedly for every new exec*-ed process.

@rrnewton
Collaborator

Wait, so does the theorem hold? It's hard for me to understand how that linked patch connects to the issue of patching blocking syscalls (which is an issue even if we never call fork/exec, right?).

@wangbj
Collaborator Author

wangbj commented Mar 11, 2019

I believe so. There's a patched_syscall member for each task (or tracee) that keeps a record of patched syscall sites. When we exec, this field should have been cleared, because the old patched_syscall doesn't apply to the new task (at least for now), as exec just creates a brand new context.

To elaborate, we won't try to patch a syscall site if it was already recorded in patched_syscall, which is why we saw lots of syscalls going through seccomp instead.
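
The bookkeeping described above can be sketched roughly as follows. This is a simplified model, not the actual systrace code: the `Task` type, its method names, and the use of a `HashSet<u64>` of site addresses are all assumptions; only the idea of a per-task record that must be cleared on exec comes from the discussion.

```rust
use std::collections::HashSet;

/// Hypothetical per-tracee state: addresses of syscall sites already patched.
struct Task {
    patched_syscalls: HashSet<u64>,
}

impl Task {
    fn new() -> Self {
        Task { patched_syscalls: HashSet::new() }
    }

    /// A site is only patched if it has not been recorded before; otherwise
    /// the syscall keeps going through the slower seccomp path.
    fn should_patch(&self, site: u64) -> bool {
        !self.patched_syscalls.contains(&site)
    }

    /// Remember that `site` has been patched in this address space.
    fn record_patch(&mut self, site: u64) {
        self.patched_syscalls.insert(site);
    }

    /// exec replaces the whole address space, so every recorded site is
    /// stale and the record has to be cleared.
    fn on_exec(&mut self) {
        self.patched_syscalls.clear();
    }
}
```

Without the `on_exec` reset, a site in the new program image that happens to share an address with an old entry would never be patched, which matches the symptom of syscalls repeatedly falling back to seccomp.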

@chamibuddhika

I have a query which I feel is related to this discussion. What would be the control flow in which handle_syscall_exit is reached?

https://github.com/iu-parfunc/systrace/blob/58e6b261c9035cc912c61847255289f1ae8b0530/src/traced_task.rs#L770

@wangbj
Collaborator Author

wangbj commented Apr 19, 2019

> I have a query which I feel is related to discussion. What would be the control flow in which handle_syscall_exit reached?

This is the seccomp syscall-exit stop. It is caused by calling ptrace(PTRACE_SYSCALL, pid, ...) while the tracee is in the seccomp syscall-enter stop.
