bpf: put bpf_link's program when link is safe to be deallocated #7997
base: bpf-next_base
In general, BPF link's underlying BPF program should be considered to be reachable through attach hook -> link -> prog chain, and, pessimistically, we have to assume that as long as link's memory is not safe to free, attach hook's code might hold a pointer to BPF program and use it.

As such, it's not (generally) correct to put link's program early before waiting for RCU GPs to go through. More eager bpf_prog_put() that we currently do is mostly correct due to BPF program's release code doing similar RCU GP waiting, but as will be shown in the following patches, BPF program can be non-sleepable (and, thus, reliant on only "classic" RCU GP), while BPF link's attach hook can have sleepable semantics and needs to be protected by RCU Tasks Trace, and for such cases BPF link has to go through RCU Tasks Trace + "classic" RCU GPs before being deallocated. And so, if we put BPF program early, we might free BPF program before we free BPF link, leading to use-after-free situation.

So, this patch defers bpf_prog_put() until we are ready to perform bpf_link's deallocation. At worst, this delays BPF program freeing by one extra RCU GP, but that seems completely acceptable. Alternatively, we'd need more elaborate ways to determine BPF hook, BPF link, and BPF program lifetimes, and how they relate to each other, which seems like an unnecessary complication.

Note, for most BPF links we still will perform eager bpf_prog_put() and link dealloc, so for those BPF links there are no observable changes whatsoever. Only BPF links that use deferred dealloc might notice slightly delayed freeing of BPF programs.

Also, to reduce code and logic duplication, extract program put + link dealloc logic into bpf_link_dealloc() helper.

Tested-by: Jordan Rife <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
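The ordering argued for above can be illustrated with a small user-space model. This is only a sketch: every `model_*` name is hypothetical, nothing here is kernel code, and the "grace period" is reduced to a single pending callback. It shows the one property the patch is after, namely that for a deferred link the program reference is dropped inside the dealloc step and not at release time.

```c
/* User-space model only: none of these types or functions are the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_prog { int refcnt; bool freed; };
struct model_link { struct model_prog *prog; bool freed; };

/* Drop one reference; the program is freed when the count hits zero. */
static void model_prog_put(struct model_prog *prog)
{
	assert(prog->refcnt > 0);
	if (--prog->refcnt == 0)
		prog->freed = true;
}

/* Plays the role described for bpf_link_dealloc(): put the program,
 * then free the link, in that order and in one place. */
static void model_link_dealloc(struct model_link *link)
{
	if (link->prog)
		model_prog_put(link->prog);
	link->freed = true;
}

/* A single pending callback standing in for call_rcu()-style deferral. */
struct model_deferred { struct model_link *link; bool pending; };

/* Release path: a deferred link only queues work, so its program
 * stays referenced; an eager link is torn down immediately. */
static void model_link_free(struct model_link *link, bool deferred,
			    struct model_deferred *queue)
{
	if (deferred) {
		queue->link = link;
		queue->pending = true;
	} else {
		model_link_dealloc(link);
	}
}

/* Stands in for the grace period elapsing and the callback running. */
static void model_grace_period_elapsed(struct model_deferred *queue)
{
	if (queue->pending) {
		queue->pending = false;
		model_link_dealloc(queue->link);
	}
}
```

In this model, calling `model_link_free()` with `deferred` set leaves both the link and the program alive until `model_grace_period_elapsed()` runs, while the eager path frees both at once, matching the "no observable changes" note for links without deferred dealloc.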
BPF link's lifecycle protection scheme depends on both BPF hook and BPF program. If *either* of those requires RCU Tasks Trace GP, then we need to go through a chain of GPs before putting BPF program refcount and deallocating BPF link memory.

This patch adds bpf_link-specific sleepable flag, which can be set to true even if underlying BPF program is not sleepable itself. If either link->sleepable or link->prog->sleepable is true, we'll go through a chain of RCU Tasks Trace GP and RCU GP before putting BPF program and freeing memory.

This will be used to protect BPF link for sleepable (faultable) raw tracepoints in the next patch.

Tested-by: Jordan Rife <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
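The decision described above boils down to a small predicate. The sketch below models it in user space; the `model_*` types, the enum, and the `has_deferred_dealloc` field are hypothetical stand-ins, and the assumption that a non-sleepable deferred link waits for a single classic RCU GP comes from the series description, not from code shown here.

```c
/* User-space model only: names and types are illustrative assumptions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_free_path {
	MODEL_FREE_NOW,          /* no deferred dealloc: put prog, free link */
	MODEL_FREE_AFTER_RCU,    /* wait for one classic RCU GP */
	MODEL_FREE_AFTER_TT_RCU, /* RCU Tasks Trace GP, then classic RCU GP */
};

struct model_prog { bool sleepable; };

struct model_link {
	struct model_prog *prog;
	bool sleepable;            /* link's own flag, independent of prog */
	bool has_deferred_dealloc; /* link type opted into deferred freeing */
};

/* Pick the freeing path: either flag being set forces the full chain. */
static enum model_free_path model_pick_free_path(const struct model_link *link)
{
	if (!link->has_deferred_dealloc)
		return MODEL_FREE_NOW;
	if (link->sleepable || (link->prog && link->prog->sleepable))
		return MODEL_FREE_AFTER_TT_RCU;
	return MODEL_FREE_AFTER_RCU;
}
```

The point of keeping a separate link-level flag is visible in the second branch: a non-sleepable program attached through a sleepable hook still takes the longer path.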
Now that the kernel supports sleepable tracepoints, and given that bpf_probe_unregister() is asynchronous, i.e., it doesn't wait for any in-flight tracepoints to conclude before returning, we need to delay BPF raw tp link's deallocation and bpf_prog_put() of its underlying BPF program (regardless of program's own sleepable semantics) until after a full RCU Tasks Trace GP. With that GP over, we'll have a guarantee that no tracepoint can reach BPF link and thus its BPF program.

We use the newly added tracepoint_is_faultable() check to know when this RCU Tasks Trace GP is necessary and utilize BPF link's own sleepable flag passed through bpf_link_init_sleepable() initializer.

Tested-by: Jordan Rife <[email protected]>
Reported-by: Jordan Rife <[email protected]>
Fixes: a363d27cdbc2 ("tracing: Allow system call tracepoints to handle page faults")
Signed-off-by: Andrii Nakryiko <[email protected]>
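How the faultable check feeds the link's flag can be sketched in user space as follows. All `model_*` names are hypothetical; the real tracepoint_is_faultable() and bpf_link_init_sleepable() take kernel types and more arguments than this model does, so treat the signatures here as assumptions made purely for illustration.

```c
/* User-space model only: simplified stand-ins for the kernel helpers. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_tracepoint { bool faultable; };
struct model_prog { bool sleepable; };
struct model_link { struct model_prog *prog; bool sleepable; };

/* Stand-in for the tracepoint_is_faultable() check named in the patch. */
static bool model_tracepoint_is_faultable(const struct model_tracepoint *tp)
{
	return tp->faultable;
}

/* Stand-in for bpf_link_init_sleepable(): the caller supplies the
 * link's own sleepable flag alongside the program. */
static void model_link_init_sleepable(struct model_link *link,
				      struct model_prog *prog, bool sleepable)
{
	link->prog = prog;
	link->sleepable = sleepable;
}

/* Raw tp attach: the link inherits sleepable semantics from the hook,
 * regardless of whether the program itself is sleepable. */
static void model_raw_tp_link_init(struct model_link *link,
				   struct model_prog *prog,
				   const struct model_tracepoint *tp)
{
	model_link_init_sleepable(link, prog,
				  model_tracepoint_is_faultable(tp));
}

/* True when freeing must wait for an RCU Tasks Trace GP first. */
static bool model_needs_tasks_trace_gp(const struct model_link *link)
{
	return link->sleepable || (link->prog && link->prog->sleepable);
}
```

With a non-sleepable program on a faultable tracepoint, `model_needs_tasks_trace_gp()` returns true in this model, which is the case the fix targets; on a non-faultable tracepoint the same program keeps the shorter path.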
Pull request for series with
subject: bpf: put bpf_link's program when link is safe to be deallocated
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=905159