caprevoke: Avoid crashing if the revoker encounters a kernel capability #2319

markjdb · 2025-02-06T15:08:10Z

This mitigates a bug in the DTRACEIOC_EPROBE ioctl handler (which I have since fixed locally) that caused kernel panics when running certain dtrace scripts.

Kernel bugs, especially in ioctl handlers, can lead to a kernel capability being leaked to userspace. When the revoker encounters such a capability it ends up accessing memory beyond the end of the bitmap, which may trigger a panic, e.g., if the vm_cheri_revoke_tlb_fault() fault handler is invoked in FAST_COPYIN mode. Such bugs ought to be fixed, of course, but panicking the system is not really desireable, especially since the resulting crash dump doesn't help much in determining how the capability was leaked in the first place. Explicitly check for out-of-bounds capabilities before accessing the shadow bitmap and print a warning to the console instead.

jrtc27 · 2025-02-06T15:51:27Z

Hm, should this not be a deterministic panic? If this happens then the kernel has just found out that it has screwed up enforcing its security.

markjdb · 2025-02-06T16:01:13Z

Hm, should this not be a deterministic panic? If this happens then the kernel has just found out that it has screwed up enforcing its security.

My interpretation of @rwatson 's comments yesterday is that panicking is probably too severe a reaction. Maybe I misunderstood. This problem is not uncommon, especially given that FreeBSD doesn't consider kernel address leaks to be bugs. And, panicking makes it harder to actually debug the issue in my experience.

I'm happy to change the policy here. Perhaps a security.cheri sysctl to control the behaviour here is warranted.

nwf · 2025-02-06T16:53:14Z

Leaking kernel addresses is one thing, but kernel capabilities are... spicy. The MMU probably means they're probably equivalent to addresses while they're in userspace, but userspace being able to present one back to the kernel seems... risky?

jrtc27 · 2025-02-06T17:03:30Z

Especially spicy for the process if the bounds of those capabilities include part of userspace, or a sealing range.

brooksdavis · 2025-02-06T17:50:07Z

My inclination is that the revoker should kill the process with an option to panic.

@qwattash's experiment to use local-global to prevent kernel capability leaks is probably relevant here. We could also do something heavy handed in the ioctl code and have some sort of checked copyout.

jrtc27 · 2025-02-06T17:51:55Z

My hesitation with suggesting that was that other processes may also be sharing that memory, or copies thereof, so killing the process stops that one from using it, but you now know there is a real chance of other processes having that capability too.

qwattash · 2025-02-06T19:18:15Z

My inclination is that the revoker should kill the process with an option to panic.

@qwattash's experiment to use local-global to prevent kernel capability leaks is probably relevant here. We could also do something heavy handed in the ioctl code and have some sort of checked copyout.

The patch was relatively simple, perhaps it is worth to merge it and enable it? I think there are currently 2 limitations:

I may have missed some place where user memory is mapped into the kernel.
If we start using local/global in userspace, you can't send local capabilities to the kernel and expect to get them back anymore.

brooksdavis · 2025-02-06T22:32:04Z

My inclination is that the revoker should kill the process with an option to panic.
@qwattash's experiment to use local-global to prevent kernel capability leaks is probably relevant here. We could also do something heavy handed in the ioctl code and have some sort of checked copyout.

The patch was relatively simple, perhaps it is worth to merge it and enable it? I think there are currently 2 limitations:

I may have missed some place where user memory is mapped into the kernel.

This seems like an omnipresent risk so I'm not sure there's more do be done.

If we start using local/global in userspace, you can't send local capabilities to the kernel and expect to get them back anymore.

I think this is ok so long as there's a way to turn of the enforcement at the user/kernel boundary for experimentation. I think there's broad agreement that the 2-color scheme isn't sufficient in practice for machines with MMUs.

brooksdavis · 2025-02-06T23:35:00Z

My hesitation with suggesting that was that other processes may also be sharing that memory, or copies thereof, so killing the process stops that one from using it, but you now know there is a real chance of other processes having that capability too.

It is true that killing the process isn't fully safe, but I agree with Mark that it's likely hard to debug the problem if you just panic. At a minimum I think we should make killing the process an option.

I almost wonder if we want some sort of "scan all the vmspaces" functionality we could trigger in the event that we stumble upon a leak.

markjdb requested review from brooksdavis and nwf February 6, 2025 15:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

caprevoke: Avoid crashing if the revoker encounters a kernel capability #2319

caprevoke: Avoid crashing if the revoker encounters a kernel capability #2319

markjdb commented Feb 6, 2025

jrtc27 commented Feb 6, 2025

markjdb commented Feb 6, 2025

nwf commented Feb 6, 2025

jrtc27 commented Feb 6, 2025 •

edited

Loading

brooksdavis commented Feb 6, 2025

jrtc27 commented Feb 6, 2025

qwattash commented Feb 6, 2025

brooksdavis commented Feb 6, 2025

brooksdavis commented Feb 6, 2025

caprevoke: Avoid crashing if the revoker encounters a kernel capability #2319

Are you sure you want to change the base?

caprevoke: Avoid crashing if the revoker encounters a kernel capability #2319

Conversation

markjdb commented Feb 6, 2025

jrtc27 commented Feb 6, 2025

markjdb commented Feb 6, 2025

nwf commented Feb 6, 2025

jrtc27 commented Feb 6, 2025 • edited Loading

brooksdavis commented Feb 6, 2025

jrtc27 commented Feb 6, 2025

qwattash commented Feb 6, 2025

brooksdavis commented Feb 6, 2025

brooksdavis commented Feb 6, 2025

jrtc27 commented Feb 6, 2025 •

edited

Loading