Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

init: optionally load the system SELinux policy #400

Open
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

WavyEbuilder
Copy link

Implements #399 . Currently a draft PR for some of the reasons noted in that issue. Another thing to add:

  • I saw mentioned here, Replace use of iostream-based classes #240 (comment), that you don't want to be falling back to C interfaces. For now, I'm just calling fprintf if the SELinux policy fails to load (as /dev/console is likely unable to be accessed at that point). Would you prefer std::cout to be used for now in this case?

@WavyEbuilder WavyEbuilder force-pushed the master branch 2 times, most recently from 5952f94 to 35df218 Compare October 15, 2024 20:53
@WavyEbuilder
Copy link
Author

Current TODO (just to note I am aware of it) is to link against libselinux in the Makefile.

@WavyEbuilder
Copy link
Author

I've added an --enable--selinux option to the configure script, however I am not really sure how best to go about linking against libselinux as it is an optional dependency. Any advice in this regard?

@WavyEbuilder
Copy link
Author

Think I've got that sorted by passing a linker flag.

@davmac314
Copy link
Owner

I will review properly when I get a chance. I don't know a heap about SELinux so bare with me. A couple of things I will point out now though:

  • Definitely avoid using fprintf, it's best if you can keep to the conventions already used throughout. cerr would normally be the right option for a definite error message, however, output should generally go via the log interface instead of directly via cout/cerr. Eg log(loglevel_t::ERROR, "(error message here)") possibly followed by flush_log() if this is before a call to exit().
  • I'm not a fan of the re-exec approach either. If the correct label can be assigned to the already-running executable (i.e. what systemd apparently does), it seems like we should just do that.
  • I'm inclined to think that there should either be a command-line option to disable loading the policy, or that should be the default and there should a command-line option to enable it.
  • Probably the whole setup should be moved to its own separate function

@WavyEbuilder
Copy link
Author

WavyEbuilder commented Oct 16, 2024

Hey! Appreciate the fast response.

Definitely avoid using fprintf, it's best if you can keep to the conventions already used throughout. cerr would normally be the right option for a definite error message, however, output should generally go via the log interface instead of directly via cout/cerr. Eg log(loglevel_t::ERROR, "(error message here)") possibly followed by flush_log() if this is before a call to exit().

A lot of the other init systems mentioned /dev/console not being available at that point in time (which is a reasonable assumption as if the policy fails to load there is a good chance SELinux would block access to /dev/console). I had a quite glance through dinit-log.cc and it appears like it just uses stdout - would this be correct?

I'm not a fan of the re-exec approach either. If the correct label can be assigned to the already-running executable (i.e. what systemd apparently does), it seems like we should just do that.

I'll make sure to change that to use setcon_raw(3)

I'm inclined to think that there should either be a command-line option to disable loading the policy, or that should be the default and there should a command-line option to enable it.

Generally the command-line option is provided to the kernel cmdline (which the selinux_init_load_policy parses and handles for us). The main options are enforcing=0 (force SELinux to boot in enforcing mode regardless of /etc/selinux/config) and selinux=0 (disable selinux alltogether). So as far as I can tell, all of that should be handled for us

Probably the whole setup should be moved to its own separate function

I'll make sure to do that

@WavyEbuilder
Copy link
Author

Also, would you like me to commit any changes as separate commits until you are happy with it so you can see the diffs between changes a bit easier, or would rebasing be preferred?

@WavyEbuilder
Copy link
Author

Alright took a look at this a bit more thoroughly and I have a few design questions to raise quickly, notably regarding Probably the whole setup should be moved to its own separate function:

  1. Shall we plan ahead now for other security frameworks? I'm only really an SELinux guy as I mentioned in Load security policies for LSMs #399 , but it might be worth doing similar to systemd, i.e.:
static int initialize_security(
                bool *loaded_policy,
                dual_timestamp *security_start_timestamp,
                dual_timestamp *security_finish_timestamp,
                const char **ret_error_message);

Then we could just call our implementation of a similar function once in dinit_main (ideally as early as possible, being security frameworks it makes sense to attempt to load them as early as possible). Now given that we have C++ to hand here, I was wondering how to go about propagating errors back. Being C++11, we don't have anything like std::optional, but taking a look at some functions that can return failure in dinit's source, they appear to return a bool. My original idea was something like std::optional<std::string> (and returning an error message on failure, otherwise std::nullopt), but of course that is only available in C++17. Would something like char * work here instead then maybe (returning nullptr on success... though that's a bit strange granted)? Or is the specific failure message relevant at all here and should we just return a bool to indicate success status?

  1. Regarding headers for this, I was thinking maybe something along the lines of:
    src/includes/mac-util.h // contains initialize_security, might be worth just putting it in dinit-util.h?
    src/includes/dinit-mac/selinux.h // SELinux helper functions so we could (like systemd again) just quickly call a generic setup function for each MAC framework.

I think that'd be a reasonable way of structuring things, but being my first PR to dinit and my first real work on a C++ project below 14, your input would be greatly appreciated.

Thanks!

@davmac314
Copy link
Owner

taking a look at some functions that can return failure in dinit's source, they appear to return a bool ... Would something like char * work here instead then maybe (returning nullptr on success... though that's a bit strange granted)? Or is the specific failure message relevant at all here and should we just return a bool to indicate success status?

Log any relevant message (via log(...) functions), then return a bool (usually false for failure).

Shall we plan ahead now for other security frameworks?

If we're just talking about adding a single method, I don't see any advantage in doing that now, as it can easily be done if and when support for other security frameworks are added. Let's just keep it simple.

Generally the command-line option is provided to the kernel cmdline (which the selinux_init_load_policy parses and handles for us). The main options are enforcing=0 (force SELinux to boot in enforcing mode regardless of /etc/selinux/config) and selinux=0 (disable selinux alltogether). So as far as I can tell, all of that should be handled for us

Ok, that sounds fine.

@davmac314
Copy link
Owner

Generally the command-line option is provided to the kernel cmdline (which the selinux_init_load_policy parses and handles for us). The main options are

Actually, thinking about this more, there may still be cases where SELinux is enabled but the loading of the policy should not be performed. One example is given by the Systemd code here.

Also, can you clarify what this might mean? I want to know what file descriptors SELinux would keep open and in what circumstances.

Are you doing this work for a distribution or is it a more personal endeavour?

@davmac314
Copy link
Owner

(Finally: make sure you have read CONTRIBUTING and CODE-STYLE documents, if you haven't already. Thanks!).

@WavyEbuilder
Copy link
Author

WavyEbuilder commented Oct 17, 2024

Log any relevant message (via log(...) functions), then return a bool (usually false for failure).

Got it, thanks :)

If we're just talking about adding a single method, I don't see any advantage in doing that now, as it can easily be done if and when support for other security frameworks are added. Let's just keep it simple.

Yup, makes sense, should be fairly easy with to deal in the future. Will do!

Actually, thinking about this more, there may still be cases where SELinux is enabled but the loading of the policy should not be performed. One example is given by the Systemd code here.

That's a good example. For that specific initrd case, I had a look at systemd's in_initrd(void) function and found this comment:

/* If /etc/initrd-release exists, we're in an initrd.
 * This can be overridden by setting SYSTEMD_IN_INITRD=0|1.
 */ 

Would you like to do the same with an override here? (maybe something more like DINIT_IN_INITRD)

If you'd still like to add a flag override of some sorts to tell dinit to not load the selinux policy, I feel like that should be an opt in sort of thing, because while not loading the selinux policy won't load any of the user's policy, it doesn't mean selinux won't be loaded. In that case, everything runs in the kernel context, and taking my distro (Gentoo) as an example, Portage fails to operate if I just run with the system booted in the kernel context, so that is very much an edge case and almost never intended to be a reality. But the initrd case makes sense.

Also, can you clarify what this might mean? I want to know what file descriptors SELinux would keep open and in what circumstances.

Afaik this is generally in the case of when it needs to audit something related to that fd so it keeps it open for a bit? I remember seeing something similar in the past, but I'm not exactly too happy with that answer as I can't answer it with 100% confidence, so if that's okay I'll do a bit of digging in the libselinux docs and get back to you on that.

Are you doing this work for a distribution or is it a more personal endeavour?

I'm currently using Gentoo and dinit on basically all of my systems (which doesn't have upstream Gentoo support currently), however I am hoping to try improving the support (and maybe possibly getting official support) for dinit for both Adelie and Gentoo (nothing offical, just me on my own there, though I have spoken to some developers of the respective distros about that, but absoloutly nothing really offical yet). This specific piece of work was started when I hit a few weird bugs with my SELinux policy while dealing with a few ebuilds for dinit related to SELinux (policy for it), so in the future as Gentoo has offical SELinux support it might be useful, but for now really just take it as a personal endeavour, I'm not affiliated with any distro officially :)

(Finally: make sure you have read CONTRIBUTING and CODE-STYLE documents, if you haven't already. Thanks!).

Had a read over them already, but I'll make sure to reread CODE-STYLE

@WavyEbuilder
Copy link
Author

Did a bit of digging and found this commit systemd/systemd@a3dff21 which seems to explain it quite nicely.

@WavyEbuilder
Copy link
Author

Hmm reading the setcon(3) man page:

sets  the  current  security context of the process to a new value.  Note that use of this function requires that the entire application be trusted to maintain any desired separation between the old and new security contexts, unlike exec-based transitions performed via setexeccon(3).  When possible, decompose your application and use setexeccon(3) and execve(3) instead.

Still seems fine to transition with setcon_raw(3), but it probably makes sense to make sure to do it as early as possible, before we open any other file descriptors. I'll push a new commit using setcon_raw(3) shortly.

@WavyEbuilder
Copy link
Author

I think it'll be best to consider how SELinux aware we want dinit to be at this stage. After reading some more man pages and systemd code, it seems like systemd transitioning itself to the new context is an okay option as they already make heavy use of selinux throughout (i.e. in transient units which have an SELinuxContext= option). However, if our only goal (at least in the short-term) is to load the policy, then I think it makes sense to stick with the execve(3) solution.

If we want to be a little bit more SELinux aware, (i.e. if it is not unforeseeable to make use of libselinux more throughout dinit), then I would probably start off by creating an selinux utils header of some point as there is quite a bit of setup, etc that'll need to be done.

My personal opinion would be (for the sake of simplicity) to stick with the execve(3) solution as it'd be a fair amount simpler for now. In the future, if something like the dinit-run and transient units mentioned in #253 (comment) gets implemented, we could add support for launching said units with a specific selinux context (I'd be happy to work on something like that), and at that point as that'd require a decent amount of setup anyway, it'd make sense to just use setcon_raw(3) at that point. I'm guessing the (relatively) unneeded complexity of correctly transitioning ourself to the right domain after loading the policy is why the other simpler init systems (openrc, sysvinit, etc) that don't really make use of a lot of SELinux's features just load the policy and relaunch themselves.

However, if you'd still like to continue with setting our own context, I can go down that route. It'd require a bit more design though, so it might be worth working on getting some helper functions stubbed out firstly.

@davmac314
Copy link
Owner

for now really just take it as a personal endeavour, I'm not affiliated with any distro officially :)

That's fine but it will need a commitment from you that you will support it going forward, or otherwise make it clear that it's experimental/unsupported in relevant documentation. (Incidentally I think you missed updating the build instructions - that's something that would need to be added).

To be honest I'm not sure I'm following your reasoning in a few ways:

it seems like systemd transitioning itself to the new context is an okay option as they already make heavy use of selinux throughout (i.e. in transient units which have an SELinuxContext= option)

I don't really understand why that makes a difference. What is the reason why this would not be a good option for dinit as well? (I get that Dinit doesn't provide specific support for SELinux features when executing service processes, but why does that make a difference as to the mechanics of how the policy is loaded?)

I'm fine with the policy loading happening very early in dinit's execution. But:

My personal opinion would be (for the sake of simplicity) to stick with the execve(3) solution

If we are going to call execve anyway, I'm not sure I see any point having the functionality in dinit itself. It could just as well be a wrapper (called init) that loads the policy and then execves dinit. Or am I missing something? Having dinit re-exec itself on each boot doesn't seem right to me at all.

I know you had a few other questions for me and I can go back to those, but I really need some clarification on these points. I don't want to be discouraging but it seems like there are a few details you're not really sure of yourself, and that gives me some pause. I'm hesitant to incorporate something where I really don't understand why things have been done the way they have. If there's open questions that you need to sort out, please feel free to take whatever time you need to do that, but let's get them sorted first and talk details of the code then.

@WavyEbuilder
Copy link
Author

Hey,

That's fine but it will need a commitment from you that you will support it going forward, or otherwise make it clear that it's experimental/unsupported in relevant documentation. (Incidentally I think you missed updating the build instructions - that's something that would need to be added).

I can commit to that, but would also be happy mentioning it is experimental.

I don't really understand why that makes a difference. What is the reason why this would not be a good option for dinit as well? (I get that Dinit doesn't provide specific support for SELinux features when executing service processes, but why does that make a difference as to the mechanics of how the policy is loaded?)

I think I phrased that badly earlier, I'll rephrase it a little here now. Systemd makes use of SELinux a lot inside it being quite SELinux aware so it already has a lot of boilerplate that will be used elsewhere. The reason why transitioning to the new context is a little more complex is because it requires a bit more setup, we are in a privileged domain at that point and we are sort of entrusting ourselves to do it right, so it'd require a fair amount more code I would think.

If we are going to call execve anyway, I'm not sure I see any point having the functionality in dinit itself. It could just as well be a wrapper (called init) that loads the policy and then execves dinit. Or am I missing something? Having dinit re-exec itself on each boot doesn't seem right to me at all.

An initramfs can work fine for this (and often is!) or some simpler pid 1 that's only job is to launch dinit properly with the right context, but (at least to me) that feels a little unnecessary.

I know you had a few other questions for me and I can go back to those, but I really need some clarification on these points. I don't want to be discouraging but it seems like there are a few details you're not really sure of yourself, and that gives me some pause.

That's fair, I did phrase it quite badly above. My main reasoning was based off the Let's just keep it simple. comment, so basically the only "problem" is that it's just a bit more work that we need to get right for (at least in my opinion) minimal gain. However there's nothing preventing us from doing that if you wish :) It's just a bit more code, so I thought I'd mention that.

I'm hesitant to incorporate something where I really don't understand why things have been done the way they have. If there's open questions that you need to sort out, please feel free to take whatever time you need to do that, but let's get them sorted first and talk details of the code then.

That makes sense. For now I'll just presume we're going down the route of transitioning ourselves to a new context and I'll push in a bit with an example of that and let you compare.

Incidentally I think you missed updating the build instructions - that's something that would need to be added).
I did miss that, my bad, I'll make sure to update that as well.

@davmac314
Copy link
Owner

davmac314 commented Oct 17, 2024

I think I phrased that badly earlier, I'll rephrase it a little here now. Systemd makes use of SELinux a lot inside it being quite SELinux aware so it already has a lot of boilerplate that will be used elsewhere. The reason why transitioning to the new context is a little more complex is because it requires a bit more setup, we are in a privileged domain at that point and we are sort of entrusting ourselves to do it right, so it'd require a fair amount more code I would think.

This, I guess, is what I don't understand. I can see that opening file descriptors before loading the policy might give access to things that the policy will then disallow but, if we are loading the policy quite early and we have opened file descriptors then in fact we do need those file descriptors. If the policy disallowed that access then that would be a broken policy anyway. Dinit doesn't go around just casually opening files. Likewise any other resource it has accessed, it probably needs. And anyway, as far as I can tell, applying the security label will enforce access against file descriptors that were opened previously anyway.

Eg from https://www.systutorials.com/docs/linux/man/3-setcon_raw/ -

Since access to file descriptors is revalidated upon use by SELinux, the new context must be explicitly authorized in the policy to use the descriptors opened by the old context if that is desired.

If the process was being ptraced at the time of the setcon() operation, ptrace permission will be revalidated against the new context and the setcon() will fail if it is not allowed by policy.

Given those are taken care of (and ptrace shouldn't be an issue anyway), and given that we'd be loading the policy early (before doing just about anything anyway), what would be the concerns in regards to "entrusting ourselves to do it right?" I'm after concrete examples.

An initramfs can work fine for this (and often is!) or some simpler pid 1 that's only job is to launch dinit properly with the right context, but (at least to me) that feels a little unnecessary.

To me it feels unnecessary to me add specific support (including a library dependency) for something in Dinit which can be handled just as well from outside, and re-executing our own process right after we start honestly just feels like a hack. At the moment my position on that is a "no", I would need to be given a good, concrete reason for why that should change.

Applying the security context within the already-running process without re-execing would be acceptable, though.

My main reasoning was based off the Let's just keep it simple. comment, so basically [...] It's just a bit more code, so I thought I'd mention that.

A little bit more code isn't an issue, if we have already gone as far as adding a dependency and providing support for SELinux then we may as well do it properly. My bad for the "keep it simple" comment which caused confusion - I meant, keep it restricted to the specific functionality that you are wanting to implement; we don't need abstraction layers for handling other security frameworks, etc.

But before you said:

so it'd require a fair amount more code I would think.

Is it a bit, or is it a fair amount? (or am I conflating two different things?)
(Edit: what I'm really asking is: is it a matter of say 3-4x the amount of code currently in the PR, all confined to a single function? Or are we talking sweeping changes throughout the codebase? If it's the former, that's totally fine).

@WavyEbuilder
Copy link
Author

WavyEbuilder commented Oct 18, 2024

As for entrusting ourselves to do it right, the main concern surrounds file descriptors. That is why I'm a little worried about the log interface. We don't want to leave any of our file descriptors open when we transition. Then in effect we would have a "context mismatch" otherwise, as the file descriptor inherits the context of the process it was opened by. So we would have issues regarding access control to existing contexts at that point . That's ideally why I'm thinking we load ourselves as early as possible. If we do that, we should be fine.

A good example is this. Imagine we start out with kernel context (what dinit starts out with on my system before the policy is loaded). We have some fd's opened, and the kernel context has permission to use them. Now we load the policy, transition ourselves, and the loaded policy executes us as init_t. In the Gentoo refpolicy, then that kernel context becomes kernel_t. Now SELinux will prevent us from using those open file descriptors.

(Edit: what I'm really asking is: is it a matter of say 3-4x the amount of code currently in the PR, all confined to a single function? Or are we talking sweeping changes throughout the codebase? If it's the former, that's totally fine).

Oh I see what you meant by keeping it simple now :) I was a bit confused with what you were after, but it shouldn't really require any sweeping changes for the codebase, it should all be confined to that function we'll make to load the policy, which will be the thing that's a bit longer. That's all good then, I can make the transition work.

I'll get to work on that, and if there are any concerns please let me know. Thanks for all the time you've given this so far, appreciated a lot.

@WavyEbuilder WavyEbuilder force-pushed the master branch 2 times, most recently from 5ddfc4d to d5295d9 Compare October 18, 2024 04:01
@WavyEbuilder
Copy link
Author

Alright I think I've got this working as desired now. I've just made a function selinux_transition that is above dinit_main for now so you can take a look, if you'd like me to move it let me know.

If anything is unclear/you feel any comments are needed, please let me know, and I'll make sure to add them.

@WavyEbuilder
Copy link
Author

(just updated an error message as i'm not longer using setexeccon_raw(3) and removed a close as i am no longer opening something else). Should be ready for review now

@WavyEbuilder
Copy link
Author

Just fixed another silly mistake, forgot to chance 0 and 1 to true and false as I changed the return type to a bool as you suggested. My apologies, actually ready for review now.

@WavyEbuilder
Copy link
Author

The documentation is in general pretty light. Is there anything more we need to say about SELinux either in the dinit man page or in a separate readme file? Eg: does the SELinux support require anything - does it need special filesystems (/proc and/or /sys) to be mounted before dinit starts, for example?

As for the special filesystems, selinux_init_load_policy(3) handles this all automatically. For example, taking a snippet from it's function body:

	/* Check for an override of the mode via the kernel command line. */
	rc = mount("proc", "/proc", "proc", 0, 0);

Including mounting what it needs (i.e. /sys/fs/selinux isn't needed to be mounted either). Everything else (such as the enforcing=0 cmdline parameter) is outside of the scope of init's documentation imo, as that is init system agnostic and documented elsewhere in SELinux's documentation. I suspect future things I want to work on will make some selinux specific documentation nice for them, but for this commit, it's basically just works outside of the --disable-selinux-policy flag.

@WavyEbuilder
Copy link
Author

Done, ready for review. Just a quick note, this comment:

//   exe - the path that we are invoked with (to calculate our new security context to transition into.)

extends the line wrapping to 105, and I saw you mentioned it was okay up until 110 chars if it means it can be fit on one line. I'm not sure how that applies here as that is part of a bigger comment block, but I think this should be kept all on one line? Let me know if you like me to change this.

@davmac314
Copy link
Owner

As for the special filesystems, selinux_init_load_policy(3) handles this all automatically. For example, taking a snippet from it's function body: [...]
Including mounting what it needs (i.e. /sys/fs/selinux isn't needed to be mounted either).

Holy mackerel. Dinit does not, and will not, do anything even halfway as disruptive to the system as mounting actual filesystems, completely behind the user's back.

On the selinux_transition function, I said:

I would like a reasonably extensive comment on this explaining what it does

I certainly would expect "mounts /proc and /sys" to be part of that comment if those are indeed being mounted by that function. And it must be documented that that is what happens, in the dinit man page.

The documentation is important, and so is the philosophy of the software being transparent about what it does.

@davmac314
Copy link
Owner

davmac314 commented Nov 1, 2024

Done, ready for review. Just a quick note, this comment:

//   exe - the path that we are invoked with (to calculate our new security context to transition into.)

extends the line wrapping to 105, and I saw you mentioned it was okay up until 110 chars if it means it can be fit on one line. I'm not sure how that applies here as that is part of a bigger comment block, but I think this should be kept all on one line? Let me know if you like me to change this.

What I said was: "may extend to 110 in cases where that means the entire comment will fit on a single line". Since this is a small part of the comment, and not the entire comment, then no, it should be maximum 100 characters and should wrap before going beyond that.

Edit: and, since we're mentioning that comment, can you please put it in the format required by CODE-STYLE? No blank lines between short description and "Parameters:" / "Returns:", longer description goes after the "Returns".

src/dinit.cc Outdated
if (enforce > 0) {
cerr << "Failed to load SELinux policy." << endl;
return false;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the load fails (selinux_init_load_policy returns 0) but enforce is 0, what is supposed to happen here? It seems like it just ploughs ahead with trying to set the context, but surely that isn't right?

Copy link
Author

@WavyEbuilder WavyEbuilder Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the enforcing variable, there are 2 states:

  • = 0 is permissive mode
  • > 0 is enforcing mode

In the case of enforcing, the system is expected to boot in a secure state, i.e. with the system's SELinux policy enforcing. If we fail to load it, we should halt the boot as that is an untrusted state.

In the case of permissive, it's generally better to be more lenient. In permissive mode, actions are expected to be logged, but any failures revolving around the policy aren't expected to have an impact on the system's operation. Other SELinux aware applications, such as load_policy(1) from policycoreutils also reflect this behavior:
https://github.com/SELinuxProject/selinux/blob/main/policycoreutils/load_policy/load_policy.c#L68-L77

Let me know your thoughts, I can add a comment explaining this too if desired

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not answering the question I asked. If the policy fails to load (selinux_init_load_policy returns 0) but enforce is 0, what is this function supposed to do?

The comment says:

If we fail to load the system SELinux policy, return false

But it's not even returning at this point, it's just going on through to the following call to (getcon_raw(&current_context). Isn't that wrong?

Other SELinux aware applications, such as load_policy(1) from policycoreutils also reflect this behavior

That code checks whether loading the policy failed, and if so, prints an error message and retuns an exit code:

https://github.com/SELinuxProject/selinux/blob/main/policycoreutils/load_policy/load_policy.c#L82-L85

That's different to what is here, which just continues as if no error had happened.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And when I say "continues as if no error had happened", I mean that it goes on to try and set a new context (getcon_raw, getfilecon_raw, security_compute_create_raw, etc). I assume that those will fail, since the policy load failed, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that those will fail, since the policy load failed, right?

Those won't fail as it'll just continue on with the kernel context.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know your thoughts

I know that you have done some work on this and are probably trying hard, but in all honesty I am very close to pulling the plug. I need you to clearly answer the questions that I'm asking, not just push more code.

The question I asked was:

So if the load fails (selinux_init_load_policy returns 0) but enforce is 0, what is supposed to happen here? It seems like it just ploughs ahead with trying to set the context, but surely that isn't right?

I still don't have a satisfactory answer to that. I can't accept this PR without understanding what it is doing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those won't fail as it'll just continue on with the kernel context

Are you saying that setcon_raw will do nothing but still return success?

I decided to clarify this a bit further, so getcon_raw seems to be fine (which makes logical sense), and setcon_raw will be fine, but the issue lies in security_compute_create_raw as that actually attempts to read the policy to calculate something, so that should fail. I tested it by disabling the load temporarily on my machine, and writing a small program to see what would/wouldn't fail regarding those functions. So logically based off all of this, I think the early return makes sense.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know your thoughts

I know that you have done some work on this and are probably trying hard, but in all honesty I am very close to pulling the plug. I need you to clearly answer the questions that I'm asking, not just push more code.

The question I asked was:

So if the load fails (selinux_init_load_policy returns 0) but enforce is 0, what is supposed to happen here? It seems like it just ploughs ahead with trying to set the context, but surely that isn't right?

I still don't have a satisfactory answer to that. I can't accept this PR without understanding what it is doing.

Understood. So:

So if the load fails (selinux_init_load_policy returns 0) but enforce is 0, what is supposed to happen here? It seems like it just ploughs ahead with trying to set the context, but surely that isn't right?

This is accurate, i.e. that wasn't right. I don't think we should fail (as in exit) there, and I incorrectly gathered that it would just fall through (because it'll attempt to set the same context). I overlooked the fact security_compute_create_raw specifically reads from the policy store, so that'd fail. So an early return of true would make sense. Sincere apologies

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sincere apologies

Ok, but: you need to slow down and read my questions/comments properly, and answer them comprehensively. I'm spending far more time on this that I would like, and it seems like it's mostly because I have to repeatedly ask questions until I get a straight answer.

So an early return of true would make sense

"An early return of true" will currently fail the boot. So is that what is supposed to happen?

so getcon_raw seems to be fine (which makes logical sense),

Where does this information come from? (I'm having trouble even finding the source for getcon_raw).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry: an early return of true won't fail the boot, I was thinking an early return of false.
However, an early return of true would break the contract described in the function comment:

// Returns:
//   If we fail to load the system SELinux policy, return false, otherwise, return true.

So that might need to be updated, or expanded on in the descriptive part of the comment.

@WavyEbuilder
Copy link
Author

As for the special filesystems, selinux_init_load_policy(3) handles this all automatically. For example, taking a snippet from it's function body: [...]
Including mounting what it needs (i.e. /sys/fs/selinux isn't needed to be mounted either).

Holy mackerel. Dinit does not, and will not, do anything even halfway as disruptive to the system as mounting actual filesystems, completely behind the user's back.

On the selinux_transition function, I said:

I would like a reasonably extensive comment on this explaining what it does

I certainly would expect "mounts /proc and /sys" to be part of that comment if those are indeed being mounted by that function. And it must be documented that that is what happens, in the dinit man page.

The documentation is important, and so is the philosophy of the software being transparent about what it does.

Sorry about that, I was having a bit of tunnelvision on anything the user would be expected to setup. Noted, I'll make sure to update the documentation to reflect that. Apologies

@WavyEbuilder
Copy link
Author

Done, ready for review. Just a quick note, this comment:

//   exe - the path that we are invoked with (to calculate our new security context to transition into.)

extends the line wrapping to 105, and I saw you mentioned it was okay up until 110 chars if it means it can be fit on one line. I'm not sure how that applies here as that is part of a bigger comment block, but I think this should be kept all on one line? Let me know if you like me to change this.

What I said was: "may extend to 110 in cases where that means the entire comment will fit on a single line". Since this is a small part of the comment, and not the entire comment, then no, it should be maximum 100 characters and should wrap before going beyond that.

Edit: and, since we're mentioning that comment, can you please put it in the format required by CODE-STYLE? No blank lines between short description and "Parameters:" / "Returns:", longer description goes after the "Returns".

Got it, will fix

@WavyEbuilder
Copy link
Author

What I said was: "may extend to 110 in cases where that means the entire comment will fit on a single line". Since this is a small part of the comment, and not the entire comment, then no, it should be maximum 100 characters and should wrap before going beyond that.

Edit: and, since we're mentioning that comment, can you please put it in the format required by CODE-STYLE? No blank lines between short description and "Parameters:" / "Returns:", longer description goes after the "Returns".

I've reread CODE-STYLE and I think that comment should be sorted now. Please let me know if it isn't as desired

@davmac314
Copy link
Owner

I've done a little more investigation of my own. It looks like, while selinux_init_load_policy does indeed mount proc (1) and mount sysfs (2), it actually unmounts proc (3) before returning.

I earlier asked:

does the SELinux support require anything - does it need special filesystems (/proc and/or /sys) to be mounted before dinit starts, for example?

And you answered:

As for the special filesystems, selinux_init_load_policy(3) handles this all automatically.

However, it looks to me like that is only half the story, since it unmounts /proc. But the other functions that have been used, such as getcon_raw, do not work without proc mounted. They error out with: No such file or directory. For getcon_raw that is because it calls getprocattrcon_raw (4) which calls openattr (5) which tries to open files in `/proc' (6).

I tested this to be certain. I definitely see that error when getcon_raw is called, if /proc has not been mounted before dinit is started.

It looks a lot like you gave me incorrect information. Have I missed something?

@WavyEbuilder
Copy link
Author

WavyEbuilder commented Nov 2, 2024

Ok, but: you need to slow down and read my questions/comments properly, and answer them comprehensively. I'm spending far more time on this that I would like, and it seems like it's mostly because I have to repeatedly ask questions until I get a straight answer.

Noted. Apologies once again, I sincerly appreciate all the time and patience you have given me, I won't let this happen again.

Where does this information come from? (I'm having trouble even finding the source for getcon_raw).

This is from a local test. In setest.c:

#include <selinux/selinux.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
  char *context = NULL;
  if (getcon_raw(&context) < 0) {
    perror("getcon_raw");
    return 1;
  }
  printf("context: %s\n", context);
  return 0;
}

Compiling and running without my system policy loaded, I get:

$ cc -l selinux setest.c -o setest
$ ./setest
context: kernel

This makes sense all things considered, as the context of currently running proccesses are available without the policy loaded. For example, on a system without the policy loaded:

$ cat /proc/1/attr/current && printf '\n'
kernel

Running on a system with the policy loaded, this is the output I get:

$ ./setest
context: sys.id:sys.role:sys.subj:s0

(which is the context of the setest)

So it doesn't fail, but just returns the kernel context, which is expected.

I then made a new setest.c as follows:

#include <selinux/selinux.h>
#include <stdio.h>
#include <string.h>

int
main(int argc, char **argv)
{
  char *current_context = NULL;
  char *file_context = NULL;
  int security_class = 0;
  char *new_context = NULL;

  if (getcon_raw(&current_context) < 0) {
    perror("getcon_raw");
    return 1;
  }
  if (getfilecon_raw(argv[0], &file_context) < 0) {
    perror("getfilecon_raw");
    return 1;
  }
  security_class = string_to_security_class("process");
  if (security_class == 0) {
    fprintf(stderr, "error: string_to_security_class failed\n");
    return 1;
  }
  if (security_compute_create_raw(current_context, file_context, security_class, &new_context) < 0) {
    perror("security_compute_create_raw");
    return 1;
  }

  printf("context: %s\n", new_context);
  return 0;
}

Running without the policy loaded, the following occurs:

$ cc -l selinux setest.c -o setest
$ ./setest
error: string_to_security_class failed

Compared to when the policy is loaded:

$ cc -l selinux setest.c -o setest
$ ./setest
context: sys.id:sys.role:sys.subj:s0

Which succeeds as expected.

From the security_class_to_string(3) man page:

string_to_security_class() returns the class value corresponding to the
string name name, or zero if no such class exists.

This (expectly) returns 0 when the policy isn't loaded, as the class process (https://selinuxproject.org/page/ObjectClassesPerms#process) doesn't exist as the policy isn't loaded.

I hope that is detailed enough and makes sense. Let me know if anything isn't clear.

However, an early return of true would break the contract described in the function comment:
So that might need to be updated, or expanded on in the descriptive part of the comment.

I'll make sure to do that.

It looks a lot like you gave me incorrect information.

I will be honest, yes - this seems like an oversight on my part, I missed that it unmounted /proc. I am interested in why I didn't catch this earlier though, I did actually test this locally and it seemed to be fine. I'm guessing something else was mounting /proc, though I did tell my initramfs not to.

I think it might be worth me spending some time writing some unit/igr tests for this pr to catch these sort of cases. I'll also make sure to detail how I locally test things (like above) in future to ensure this doesn't happen again.

(I'm having trouble even finding the source for getcon_raw).

They should be here:
https://github.com/SELinuxProject/selinux/blob/main/libselinux/include/selinux/selinux.h#L33
https://github.com/SELinuxProject/selinux/blob/main/libselinux/src/procattr.c#L255

@davmac314
Copy link
Owner

Ok, mistakes get made. I think key takeaway here is that this PR is not yet ready, it needs some additional testing and some additional documentation. Please take whatever time you need and "at" me when ready (but I will personally be giving the PR a rest regardless for a couple of weeks at least, so take your time). Before pinging me please remember to double check everything, "self review" and make sure previous points raised are addressed, code style is according to requirements, comments and documentation are complete and accurate; and, specify (via PR comments) what testing has been done. Thanks.

Note:

I think it might be worth me spending some time writing some unit/igr tests for this pr to catch these sort of cases. I'll also make sure to detail how I locally test things (like above) in future to ensure this doesn't happen again.

I think it is going to be hard to write tests for this. Local testing may be the only viable option. That's ok as long as you test thoroughly.

@WavyEbuilder
Copy link
Author

WavyEbuilder commented Nov 3, 2024

Just want to add something else after some more research.

        r = getcon_raw(&con);
        /* getcon_raw can return 0, and still give us a NULL pointer if /proc/self/attr/current is
         * empty. SELinux guarantees this won't happen, but that file isn't specific to SELinux, and may be
         * provided by some other arbitrary LSM with different semantics. */
        if (r == 0 && con) {
                initialized = !streq(con, "kernel");
                freecon(con);
        } else
                initialized = false;

This check likely applies to us too. I think it'd make sense to check for this. While this doesn't directly affect us, it'd make the overall system a bit more robust and ensures we play well if built with SELinux but another LSM is loaded, as mentioned in the comment. Everything else so far seems to match up with what systemd is doing.

I'll wait for any feedback before rushing to push code. Once that is done, logic wise we should be pretty close to being solid I think.

Edit: just seen the above message.

I think key takeaway here is that this PR is not yet ready, it needs some additional testing and some additional documentation. Please take whatever time you need and "at" me when ready (but I will personally be giving the PR a rest regardless for a couple of weeks at least, so take your time). Before pinging me please remember to double check everything, "self review" and make sure previous points raised are addressed, code style is according to requirements, comments and documentation are complete and accurate; and, specify (via PR comments) what testing has been done. Thanks.

Absolutely. I've noted them all down and I'll make sure to.

I think it is going to be hard to write tests for this. Local testing may be the only viable option. That's ok as long as you test thoroughly.

I'll document all my testing I do locally here then for now.

I'm also going to take 1-3 days of a break from a PR just to regroup myself a bit. To be completely honest, I've had a bit of anxiety responding, but you've been fantastic to work with, and your patience has helped a lot. A huge thanks for that.

After that, I'll get too all the changes you mentioned and work on some documentation for it :)

Thanks again

@davmac314
Copy link
Owner

I'll wait for any feedback before rushing to push code.

As I said, take your time, get it ready and get it right (as much as you can). As far as I'm concerned the review is over, please make whatever changes are needed to get it into shape and ping me then, and we'll start fresh.

@WavyEbuilder
Copy link
Author

Hi Davin,

This message isn't a ping (intentionally) as I still am yet to write a bit more documentation, but a small status update.

Just made a few commits. I've added a comment above the check for current_context being a nullptr to explain why, as that seems a bit strange at first glance, so I think it'd be valuable to add for anyone reading the code.

I've clarified a little more about the kernel context to be more accurate in the selinux_transition comment. I haven't updated that comment regarding the early return if selinux_init_load_policy(3) fails in permissive, as while I am confident in the early return, I'm a little on the fence about what the behavior of dinit should be in permissive if it fails. I think as this is behavioral choice with neither option necessarily being right or wrong, I should lay out some points for and against it and let you as the maintainer decide what you want to do.

For early returning true (and continuing the boot if selinux_init_load_policy(3) fails in permissive mode):

  • Would allow users to prod about to investigate why the loading of the policy failed
  • Since permissive was requested its not enforcing anything anyway, so it isn't a huge deal in that regard

For early returning false (and halting the boot if selinux_init_load_policy(3) fails in permissive mode):

  • In the same way a user should expect something to not enforce actions but just log actions in permissive, if they boot in permissive mode, they are expecting that the SELinux policy is loaded but actions are not enforced. The policy failing to load in permissive is still deviating from that expectation.
  • It is common advice in the SELinux community to avoid booting with the selinux=0 kernel cmdline flag, which tells the kernel to not load any SELinux-related components. This is because it can cause files that are modified, and new files that are created to be created with an invalid or non existent context (as SELinux isn't enabled to label said objects). In our case where selinux_init_load_policy(3) has failed in permissive, while the kernel related components of SELinux are enabled, without the policy loaded, said objects will not be labeled correctly. This breaks another expectation of the user, i.e. that as they are booted in permissive mode, objects will be labeled correctly according to the requested policy (that should be loaded).

As for what systemd does, it continues the boot in permissive mode and logs a warning. The code for that can be seen here https://github.com/systemd/systemd/blob/main/src/core/selinux-setup.c#L90-L102.

@davmac314
Copy link
Owner

I think as this is behavioral choice with neither option necessarily being right or wrong, I should lay out some points for and against it and let you as the maintainer decide what you want to do.

This doesn't seem like a decision that has any sort of major impact on the PR so: make a decision yourself, and justify it. From my point of view I want as little work to do, until the PR is ready, as possible. Thanks.

(No need to give status updates either. Just take whatever time you need, finish the work, do it right, do it well).

getcon_raw(3) may return succesfully and still give us a nullptr, so
let's check whether the return value is less than 0 or we have a
nullptr, not mandating that getcon_raw(3) returns with < 0 to error.
If we can successfully load the selinux policy, the inability to
transition to the domain specified by the policy is a policy
misconfiguration, not a dinit runtime issue. Let's continue boot, and
allow the standard SELinux logging mechanics to kick in instead, inline
with what systemd does.
Returning true if we fail to load the policy while the system is in
permissive mode breaks one of the assumptions of permissive mode: that
new files are labelled correctly according to the system policy (unlike
when booting with all SELinux related infrastructure disabled by passing
selinux=0 to the kernel cmdline). Let's always exit if we we fail to
load the policy; if the loading of the fails, something has gone badly
wrong, and the system should be booted with selinux=0 to disable all
selinux related infrastructure.
It's misleading to say that a policy choice is a misconfiguration, the
choice to prevent dinit from transitioning from its inital domain is a
choice made the local system administrator. Let's clarify our comments
to specify that this isn't an error, but a warning, as this is still a
unsual policy choice.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Importance: Normal C-dinit Things about the main parts of dinit Enhancement/New Feature Improving things or introduce new feature make Things about Dinit's Make build system meson Things about Dinit's Meson build system P-Linux Things related to Linux
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants