-
-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
init: optionally load the system SELinux policy #400
base: master
Are you sure you want to change the base?
Conversation
5952f94
to
35df218
Compare
Current TODO (just to note I am aware of it) is to link against libselinux in the Makefile. |
I've added an --enable--selinux option to the configure script, however I am not really sure how best to go about linking against libselinux as it is an optional dependency. Any advice in this regard? |
Think I've got that sorted by passing a linker flag. |
I will review properly when I get a chance. I don't know a heap about SELinux so bare with me. A couple of things I will point out now though:
|
Hey! Appreciate the fast response.
A lot of the other init systems mentioned /dev/console not being available at that point in time (which is a reasonable assumption as if the policy fails to load there is a good chance SELinux would block access to /dev/console). I had a quite glance through dinit-log.cc and it appears like it just uses stdout - would this be correct?
I'll make sure to change that to use
Generally the command-line option is provided to the kernel cmdline (which the
I'll make sure to do that |
Also, would you like me to commit any changes as separate commits until you are happy with it so you can see the diffs between changes a bit easier, or would rebasing be preferred? |
Alright took a look at this a bit more thoroughly and I have a few design questions to raise quickly, notably regarding
static int initialize_security(
bool *loaded_policy,
dual_timestamp *security_start_timestamp,
dual_timestamp *security_finish_timestamp,
const char **ret_error_message); Then we could just call our implementation of a similar function once in dinit_main (ideally as early as possible, being security frameworks it makes sense to attempt to load them as early as possible). Now given that we have C++ to hand here, I was wondering how to go about propagating errors back. Being C++11, we don't have anything like
I think that'd be a reasonable way of structuring things, but being my first PR to dinit and my first real work on a C++ project below 14, your input would be greatly appreciated. Thanks! |
Log any relevant message (via
If we're just talking about adding a single method, I don't see any advantage in doing that now, as it can easily be done if and when support for other security frameworks are added. Let's just keep it simple.
Ok, that sounds fine. |
Actually, thinking about this more, there may still be cases where SELinux is enabled but the loading of the policy should not be performed. One example is given by the Systemd code here. Also, can you clarify what this might mean? I want to know what file descriptors SELinux would keep open and in what circumstances. Are you doing this work for a distribution or is it a more personal endeavour? |
(Finally: make sure you have read CONTRIBUTING and CODE-STYLE documents, if you haven't already. Thanks!). |
Got it, thanks :)
Yup, makes sense, should be fairly easy with to deal in the future. Will do!
That's a good example. For that specific initrd case, I had a look at systemd's /* If /etc/initrd-release exists, we're in an initrd.
* This can be overridden by setting SYSTEMD_IN_INITRD=0|1.
*/ Would you like to do the same with an override here? (maybe something more like DINIT_IN_INITRD) If you'd still like to add a flag override of some sorts to tell dinit to not load the selinux policy, I feel like that should be an opt in sort of thing, because while not loading the selinux policy won't load any of the user's policy, it doesn't mean selinux won't be loaded. In that case, everything runs in the
Afaik this is generally in the case of when it needs to audit something related to that fd so it keeps it open for a bit? I remember seeing something similar in the past, but I'm not exactly too happy with that answer as I can't answer it with 100% confidence, so if that's okay I'll do a bit of digging in the libselinux docs and get back to you on that.
I'm currently using Gentoo and dinit on basically all of my systems (which doesn't have upstream Gentoo support currently), however I am hoping to try improving the support (and maybe possibly getting official support) for dinit for both Adelie and Gentoo (nothing offical, just me on my own there, though I have spoken to some developers of the respective distros about that, but absoloutly nothing really offical yet). This specific piece of work was started when I hit a few weird bugs with my SELinux policy while dealing with a few ebuilds for dinit related to SELinux (policy for it), so in the future as Gentoo has offical SELinux support it might be useful, but for now really just take it as a personal endeavour, I'm not affiliated with any distro officially :)
Had a read over them already, but I'll make sure to reread CODE-STYLE |
Did a bit of digging and found this commit systemd/systemd@a3dff21 which seems to explain it quite nicely. |
Hmm reading the
Still seems fine to transition with |
I think it'll be best to consider how SELinux aware we want dinit to be at this stage. After reading some more man pages and systemd code, it seems like systemd transitioning itself to the new context is an okay option as they already make heavy use of selinux throughout (i.e. in transient units which have an SELinuxContext= option). However, if our only goal (at least in the short-term) is to load the policy, then I think it makes sense to stick with the If we want to be a little bit more SELinux aware, (i.e. if it is not unforeseeable to make use of libselinux more throughout dinit), then I would probably start off by creating an selinux utils header of some point as there is quite a bit of setup, etc that'll need to be done. My personal opinion would be (for the sake of simplicity) to stick with the However, if you'd still like to continue with setting our own context, I can go down that route. It'd require a bit more design though, so it might be worth working on getting some helper functions stubbed out firstly. |
That's fine but it will need a commitment from you that you will support it going forward, or otherwise make it clear that it's experimental/unsupported in relevant documentation. (Incidentally I think you missed updating the build instructions - that's something that would need to be added). To be honest I'm not sure I'm following your reasoning in a few ways:
I don't really understand why that makes a difference. What is the reason why this would not be a good option for dinit as well? (I get that Dinit doesn't provide specific support for SELinux features when executing service processes, but why does that make a difference as to the mechanics of how the policy is loaded?) I'm fine with the policy loading happening very early in dinit's execution. But:
If we are going to call I know you had a few other questions for me and I can go back to those, but I really need some clarification on these points. I don't want to be discouraging but it seems like there are a few details you're not really sure of yourself, and that gives me some pause. I'm hesitant to incorporate something where I really don't understand why things have been done the way they have. If there's open questions that you need to sort out, please feel free to take whatever time you need to do that, but let's get them sorted first and talk details of the code then. |
Hey,
I can commit to that, but would also be happy mentioning it is experimental.
I think I phrased that badly earlier, I'll rephrase it a little here now. Systemd makes use of SELinux a lot inside it being quite SELinux aware so it already has a lot of boilerplate that will be used elsewhere. The reason why transitioning to the new context is a little more complex is because it requires a bit more setup, we are in a privileged domain at that point and we are sort of entrusting ourselves to do it right, so it'd require a fair amount more code I would think.
An initramfs can work fine for this (and often is!) or some simpler pid 1 that's only job is to launch dinit properly with the right context, but (at least to me) that feels a little unnecessary.
That's fair, I did phrase it quite badly above. My main reasoning was based off the
That makes sense. For now I'll just presume we're going down the route of transitioning ourselves to a new context and I'll push in a bit with an example of that and let you compare.
|
This, I guess, is what I don't understand. I can see that opening file descriptors before loading the policy might give access to things that the policy will then disallow but, if we are loading the policy quite early and we have opened file descriptors then in fact we do need those file descriptors. If the policy disallowed that access then that would be a broken policy anyway. Dinit doesn't go around just casually opening files. Likewise any other resource it has accessed, it probably needs. And anyway, as far as I can tell, applying the security label will enforce access against file descriptors that were opened previously anyway. Eg from https://www.systutorials.com/docs/linux/man/3-setcon_raw/ -
Given those are taken care of (and ptrace shouldn't be an issue anyway), and given that we'd be loading the policy early (before doing just about anything anyway), what would be the concerns in regards to "entrusting ourselves to do it right?" I'm after concrete examples.
To me it feels unnecessary to me add specific support (including a library dependency) for something in Dinit which can be handled just as well from outside, and re-executing our own process right after we start honestly just feels like a hack. At the moment my position on that is a "no", I would need to be given a good, concrete reason for why that should change. Applying the security context within the already-running process without re-
A little bit more code isn't an issue, if we have already gone as far as adding a dependency and providing support for SELinux then we may as well do it properly. My bad for the "keep it simple" comment which caused confusion - I meant, keep it restricted to the specific functionality that you are wanting to implement; we don't need abstraction layers for handling other security frameworks, etc. But before you said:
Is it a bit, or is it a fair amount? (or am I conflating two different things?) |
As for A good example is this. Imagine we start out with kernel context (what dinit starts out with on my system before the policy is loaded). We have some fd's opened, and the kernel context has permission to use them. Now we load the policy, transition ourselves, and the loaded policy executes us as init_t. In the Gentoo refpolicy, then that kernel context becomes kernel_t. Now SELinux will prevent us from using those open file descriptors.
Oh I see what you meant by keeping it simple now :) I was a bit confused with what you were after, but it shouldn't really require any sweeping changes for the codebase, it should all be confined to that function we'll make to load the policy, which will be the thing that's a bit longer. That's all good then, I can make the transition work. I'll get to work on that, and if there are any concerns please let me know. Thanks for all the time you've given this so far, appreciated a lot. |
5ddfc4d
to
d5295d9
Compare
Alright I think I've got this working as desired now. I've just made a function If anything is unclear/you feel any comments are needed, please let me know, and I'll make sure to add them. |
(just updated an error message as i'm not longer using |
Just fixed another silly mistake, forgot to chance |
As for the special filesystems, /* Check for an override of the mode via the kernel command line. */
rc = mount("proc", "/proc", "proc", 0, 0); Including mounting what it needs (i.e. |
Done, ready for review. Just a quick note, this comment: // exe - the path that we are invoked with (to calculate our new security context to transition into.) extends the line wrapping to 105, and I saw you mentioned it was okay up until 110 chars if it means it can be fit on one line. I'm not sure how that applies here as that is part of a bigger comment block, but I think this should be kept all on one line? Let me know if you like me to change this. |
Holy mackerel. Dinit does not, and will not, do anything even halfway as disruptive to the system as mounting actual filesystems, completely behind the user's back. On the
I certainly would expect "mounts /proc and /sys" to be part of that comment if those are indeed being mounted by that function. And it must be documented that that is what happens, in the dinit man page. The documentation is important, and so is the philosophy of the software being transparent about what it does. |
What I said was: "may extend to 110 in cases where that means the entire comment will fit on a single line". Since this is a small part of the comment, and not the entire comment, then no, it should be maximum 100 characters and should wrap before going beyond that. Edit: and, since we're mentioning that comment, can you please put it in the format required by CODE-STYLE? No blank lines between short description and "Parameters:" / "Returns:", longer description goes after the "Returns". |
src/dinit.cc
Outdated
if (enforce > 0) { | ||
cerr << "Failed to load SELinux policy." << endl; | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if the load fails (selinux_init_load_policy
returns 0) but enforce
is 0, what is supposed to happen here? It seems like it just ploughs ahead with trying to set the context, but surely that isn't right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the enforcing
variable, there are 2 states:
- = 0 is permissive mode
- > 0 is enforcing mode
In the case of enforcing, the system is expected to boot in a secure state, i.e. with the system's SELinux policy enforcing. If we fail to load it, we should halt the boot as that is an untrusted state.
In the case of permissive, it's generally better to be more lenient. In permissive mode, actions are expected to be logged, but any failures revolving around the policy aren't expected to have an impact on the system's operation. Other SELinux aware applications, such as load_policy(1)
from policycoreutils also reflect this behavior:
https://github.com/SELinuxProject/selinux/blob/main/policycoreutils/load_policy/load_policy.c#L68-L77
Let me know your thoughts, I can add a comment explaining this too if desired
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not answering the question I asked. If the policy fails to load (selinux_init_load_policy
returns 0) but enforce
is 0, what is this function supposed to do?
The comment says:
If we fail to load the system SELinux policy, return false
But it's not even returning at this point, it's just going on through to the following call to (getcon_raw(¤t_context)
. Isn't that wrong?
Other SELinux aware applications, such as load_policy(1) from policycoreutils also reflect this behavior
That code checks whether loading the policy failed, and if so, prints an error message and retuns an exit code:
That's different to what is here, which just continues as if no error had happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And when I say "continues as if no error had happened", I mean that it goes on to try and set a new context (getcon_raw
, getfilecon_raw
, security_compute_create_raw
, etc). I assume that those will fail, since the policy load failed, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that those will fail, since the policy load failed, right?
Those won't fail as it'll just continue on with the kernel context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know your thoughts
I know that you have done some work on this and are probably trying hard, but in all honesty I am very close to pulling the plug. I need you to clearly answer the questions that I'm asking, not just push more code.
The question I asked was:
So if the load fails (selinux_init_load_policy returns 0) but enforce is 0, what is supposed to happen here? It seems like it just ploughs ahead with trying to set the context, but surely that isn't right?
I still don't have a satisfactory answer to that. I can't accept this PR without understanding what it is doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those won't fail as it'll just continue on with the kernel context
Are you saying that
setcon_raw
will do nothing but still return success?
I decided to clarify this a bit further, so getcon_raw
seems to be fine (which makes logical sense), and setcon_raw
will be fine, but the issue lies in security_compute_create_raw
as that actually attempts to read the policy to calculate something, so that should fail. I tested it by disabling the load temporarily on my machine, and writing a small program to see what would/wouldn't fail regarding those functions. So logically based off all of this, I think the early return makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know your thoughts
I know that you have done some work on this and are probably trying hard, but in all honesty I am very close to pulling the plug. I need you to clearly answer the questions that I'm asking, not just push more code.
The question I asked was:
So if the load fails (selinux_init_load_policy returns 0) but enforce is 0, what is supposed to happen here? It seems like it just ploughs ahead with trying to set the context, but surely that isn't right?
I still don't have a satisfactory answer to that. I can't accept this PR without understanding what it is doing.
Understood. So:
So if the load fails (selinux_init_load_policy returns 0) but enforce is 0, what is supposed to happen here? It seems like it just ploughs ahead with trying to set the context, but surely that isn't right?
This is accurate, i.e. that wasn't right. I don't think we should fail (as in exit) there, and I incorrectly gathered that it would just fall through (because it'll attempt to set the same context). I overlooked the fact security_compute_create_raw
specifically reads from the policy store, so that'd fail. So an early return of true would make sense. Sincere apologies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sincere apologies
Ok, but: you need to slow down and read my questions/comments properly, and answer them comprehensively. I'm spending far more time on this that I would like, and it seems like it's mostly because I have to repeatedly ask questions until I get a straight answer.
So an early return of true would make sense
"An early return of true" will currently fail the boot. So is that what is supposed to happen?
so getcon_raw seems to be fine (which makes logical sense),
Where does this information come from? (I'm having trouble even finding the source for getcon_raw).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry: an early return of true won't fail the boot, I was thinking an early return of false.
However, an early return of true would break the contract described in the function comment:
// Returns:
// If we fail to load the system SELinux policy, return false, otherwise, return true.
So that might need to be updated, or expanded on in the descriptive part of the comment.
Sorry about that, I was having a bit of tunnelvision on anything the user would be expected to setup. Noted, I'll make sure to update the documentation to reflect that. Apologies |
Got it, will fix |
I've reread CODE-STYLE and I think that comment should be sorted now. Please let me know if it isn't as desired |
I've done a little more investigation of my own. It looks like, while I earlier asked:
And you answered:
However, it looks to me like that is only half the story, since it unmounts I tested this to be certain. I definitely see that error when It looks a lot like you gave me incorrect information. Have I missed something? |
Noted. Apologies once again, I sincerly appreciate all the time and patience you have given me, I won't let this happen again.
This is from a local test. In #include <selinux/selinux.h>
#include <stdio.h>
#include <string.h>
int
main(void)
{
char *context = NULL;
if (getcon_raw(&context) < 0) {
perror("getcon_raw");
return 1;
}
printf("context: %s\n", context);
return 0;
} Compiling and running without my system policy loaded, I get:
This makes sense all things considered, as the context of currently running proccesses are available without the policy loaded. For example, on a system without the policy loaded:
Running on a system with the policy loaded, this is the output I get:
(which is the context of the So it doesn't fail, but just returns the I then made a new #include <selinux/selinux.h>
#include <stdio.h>
#include <string.h>
int
main(int argc, char **argv)
{
char *current_context = NULL;
char *file_context = NULL;
int security_class = 0;
char *new_context = NULL;
if (getcon_raw(¤t_context) < 0) {
perror("getcon_raw");
return 1;
}
if (getfilecon_raw(argv[0], &file_context) < 0) {
perror("getfilecon_raw");
return 1;
}
security_class = string_to_security_class("process");
if (security_class == 0) {
fprintf(stderr, "error: string_to_security_class failed\n");
return 1;
}
if (security_compute_create_raw(current_context, file_context, security_class, &new_context) < 0) {
perror("security_compute_create_raw");
return 1;
}
printf("context: %s\n", new_context);
return 0;
} Running without the policy loaded, the following occurs:
Compared to when the policy is loaded:
Which succeeds as expected. From the security_class_to_string(3) man page:
This (expectly) returns 0 when the policy isn't loaded, as the class process (https://selinuxproject.org/page/ObjectClassesPerms#process) doesn't exist as the policy isn't loaded. I hope that is detailed enough and makes sense. Let me know if anything isn't clear.
I'll make sure to do that.
I will be honest, yes - this seems like an oversight on my part, I missed that it unmounted /proc. I am interested in why I didn't catch this earlier though, I did actually test this locally and it seemed to be fine. I'm guessing something else was mounting /proc, though I did tell my initramfs not to. I think it might be worth me spending some time writing some unit/igr tests for this pr to catch these sort of cases. I'll also make sure to detail how I locally test things (like above) in future to ensure this doesn't happen again.
They should be here: |
Ok, mistakes get made. I think key takeaway here is that this PR is not yet ready, it needs some additional testing and some additional documentation. Please take whatever time you need and "at" me when ready (but I will personally be giving the PR a rest regardless for a couple of weeks at least, so take your time). Before pinging me please remember to double check everything, "self review" and make sure previous points raised are addressed, code style is according to requirements, comments and documentation are complete and accurate; and, specify (via PR comments) what testing has been done. Thanks. Note:
I think it is going to be hard to write tests for this. Local testing may be the only viable option. That's ok as long as you test thoroughly. |
Just want to add something else after some more research. r = getcon_raw(&con);
/* getcon_raw can return 0, and still give us a NULL pointer if /proc/self/attr/current is
* empty. SELinux guarantees this won't happen, but that file isn't specific to SELinux, and may be
* provided by some other arbitrary LSM with different semantics. */
if (r == 0 && con) {
initialized = !streq(con, "kernel");
freecon(con);
} else
initialized = false; This check likely applies to us too. I think it'd make sense to check for this. While this doesn't directly affect us, it'd make the overall system a bit more robust and ensures we play well if built with SELinux but another LSM is loaded, as mentioned in the comment. Everything else so far seems to match up with what systemd is doing. I'll wait for any feedback before rushing to push code. Once that is done, logic wise we should be pretty close to being solid I think. Edit: just seen the above message.
Absolutely. I've noted them all down and I'll make sure to.
I'll document all my testing I do locally here then for now. I'm also going to take 1-3 days of a break from a PR just to regroup myself a bit. To be completely honest, I've had a bit of anxiety responding, but you've been fantastic to work with, and your patience has helped a lot. A huge thanks for that. After that, I'll get too all the changes you mentioned and work on some documentation for it :) Thanks again |
As I said, take your time, get it ready and get it right (as much as you can). As far as I'm concerned the review is over, please make whatever changes are needed to get it into shape and ping me then, and we'll start fresh. |
Hi Davin, This message isn't a ping (intentionally) as I still am yet to write a bit more documentation, but a small status update. Just made a few commits. I've added a comment above the check for I've clarified a little more about the kernel context to be more accurate in the selinux_transition comment. I haven't updated that comment regarding the early return if For early returning true (and continuing the boot if
For early returning false (and halting the boot if
As for what systemd does, it continues the boot in permissive mode and logs a warning. The code for that can be seen here https://github.com/systemd/systemd/blob/main/src/core/selinux-setup.c#L90-L102. |
This doesn't seem like a decision that has any sort of major impact on the PR so: make a decision yourself, and justify it. From my point of view I want as little work to do, until the PR is ready, as possible. Thanks. (No need to give status updates either. Just take whatever time you need, finish the work, do it right, do it well). |
getcon_raw(3) may return succesfully and still give us a nullptr, so let's check whether the return value is less than 0 or we have a nullptr, not mandating that getcon_raw(3) returns with < 0 to error.
If we can successfully load the selinux policy, the inability to transition to the domain specified by the policy is a policy misconfiguration, not a dinit runtime issue. Let's continue boot, and allow the standard SELinux logging mechanics to kick in instead, inline with what systemd does.
Returning true if we fail to load the policy while the system is in permissive mode breaks one of the assumptions of permissive mode: that new files are labelled correctly according to the system policy (unlike when booting with all SELinux related infrastructure disabled by passing selinux=0 to the kernel cmdline). Let's always exit if we we fail to load the policy; if the loading of the fails, something has gone badly wrong, and the system should be booted with selinux=0 to disable all selinux related infrastructure.
It's misleading to say that a policy choice is a misconfiguration, the choice to prevent dinit from transitioning from its inital domain is a choice made the local system administrator. Let's clarify our comments to specify that this isn't an error, but a warning, as this is still a unsual policy choice.
Implements #399 . Currently a draft PR for some of the reasons noted in that issue. Another thing to add:
fprintf
if the SELinux policy fails to load (as/dev/console
is likely unable to be accessed at that point). Would you preferstd::cout
to be used for now in this case?