OKD-223: Load custom SELinux rules in SCOS and workaround afterburn failures #1556
Force-pushed: d18f4f0 to 11f7e8c
@@ -0,0 +1,7 @@
(typeattributeset cil_gen_require var_run_t)
will these rules also persist after the bootimage pivots to the scos image?
yes
I'd also say that we could delete this file once a proper fix lands in CS9? I didn't reproduce it in a fresh CS9 environment with no modifications other than the ones here, but it looks like a bug to file there against afterburn or the selinux-policy package you mentioned in the issue.
once we have scos publicly available, yes we would need this. But by that time I hope that we can get the actual issue with selinux-policy sorted out.
https://github.com/fedora-selinux/selinux-policy/pull/1362/files might be the fix to downstream?
@travier see the two PRs above for #1556 (comment)
@aleskandro: This pull request references OKD-223 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.17.0" version, but no target version was set. In response to this:
Force-pushed: 11f7e8c to d82d6ec
Hum, why is this not failing in FCOS? |
Force-pushed: d82d6ec to 7d26dfa
I think it is because of previous patches to SELinux-policy that are already available in FCOS. |
Oh, this could be it: zpytela/selinux-policy@18e59aa removed the permissive domain, but non-rawhide Fedora still has it: https://github.com/zpytela/selinux-policy/blob/f40/policy/modules/contrib/afterburn.te#L15
Force-pushed: 7d26dfa to 173a514
How will all of this work in the initrd? We really need to fix this in Rawhide first. I'm not convinced we should carry SELinux policy modules here. Those domains were in permissive mode before, let's make them permissive again instead. |
Force-pushed: 173a514 to d3f25ca
I've changed the module to set the afterburn_t domain to permissive mode. I could do the same for the coreos-installer one so that we have a workaround for the other issue too, until they are solved upstream and we can revert the two policies' commits. I would still propose keeping the unit that loops through a folder in /usr to apply custom SELinux rules, which can be needed from time to time. We have something similar in okd-machine-os for the FCOS variant.
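As a rough illustration of the "unit that loops through a folder" idea above, the following sketch builds one `semodule -i` invocation per `.cil` file found in a directory. It is not the code from this PR: the function names and the `dry_run` switch are hypothetical, and the directory to scan is left to the caller because the thread does not name it. Such a loader would run in the real root only, which is the initrd concern raised later in the conversation.

```python
from pathlib import Path
import subprocess


def semodule_commands(rules_dir):
    """Return one `semodule -i <file>` command per .cil file, sorted by name.

    Hypothetical helper: the directory layout is an assumption, not taken
    from the PR.
    """
    cil_files = sorted(Path(rules_dir).glob("*.cil"))
    return [["semodule", "-i", str(path)] for path in cil_files]


def load_rules(rules_dir, dry_run=True):
    """Print (dry run) or execute the commands built by semodule_commands."""
    commands = semodule_commands(rules_dir)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # Loading policy modules needs root; fail loudly on error.
            subprocess.run(command, check=True)
    return commands
```

Sorting by file name keeps the load order predictable, so numeric prefixes such as `10-` and `20-` could be used to order modules. Calling `load_rules(some_dir)` with the default `dry_run=True` only prints the commands.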
Force-pushed: fd52060 to eb2bd38
/lgtm |
@travier could we get an approval for this? |
Note that in FCOS rawhide, we're currently pinning selinux-policy due to other issues: https://github.com/coreos/fedora-coreos-config/blob/441ccdd5cc3f2e3ac0cb5e8e0bcf71b301a51c7e/manifest-lock.overrides.yaml#L17-L26 But even then, the policies in c9s and rawhide are different and not meant to be kept in sync AIUI. See the conversation in https://github.com/openshift/os/issues/1514.
I sympathize with this. I'm not opposed in theory, but it's also a much more invasive workaround that might have subtle repercussions. Note also that not many people on the team are familiar with maintaining SELinux policies. What we really need of course is better CI integration with c9s. This should be a big part of the bootable containers effort.
Short-term, I guess we can just host the last working version RPM somewhere else and revert #1552? (Or, if you have the right channels, try to get a new tag created in the CentOS Stream koji instance with tag2distrepo set up so we can just tag older packages we need in there.)
What repercussions are we worried about?
Isn't this equivalent to having a workaround here? |
As I asked before:
I don't think this will work as this unit runs after the initrd, only in the real root. |
/hold |
@travier this branch is already being used in the releases that turned green from Sunday thanks to this fix (for AWS).
@jlebon the current version of this PR 'reverts' the SELinux policies to the previous ones for the I also understand that maintaining SELinux policies is not the most exciting thing here. Still, it allows the SCOS release to unblock and OKD/SCOS to have dedicated SELinux policies when needed, especially since, based on what you said, the SELinux policies in Fedora C9S might not always be in sync (and well-tested as they are for RHCOS). To have working and quickly released OKD/SCOS, we will occasionally need more workarounds dedicated to C9S than RHCOS. This PR adds that systemd unit only to C9S builds, with the hope that whoever maintains and reviews a change in that folder will always bind a Jira or GH issue to revert the changes eventually when they are fixed and downstreamed to be consumed as a permanent solution. We had similar cases for OKD/FCOS: https://github.com/openshift/okd-machine-os/blob/master/overlay.d/99okd/usr/lib/okd/selinux-fixes.cil. I would avoid further delays for the release of OKD/SCOS, 4.16 and beyond when similar cases arise. |
@travier @jlebon any suggestions on how this can be unblocked in the short term? I think @aleskandro has given sufficient reasoning above for the approach. |
From my point of view this approach is better than setting up a repo and locking to an old version, we will be missing all of the bug fixes in the newer selinux-policy builds. This is not an issue in the FCOS portion because the bug is fixed upstream. We are missing all the other fixes in selinux-policy if we lock to an old version. The selinux team needs to prioritise backporting the upstream fix. As I understand the issue, we only hit it on reboot when moving to CentOS Stream, which is where the fix is needed. |
@dustymabe @cverna This is blocking OKD releases from going stable cc @preethit |
How about having just the |
Force-pushed: eb2bd38 to b19d7f6
Hi @jlebon, that is better. Done Thanks for following up. |
/retest-required |
Force-pushed: b19d7f6 to 733c199
Thanks! I've reworked this as follows:
Recent changes in the SELinux policy to make more service confined has broken a lot of our code. The SELinux team is working through relaxing the policy, but in the meantime, let's revert back the affected types to permissive mode:

1. afterburn fail when trying to write to `/run`, `/run/metadata` and `/home/$user/.ssh`. See: https://issues.redhat.com/browse/RHEL-49735
2. coreos-installer installation fails due to various denials. See: https://issues.redhat.com/browse/RHEL-38614
3. network functionality that rely on systemd-network-generator is broken due to the latter being unable to create temporary files. See: https://issues.redhat.com/browse/RHEL-47033

Co-authored-by: Jonathan Lebon <[email protected]>
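To make "revert the affected types to permissive mode" concrete, here is a small sketch that renders CIL `typepermissive` statements for a list of SELinux types. It is illustrative only and not the module shipped by this PR: `afterburn_t` is the type named earlier in the thread, while `coreos_installer_t` and `systemd_network_generator_t` are assumed names for the other two services and should be checked against the installed policy.

```python
def permissive_cil(type_names):
    """Render one CIL `typepermissive` statement per SELinux type name."""
    return "".join(f"(typepermissive {name})\n" for name in type_names)


# afterburn_t comes from the discussion; the other two names are assumptions.
AFFECTED_TYPES = [
    "afterburn_t",
    "coreos_installer_t",
    "systemd_network_generator_t",
]

print(permissive_cil(AFFECTED_TYPES), end="")
```

A permissive type still logs AVC denials but does not enforce them, which is why this works as a stopgap until the policy fixes tracked in the linked RHEL issues land.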
Force-pushed: 733c199 to 9ba8b71
/hold cancel |
We're currently hitting OpenSSL issues there: openshift#1540
I've also inlined the first commit of #1544 here. This PR should now be all we need to unblock building c9s-based bootimages in the CoreOS pipeline. |
CI failing on #1551. I've verified this PR locally. /override ci/prow/rhcos-9-build-test-metal /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aleskandro, jlebon, Prashanth684 The full list of commands accepted by this bot can be found here. The pull request process is described here
@jlebon: Overrode contexts on behalf of jlebon: ci/prow/rhcos-9-build-test-metal, ci/prow/rhcos-9-build-test-qemu, ci/prow/scos-9-build-test-metal, ci/prow/scos-9-build-test-qemu In response to this:
@aleskandro: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Looks good to me! Thanks @jlebon |