proposal: generate "base rhel" container image, build OCP on top #498

cgwalters · 2021-02-08T16:28:23Z

Now that rpm-ostree is close to supporting "live updates", one thing we could do is move crio/kubelet into a separate machine-os-kubelet container or so, and also move openvswitch as part of e.g. the SDN container.

But these would still be treated as "first class" bits because they'd still be underneath the readonly bind mount in /usr etc. The MCO would learn to pull down this machine-os-kubelet container and apply updates from it too; and we can generalize that to N container images with M RPMs inside (or...perhaps not RPMs at all).

Advantages:

The RHCOS bootimage would be basically just RHEL, and this would greatly increase alignment with OKD since we'd use the same approach in both places.

On the bootstrap node, the crio/kubelet in use becomes exactly the same as the one shipped in the cluster.

There would no longer be a "CI -> shipping" gap for kubelet: when a PR merges to that repo, it'd get rebuilt and shipped the same way all other containers do, and not be versioned with RHCOS at all.

Note that this wouldn't break the concept that the cluster owns and manages OS updates; we'd still be testing the OS, kubelet, and cluster components all together as a unit in the end. The goal here is just to split things up more internally so we can improve the process for CI and building; for example, the RHCOS version number would (mostly) just be a RHEL version number, which would greatly increase clarity of how things work. We could also be more agile with kubelet/crio etc.

ashcrow · 2021-02-08T17:14:36Z

I like the idea! It sounds like we'd be decoupling into a few classes of streams to make bootstrapping and CI testing easier to manage:

bootstrap and cluster (requirements to bootstrap and to run a cluster)
general cluster (more general OS level)

One question is how we would tie these container images together. For example, what if machine-os-kubelet needed a bump and machine-os-content didn't change? Would they be combined by container tags or through the payload manifest at the higher level?

cgwalters · 2021-02-08T17:33:10Z

The machine-os-kubelet would just be another payload image, it would be built with a Dockerfile the same way as everything else in the cluster. The MCO would know how to pull it down and apply it. During cluster bootstrap, bootkube.sh already downloads the MCO container, so we'd extend that to call the MCO to also extract crio/kubelet. For upgrades the MCO would also extract and apply it the same way it does machine-os-content.

IOW the end goal here is that the lifecycle of this container is logically separate; either container can change independently without caring about the other, we just merge the result.
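As a rough illustration of "either container can change independently, we just merge the result", here is a minimal sketch. It is not how the MCO works; `OsImage`, `merge_images`, and the version strings in the usage note are hypothetical names and values made up for this example, and it assumes each image can be described as a simple package-to-version map with no package owned by two images.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class OsImage:
    """Hypothetical description of one payload image's host content."""
    name: str
    packages: Dict[str, str]  # package name -> version


def merge_images(base: OsImage, *layers: OsImage) -> Dict[str, str]:
    """Merge the package sets of independently built images.

    Assumption for this sketch: no two images may provide the same
    package, so any overlap is reported as an error rather than
    silently resolved.
    """
    merged = dict(base.packages)
    owner = {pkg: base.name for pkg in merged}
    for layer in layers:
        for pkg, version in layer.packages.items():
            if pkg in owner:
                raise ValueError(
                    f"{pkg} is provided by both {owner[pkg]} and {layer.name}"
                )
            merged[pkg] = version
            owner[pkg] = layer.name
    return merged
```

With this shape, bumping only the kubelet image, e.g. `merge_images(OsImage("machine-os-content", {"kernel": "a"}), OsImage("machine-os-kubelet", {"kubelet": "b"}))`, changes the merged result without the base image having to be rebuilt.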

cgwalters · 2021-02-08T17:38:14Z

I think the best way to view this proposal is through our workflow for "test RHCOS with new RHEL minor version". With this flow, we produce one machine-os-content:rhel-8X where X is e.g. 4 or 5, and then we can test and ship that oscontainer across multiple OpenShift versions.
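To make the "one oscontainer, many OpenShift versions" idea concrete, here is a small sketch. Only the `machine-os-content:rhel-8X` tag pattern comes from the comment above; the function names and the release labels (`ocp-A` etc.) are hypothetical, and the mapping of releases to RHEL minors is invented purely for illustration.

```python
from typing import Dict


def base_image_ref(rhel_minor: int) -> str:
    """Build the tag for the shared base image of one RHEL 8 minor."""
    return f"machine-os-content:rhel-8{rhel_minor}"


def plan_payloads(release_to_rhel_minor: Dict[str, int]) -> Dict[str, str]:
    """Map each release to the base image it would reuse.

    Releases on the same RHEL minor share one image reference, so the
    image is built and tested once and then shipped in several payloads.
    """
    return {
        release: base_image_ref(minor)
        for release, minor in release_to_rhel_minor.items()
    }


if __name__ == "__main__":
    # Hypothetical release labels and RHEL minors, not real release data.
    plan = plan_payloads({"ocp-A": 4, "ocp-B": 4, "ocp-C": 5})
    for release, image in plan.items():
        print(release, "->", image)
```

The point of the sketch is that the number of distinct base images follows the number of RHEL minors, not the number of OpenShift releases.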

ashcrow · 2021-02-08T18:14:12Z

"IOW the end goal here is that the lifecycle of this container is logically separate; either container can change independently without caring about the other, we just merge the result."

This is what I was looking for 👍

vrutkovs · 2021-03-16T14:09:29Z

That would be extremely useful for OKD as we now have to build a full blown image just to ship a few RPMs.

openshift-bot · 2021-06-14T19:50:37Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

LorbusChris · 2021-06-14T23:49:41Z

/remove-lifecycle stale

travier · 2021-07-06T14:54:21Z

/label jira

openshift-ci · 2021-07-06T14:54:22Z

@travier: The label(s) /label jira cannot be applied. These labels are supported: platform/aws, platform/azure, platform/baremetal, platform/google, platform/libvirt, platform/openstack, ga, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, px-approved, docs-approved, qe-approved, downstream-change-needed

In response to this:

/label jira

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-bot · 2021-10-11T21:06:36Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

travier · 2021-10-18T15:46:42Z

/remove-lifecycle stale
/lifecycle frozen

We're working toward that goal. We're not there yet, but the ostree-ext work might get us there.

cgwalters · 2021-11-16T18:06:26Z

In the end this also kind of requires that we structure inputs to the base image to only come from RHEL for example, so that there's only one version number that matters.

cgwalters · 2022-02-09T22:26:34Z

A core problem with this is that some of our use cases, specifically e.g. the live ISO, rely on kubelet existing there by default.

That said, it may be the case that we could try to do this at the core - i.e. generate one RHCOS 8.5 build, and then further specialize/derive that build for multiple OCP releases, and generate disk images out of those. If we could get away with only having redhat-release and dropping redhat-release-coreos that would be a huge help for sure. I think we'd just end up injecting the derived OCP version into the disk images or so?

cgwalters · 2022-05-11T13:49:23Z

I have a variant of this in #799 that differs in important technical ways.

jlebon mentioned this issue Mar 16, 2021

Kubernetes v1.24+ container runtime on Fedora CoreOS coreos/fedora-coreos-tracker#767

Open

cgwalters mentioned this issue May 27, 2021

CORS-1650: RHEL 8 Server Worker/Infra Node Support openshift/enhancements#781

Closed

openshift-ci bot added the lifecycle/stale label Jun 14, 2021

openshift-ci bot removed the lifecycle/stale label Jun 14, 2021

miabbott added the jira label Jul 13, 2021

cgwalters mentioned this issue Sep 16, 2021

UPSTREAM: <carry>: openshift-hack/images/os/Dockerfile: Add io.openshift.build.versions, etc. openshift/kubernetes#963

Merged

cgwalters mentioned this issue Sep 24, 2021

enable additional tests #568

Open

jlebon mentioned this issue Oct 5, 2021

Tracker to stop rebuilding FCOS openshift/okd-machine-os#210

Closed

openshift-ci bot added the lifecycle/stale label Oct 11, 2021

openshift-ci bot added the lifecycle/frozen label and removed the lifecycle/stale label Oct 18, 2021

jlebon mentioned this issue Feb 14, 2022

Bug 2099811: manifest: bump to openvswitch2.16 #715

Merged

LorbusChris mentioned this issue Feb 20, 2022

NO-ISSUE: Add OKD support openshift/assisted-service#3297

Merged

cgwalters changed the title ~~proposal: split crio/kubelet to separate container image~~ proposal: generate "base rhel" container image, build OCP on top May 11, 2022

cgwalters closed this as completed May 11, 2022
