install: Add ensure-completion verb, wire up ostree-deploy → bootc #915

Open
wants to merge 1 commit into
base: main

cgwalters
Collaborator

@cgwalters cgwalters commented Nov 20, 2024

Since bootc was created, it has been growing into a superset of ostree;
in particular it adds things like /usr/lib/bootc/kargs.d and logically
bound images (LBIs).

However, Anaconda today still invokes ostree container image deploy,
so those bootc-specific features are not handled at install time.

Main fix

When bootc takes over the /usr/libexec/libostree/ext/ostree-container
entrypoint, make the existing ostree container image deploy CLI
just call back into bootc to fix things up. No additional work is required
other than getting an updated bootc into the Anaconda ISO.

Old Anaconda ISOs

A further problem here is that Anaconda is only updated once
per OS major+minor release; e.g. there won't be an update to it for the
lifetime of RHEL 9.5 or Fedora 41. We want the ability to ship new
features and bugfixes in those OSes (especially RHEL 9.5).

So given that we have a newer bootc in the target container, we can
do this:

%post --erroronfail
bootc install ensure-completion
%end

And that will fix things up. Of course there are fun $details here: the
way Anaconda implements %post is via a hand-augmented chroot,
i.e. a degenerate container, and we need to escape that and
fix some things up (such as a missing cgroupfs mount).
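
The cgroupfs part of that fix-up can be sketched in shell. This is only an illustration, not the code in this PR: the function names cgroup2_is_mounted and ensure_cgroupfs are made up here, and it assumes a cgroup v2 system where the hierarchy belongs at /sys/fs/cgroup.

```bash
# Hypothetical helpers for illustration; the PR handles this inside bootc.

# Succeeds if a cgroup2 filesystem is mounted at /sys/fs/cgroup according
# to the given mountinfo file (defaults to the current process's view).
cgroup2_is_mounted() {
    local mountinfo="${1:-/proc/self/mountinfo}"
    awk '
        {
            # Field 5 is the mount point; the filesystem type follows
            # the "-" separator that ends the optional fields.
            fstype = ""
            for (i = 7; i <= NF; i++) {
                if ($i == "-") { fstype = $(i + 1); break }
            }
            if ($5 == "/sys/fs/cgroup" && fstype == "cgroup2") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$mountinfo"
}

# Mounts cgroup2 if it is missing; needs root, e.g. inside %post.
ensure_cgroupfs() {
    if ! cgroup2_is_mounted; then
        mount -t cgroup2 cgroup2 /sys/fs/cgroup
    fi
}
```

Calling ensure_cgroupfs before anything that runs podman would be the idea; the check is separate from the mount so it can be exercised without root.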

Summary

  • With a newer bootc in the ISO, everything just works
  • For older ISOs, one can add the %post above as a workaround.

Implementation details: Cross-linking bootc and ostree-rs-ext

This whole thing is very confusing because the linkage
between bootc and ostree-rs-ext is now bidirectional. In the case
of bootc install to-filesystem, we end up calling into ostree-rs-ext,
and we must not recurse back into bootc, because at least for
kernel arguments we might end up applying them twice. We prevent
the recursion by passing a CLI argument.
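
As a toy model of that guard (not bootc's actual code), the shell sketch below has two entry points that can call each other, with a flag breaking the cycle. The flag name --skip-completion and all function names are hypothetical; the real argument is whatever the PR's diff defines.

```bash
# Toy model only: counts how many times the "completion" step runs.
COMPLETIONS=0

# Stand-in for applying kargs.d and fetching logically bound images.
ensure_completion() {
    COMPLETIONS=$((COMPLETIONS + 1))
}

# Stand-in for `ostree container image deploy` once bootc owns the entrypoint.
ostree_deploy() {
    local skip=0 arg
    for arg in "$@"; do
        # Hypothetical flag: the caller promises to do completion itself.
        if [ "$arg" = "--skip-completion" ]; then skip=1; fi
    done
    # ... the deployment itself would happen here ...
    if [ "$skip" -eq 0 ]; then
        ensure_completion
    fi
}

# Stand-in for `bootc install to-filesystem`: it drives the deploy and
# does its own completion, so it must tell the deploy step not to.
bootc_install() {
    ostree_deploy --skip-completion
    ensure_completion
}
```

Either entry point then runs completion exactly once: Anaconda calling ostree_deploy with no flag gets it from the deploy step, while bootc_install gets it from bootc alone.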

The second problem is the crate-level dependency; right now they're
independent crates so we can't have ostree-rs-ext actually
call into bootc directly, as convenient as that would be. So we
end up forking ourselves as a subprocess. But that's not too bad
because we need to carry a subprocess-based entrypoint anyways
for the Anaconda %post case.

Implementation details: /etc/resolv.conf

There's some surprising stuff going on in how Anaconda handles
/etc/resolv.conf in the target root that I got burned by. In
Fedora it's trying to query if systemd-resolved is enabled in
the target or something?

I ended up writing some code to just try to paper over this
to ensure we have networking in the %post where we need
it to fetch LBIs.
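
For a sense of what such a check might look like, here is a rough shell sketch that decides whether a resolv.conf is usable before trying to pull images. It is an assumption-laden illustration rather than the PR's logic: the function name resolv_conf_usable is made up, and "usable" is defined here as simply "resolves to a real file with at least one nameserver line".

```bash
# Hypothetical helper, not taken from the PR.
# Succeeds if the given resolv.conf exists and names at least one nameserver.
resolv_conf_usable() {
    local path="${1:-/etc/resolv.conf}"
    # -e follows symlinks, so a dangling link (e.g. pointing at a
    # systemd-resolved stub that does not exist here) fails this test.
    [ -e "$path" ] || return 1
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "$path"
}
```

A %post script could call this and fall back to some other source of DNS configuration when it fails; what that fallback should be is left open here.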

Signed-off-by: Colin Walters [email protected]


@cgwalters
Collaborator Author

cgwalters commented Nov 20, 2024

This seems to be working well in my hand-rolled testing.
(BTW I am testing via virt-manager/virt-manager#739 (comment) )

Still TODO:

@cgwalters
Collaborator Author

OK I've rebased this on top of #860 and we successfully pull LBIs at Anaconda install time too now.

@cgwalters
Collaborator Author

There were a surprising number of things I hit. One of them, for example, is that Anaconda's hand-rolled chroot/container doesn't mount cgroupfs, which makes podman quite unhappy, so we mount it manually in https://github.com/containers/bootc/pull/915/files#diff-66bc72c28514e2546fbe456aee74a321866d5a9147136ef99251eec1e08be8ddR107

@cgwalters cgwalters force-pushed the install-fixup branch 2 times, most recently from 21467e0 to 4b111dd Compare November 24, 2024 15:04
@cgwalters cgwalters changed the title install: Add hidden ensure-completion verb install: Add ensure-completion verb, wire up ostree-deploy → bootc Nov 24, 2024
@cgwalters cgwalters marked this pull request as ready for review November 25, 2024 12:59
@cgwalters
Collaborator Author

The plus side of this PR as it stands is that it carries near-zero risk unless explicitly turned on in one of two places. The Anaconda %post path is obviously opt-in. And the extensions to the ostree container image deploy path also only turn on when bootc takes that entrypoint over from e.g. rpm-ostree, which provides it today.

I've given this a fair bit of manual testing, but I think what will help here is to get this into e.g. Fedora rawhide and that'll get things running through the daily integration testing.

@cgwalters
Collaborator Author

https://github.com/cgwalters/playground/blob/main/202411/inst.sh and inst.ks are how I was testing this

@omertuc
Contributor

omertuc commented Nov 26, 2024

> cgwalters/playground@main/202411/inst.sh and inst.ks are how I was testing this

Broken link

@cgwalters
Collaborator Author

> Broken link

Hmm, does it work for anyone else?

@omertuc
Contributor

omertuc commented Nov 26, 2024

I think the repo is just private

@cgwalters
Collaborator Author

cgwalters commented Nov 26, 2024

Duh, sorry, here are the raw scripts. But yes, ideally these bits are generalized into virt-manager/virt-manager#739

inst.sh:

#!/bin/bash
set -xeuo pipefail

# Installer ISO to boot; swap the comment to test CentOS Stream 9 instead.
#cd=/var/srv/walters/machine-images/centos-stream9/CentOS-Stream-9-20241118.0-x86_64-boot.iso
cd=/var/srv/walters/machine-images/fedora/Fedora-Everything-netinst-x86_64-Rawhide-20241117.n.0.iso
# Inject the kickstart into the initrd and share the host's /var/srv with
# the guest over virtiofs (which needs shared, memfd-backed guest memory).
virt-install --connect qemu:///session \
--name anaconda-bootc-test \
--vcpus 4 \
--ram 8192 \
--disk size=40 \
--location "${cd}" \
--os-variant rhel9.4 \
--initrd-inject=./inst.ks \
--memorybacking=source.type=memfd,access.mode=shared \
--filesystem=/var/srv/,host-var-srv,driver.type=virtiofs \
--noautoconsole \
--extra-args "inst.ks=file:/inst.ks"

inst.ks:

%pre --erroronfail
mkdir -p /mnt/host-var-srv
mount -t virtiofs -o ro host-var-srv /mnt/host-var-srv
# This code actually takes over things
cp -p /mnt/host-var-srv/walters/bootc.fedora /usr/bin/bootc
ln -sfr /usr/bin/bootc /usr/libexec/libostree/ext/ostree-container
%end

# Basic setup
text
network --bootproto=dhcp --device=link --activate
# Basic partitioning
clearpart --all --initlabel --disklabel=gpt
reqpart --add-boot
part / --grow --fstype xfs

ostreecontainer --transport oci --url /mnt/host-var-srv/walters/oci:bootc-lbi

rootpw <password>
sshkey --username root "<key>"
#reboot

%post
curl -L --head https://quay.io
%end

%post --erroronfail
bootc install ensure-completion
%end

@cgwalters
Collaborator Author

Rebased 🏄

Labels
area/install Issues related to `bootc install`