This document captures the design decisions made after discussion in issues. When a design issue is closed, the conclusion should be summarized here with a link to the issue.
- OSTree Delivery Format
- Release Streams
- Disk Layout
- Approach towards shipping Python
- Identification in `/etc/os-release`
- Firewall management
- Cloud Agents
- Supported Ignition Versions
- Configuration Language and Transpiler
- Security policies
- Bucket layout
- Version numbers
- Originally discussed in issue #23.
Three models have been proposed for delivering content to end-user systems:
- OSTree Repo: OSTree commits are stored in an OSTree repo (like a git repo) on a server and fetched via HTTP requests.
- rojig: Uses a special rojig RPM and reassembles the OSTree commit from RPMs already on mirrors.
- OCI: OSTree commits are packaged up in OCI container images and delivered via a container registry.
Currently the plan in Fedora CoreOS is to deliver content via a plain OSTree Repo and augment our strategy with either rojig or OCI if it proves useful or necessary.
- Originally discussed in #22.
Fedora CoreOS will have several refs for use on production machines. At any given time, each ref will be downstream of a particular Fedora branch, and will consist of a snapshot of Fedora packages plus occasionally a backported fix.
- `testing`: Periodic snapshot of the current Fedora release plus Bodhi `updates`.
- `stable`: Promotion of a `testing` release, including any needed fixes.
- `next`: The `next` stream represents the future. It will often be used to experiment with new features and also test out rebases of our platform on top of the next major version of Fedora. See Major Fedora Version Rebases for more info.
All of these refs will be unversioned, in the sense that their names will not include the current Fedora major version. The stream cadences are not contractual, but will initially have two weeks between releases. The stream maintenance policies are also not contractual and may evolve from those described above, but changes will preserve the use cases and intended stability of each stream.
Users will be encouraged to run most of their production systems on `stable`, and a few percent of their systems on each of `next` and `testing` to catch regressions before they reach `stable`.
Development for the next `testing` and `next` releases will occur in development refs. These refs will be public, but will be stored in a different ostree repo from production refs.
- `testing-devel`: Nightly build of the package set that will be snapshotted for the next `testing` release.
- `next-devel`: Nightly build of the package set that will be snapshotted for the next `next` release.
There will also be some additional unversioned refs for the convenience of Fedora CoreOS developers. These will be public and stored in the same ostree repo as development refs. Unlike production and development refs, mechanical refs are not curated; they're simply a snapshot of the corresponding Bodhi repos, with no package pinning and no backports of fixes. None of these refs are contractual; they might go away if we don't find them useful.
- `rawhide`: Nightly snapshot of rawhide.
- `branched`: Nightly snapshot of the upcoming Fedora release after it is branched.
Due to the promotion structure described above, `stable` can contain packages that are as much as four weeks out of date. Sometimes, however, there will be an important bugfix or security fix that cannot wait a month to reach `stable` (or two weeks to reach `next` or `testing`). In that case, the fix will be incorporated into out-of-cycle releases on affected streams. These releases will not affect the regular promotion schedules; for example, a fix might sit in `testing` for only a few days before it is promoted to `stable`.
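The "four weeks" figure can be reproduced with a little date arithmetic. The sketch below is only an illustration: it assumes the initial two-week cadence stated earlier, that a snapshot ships in `testing` right away, and that it is promoted to `stable` exactly one cycle later. The function name is hypothetical.

```python
from datetime import timedelta

# Initial release cadence described above; it is explicitly not contractual.
RELEASE_CADENCE = timedelta(weeks=2)


def max_stable_package_age(cadence=RELEASE_CADENCE):
    """Worst-case age of a package set on stable, under the assumptions above."""
    # A testing release waits one cycle before being promoted to stable.
    promotion_delay = cadence
    # That stable release then remains the latest one for a further cycle.
    time_as_latest = cadence
    return promotion_delay + time_as_latest


print(max_stable_package_age().days)  # 28
```

An out-of-cycle release shortens this window for the affected fix without changing the regular schedule.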
If a fix is important enough for an out-of-cycle `stable` release, other affected release streams should be updated as well.
In some cases it may make sense to apply a fix to `testing` but not issue an out-of-cycle release, allowing the fix to be picked up automatically when `testing` promotes to `stable`.
The release process integrates with Fedora's release milestones in the following ways:
- Fedora Beta Release
  - The `next` stream is switched over to the new release.
- Fedora Final Freeze
  - The `next` stream switches to weekly releases to closely track the GA content set.
- Fedora General Availability
  - Fedora CoreOS re-orients its release schedule in the following way:
    - Week -1 (Fedora "Go" Decision): `next` release:
      - `next` release with final Fedora GA content
    - Week 0 (GA release): triple release:
      - `testing` release promoted from previous `next`
      - `next` release contains latest Fedora N content, including Bodhi updates
    - Week 2: triple release:
      - `stable` release promoted from previous `testing`, now fully rebased to Fedora N
      - `testing` and `next` are now in sync
We have a checklist to track the exact steps followed during a rebase.
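The General Availability timeline above can be turned into concrete dates. This is a minimal sketch rather than project tooling: the function name and the GA date used in the example are hypothetical, and the week offsets are copied from the list above.

```python
from datetime import date, timedelta

# Week offsets relative to Fedora GA, as listed above.
REBASE_MILESTONES = [
    (-1, 'Fedora "Go" decision: next release with final Fedora GA content'),
    (0, "GA release: testing promoted from previous next"),
    (2, "stable promoted from previous testing; testing and next in sync"),
]


def rebase_schedule(ga_date):
    """Return (date, description) pairs for a given Fedora GA date."""
    return [
        (ga_date + timedelta(weeks=offset), description)
        for offset, description in REBASE_MILESTONES
    ]


# Hypothetical GA date, used only for illustration.
for when, what in rebase_schedule(date(2020, 4, 28)):
    print(when.isoformat(), what)
```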
Because production refs are unversioned, users will seamlessly upgrade between Fedora major releases, so compatibility must be maintained. Removal of functionality will require explicitly announced deprecations, potentially with long deprecation windows.
- Originally discussed in issue #18. See also dustymabe's comment summarizing the discussion in the FCOS meeting.
- Filesystem details were discussed in #33. We will use XFS by default.
- FCOS will use a "dd-able" image and ship with a standard partition layout.
- The bare metal image and cloud images have the same layout.
- The `/var` and root (`/`) filesystems will be XFS by default.
- Anaconda will not be used for installation.
- FCOS should not use the GPT generator.
- LVM should be supported, but not used by default.
- Ignition will be used to customize disk layouts.
FCOS should have a fixed partition layout that Ignition can modify on first boot. The installer will be similar to the Container Linux installer; the core of it will be dd'ing an image to the disk.
The partition layout will support "dual EFI/BIOS" on x86_64, and will have a single root partition as XFS by default. We will support changing the root filesystem storage (but not `/boot`) via Ignition.
- What do we do about 4k sector disks? We could make a "hybrid" disk image, but it technically breaks the GPT spec and may not work with poorly implemented UEFIs.
- What is the exact partition layout?
- Do we make /etc a ro bind mount?
- Originally discussed in #32.
TL;DR
The Fedora CoreOS group would strongly prefer not to ship Python. If we decide to keep one or more tools in Fedora CoreOS that depend on Python, we should use an approach that makes Python available only to the operating system and not to end users.
Note that this does not say we will ship python.
Details
Container Linux has not shipped Python in the past. Fedora is Python-heavy, so Fedora Atomic Host has shipped Python in the past. We've identified several reasons not to ship Python in Fedora CoreOS:
1. prevent users from running scripts directly on the host
2. prevent shipping/maintaining python
3. prevent issues where user's python script needs library X that isn't installed
4. prevent security issues in python requiring a respin
5. less space used on disk + less data transmitted for updates
6. better perception we're a minimal OS
Out of those we decided `#1` and `#3` were our primary concerns with shipping python. For `#4` we determined there was not a significant number of security issues to make shipping python prohibitive. We can achieve the goals for `#1` and `#3` by shipping a system python that is only accessible to operating system tools and not to end users.
Originally discussed in #21.
We will identify a Fedora CoreOS server using the `ID=fedora` and `VARIANT_ID=coreos` fields in the `/etc/os-release` file.
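As an illustration of how a script might perform this check, here is a minimal sketch. The function names are hypothetical, and the parser handles only simple `KEY=value` lines with optional shell-style quoting; it is not a complete implementation of the os-release format.

```python
import shlex


def parse_os_release(text):
    """Parse KEY=value lines from os-release content into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be shell-quoted, e.g. NAME="Fedora".
        parts = shlex.split(value)
        fields[key] = parts[0] if parts else ""
    return fields


def is_fedora_coreos(fields):
    """True when both identifying fields match the values described above."""
    return fields.get("ID") == "fedora" and fields.get("VARIANT_ID") == "coreos"


# On a real host, the text would be read from /etc/os-release.
sample = 'NAME="Fedora"\nID=fedora\nVARIANT_ID=coreos\n'
print(is_fedora_coreos(parse_os_release(sample)))  # True
```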
Originally discussed in #26.
- FCOS will ship without any ad-hoc filtering rules. By default, nodes will boot without a firewall.
- Components for both iptables and nft filtering will be provided (namely `iptables`, `nftables`, and `iptables-nft` packages, plus related kernel modules).
- It will be possible to set up static rules (i.e. meant to be valid and unchanged for the whole node lifetime) via Ignition.
- Dynamic rules (i.e. mutable at runtime) are out of scope for FCOS's own tooling. Container runtimes and orchestrators take ownership of those via their own (containerized) rules managers.
Originally discussed in #12.
- Whenever possible, FCOS will not ship cloud agents.
- Some clouds require the OS to perform tasks like signaling boot completion. For those we will re-implement that functionality in Ignition or coreos-metadata.
- For the short term, if we need to include an agent we will bake it into the image. We will not have any specific mechanism for including agents.
Originally discussed in #66.
- AWS does not require a cloud agent but does require NVMe EBS udev rules.
- The udev rules and script will be packaged in an RPM and included in FCOS, with work being tracked in #104.
Originally discussed in #65.
- We've identified one major gap with not shipping the Microsoft Azure Linux Agent: the machine will not check in and will eventually be culled by Azure for being stuck in the creation process.
- This gap will be covered by work done in coreos-metadata.
- One additional gap which will not be covered is a lack of ephemeral disk support. We plan to ship udev rules but will not have a service which formats the disk unless we receive feature requests in the future. This was discussed in #97.
- As a cosmetic issue, we should also ship a rule to ignore SR-IOV interfaces.
Originally discussed in #71.
- DigitalOcean has an agent that provides instance metrics back to DO. We will not ship it.
- DigitalOcean does not generally offer DHCP. Network configuration is obtained from an HTTP metadata service on a link-local address. On other platforms this is handled by cloud-init.
- Networking should be configured by coreos-metadata running in the initramfs, but coreos-metadata may need to learn to configure NetworkManager or nm-state depending on the outcome of #24.
Originally discussed in #68.
- OpenStack environments do not require a cloud agent.
- Any required base-level functionality will be provided by Ignition and coreos-metadata.
Originally discussed in #69.
- On the first boot, Packet requires the machine to phone home to report a successful boot. This will be handled by coreos-metadata.
- Packet provides the IPv4 public address via DHCP, allowing a machine to acquire network via standard mechanisms. However, to obtain a private IPv4 address or a public IPv6 address (on the same interface), networking must be configured using metadata from an HTTP metadata service. This can be handled by coreos-metadata in the initramfs, but it may need to learn to configure NetworkManager or nm-state depending on the outcome of #24.
- Packet needs the serial console on x86 to be directed to `ttyS1`, not `ttyS0`, requiring cloud-specific bootloader configuration. A different serial console configuration is required on ARM64.
- On many Linux OSes, Packet sets a randomized root password which is then available from the Packet console for 24 hours. This allows the serial (SOS) console to be used for interactive debugging. Container Linux, instead, enables autologin on the console by default. To avoid surprising users, Fedora CoreOS will do neither. For interactive console access, users can use Ignition to enable autologin or to set a password on the `core` account, and we'll document how to do that.
- What do we do about VMware, which has a very involved and intrusive "agent"?
Originally discussed in #31.
- FCOS will only support Ignition spec 3.0.0 and up.
- Ignition spec 3.0.0 will break compatibility with spec 2.x.y, although most configs will only require minor changes.
- Tooling should exist to aid converting 2.x.y configs to 3.0.0 configs, although perfect automated translation will not be possible.
Fedora CoreOS will have a configuration language similar to the Container Linux Configuration Language named the Fedora CoreOS Configuration Language (FCCL). There will be a tool, the Fedora CoreOS Configuration Transpiler (FCCT) to convert Fedora CoreOS Configs (FCCs) to Ignition configs.
The FCCL will be versioned using semver, similar to how the Ignition spec is versioned. FCCT will accept all versions of the FCCL. Each FCCL version will target exactly one Ignition spec version. This means:
- Old FCCs will continue to work with new versions of FCCT without modification.
- Each FCCL version will always emit the same version of Ignition config, regardless of what version of FCCT was used to transpile it.
- Since FCOS will accept old (down to 3.0.0) versions of Ignition configs, old FCCs will continue to work with new FCOS releases without modification.
- To use new features in new FCCT releases, users must update their configs to use the new FCCL spec.
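The one-to-one relationship between FCCL versions and Ignition spec versions can be pictured as a fixed lookup table. The sketch below is not FCCT code: the function names are hypothetical and the version pairs in the table are illustrative assumptions, not a statement of which versions exist.

```python
# Hypothetical FCCL -> Ignition spec mapping; the pairs are illustrative only.
FCCL_TO_IGNITION = {
    "1.0.0": "3.0.0",
    "1.1.0": "3.1.0",
}

# FCOS only supports Ignition spec 3.0.0 and up.
MIN_SUPPORTED_IGNITION = (3, 0, 0)


def target_ignition_version(fccl_version):
    """Each FCCL version targets exactly one Ignition spec version."""
    try:
        return FCCL_TO_IGNITION[fccl_version]
    except KeyError:
        raise ValueError(f"unknown FCCL version: {fccl_version}") from None


def accepted_by_fcos(ignition_version):
    """True when the Ignition spec version is 3.0.0 or newer."""
    parsed = tuple(int(part) for part in ignition_version.split("."))
    return parsed >= MIN_SUPPORTED_IGNITION


print(target_ignition_version("1.0.0"))  # 3.0.0
```

Because the table only ever grows, an old config keeps producing the same Ignition version no matter how new the transpiler is.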
Originally discussed in #114.
We will not enable autologin on serial or VGA consoles by default, even on platforms (e.g. Azure, DigitalOcean, GCP, Packet) which provide authenticated console access. Doing so would provide an access vector that could surprise users unfamiliar with their platform's console access mechanism and access control policy. For users who wish to use the console for debugging, we will provide documentation for using Ignition to enable autologin or to set a user password.
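For illustration, an Ignition config that sets a password on the `core` account might be generated as follows. This is a sketch, not the promised documentation: the helper name is hypothetical, and the hash shown is a placeholder that must be replaced with the output of a real password-hashing tool.

```python
import json


def console_password_config(password_hash):
    """Build a minimal Ignition spec 3.0.0 config setting a password for core."""
    return {
        "ignition": {"version": "3.0.0"},
        "passwd": {
            "users": [
                {"name": "core", "passwordHash": password_hash},
            ]
        },
    }


# Placeholder value; never put a plain-text password here.
config = console_password_config("<hashed password goes here>")
print(json.dumps(config, indent=2))
```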
Originally discussed in #181.
There have been multiple rounds of CPU vulnerabilities (L1TF and MDS) which cannot be completely mitigated without disabling Simultaneous Multi-Threading on affected processors. Disabling SMT has a cost: it reduces system performance and changes the apparent number of processors on the system. However, enabling SMT on affected systems would be an insecure default.
By default, Fedora CoreOS will configure the kernel to disable SMT on vulnerable machines. This conditional approach avoids incurring the performance cost on systems that aren't vulnerable. However, it fails to protect systems affected by undisclosed SMT vulnerabilities, and it allows future OS updates to disable SMT without notice if new vulnerabilities become known.
We will document this policy and its consequences, and provide instructions for unconditionally enabling or disabling SMT for users who prefer a different policy.
Originally discussed in #189.
The `fcos-builds` bucket, fronted by http://builds.coreos.fedoraproject.org/, will be structured as follows:
    /
        prod/
            streams/
                stable/
                    releases.json
                    builds/
                        builds.json
                        30.1234-5/
                            release.json
                            x86_64/
                                meta.json
                                commitmeta.json
                                fedora-coreos-30.8-qemu.x86_64.qcow2.gz
                                ostree-commit-object
                                ostree-commit.tar
                                ...
                            ppc64le/
                                ...
                        ...
                testing/
                next/
                ...
        streams/
            stable.json
            testing.json
            ...
The artifacts under e.g. `30.1234-5/x86_64/` come directly from coreos-assembler. The `/streams/*.json`, `release.json`, and `releases.json` are higher-level generated metadata objects. See #98 and #207 for more information about those.
The stream metadata format (under `/streams`) is intended to be stable, and stream metadata objects will contain links to artifacts in the release bucket. Everything else about the bucket layout, including its directory structure and the formats of other metadata objects, is subject to change without notice. Third-party tooling should not rely on this structure, and should instead read metadata and artifact URLs directly from stream metadata at the officially documented URL.
Originally discussed in #81 and #211.
Fedora CoreOS versions will have the form `X.Y.Z.A`:
- X is the Fedora major version, e.g. `31`.
- Y is the datestamp that the package set was snapshotted from Fedora, e.g. `20191014`. For mechanical streams, this is the build date. For development and production streams, it's the date of the snapshot that was promoted.
- For official builds, Z is a code number corresponding to the stream:
Stream | Z version |
---|---|
next | 1 |
testing | 2 |
stable | 3 |
next-devel | 10 |
testing-devel | 20 |
rawhide | 91 |
branched | 92 |
bodhi-updates-testing | 93 |
bodhi-updates | 94 |
For developer builds (those not produced by the official pipeline), Z is always `dev`.
These Z codes were chosen to make production versions short and simple, development versions clearly related to production versions, and mechanical versions clearly separated into a distinct group.
- A is a revision number, which starts at 0 and is incremented for each new build with the same X.Y.Z parameters as an existing build.
Some examples:
Stream | Version | Comment |
---|---|---|
next | 32.20191018.1.0 | F32-based, first release from this snapshot |
testing | 31.20191018.2.1 | F31-based, second release from this snapshot |
stable | 31.20191001.3.1 | Second stable release from the 20191001 snapshot |
next-devel | 31.20191018.10.10 | 11th build of the day |
testing-devel | 31.20191018.20.0 | |
rawhide | 33.20191018.91.0 | F33-based, first build of the day |
branched | 32.20191018.92.0 | |
bodhi-updates-testing | 31.20191018.93.0 | |
bodhi-updates | 31.20191018.94.0 | |
(any developer build) | 31.20191018.dev.2 | Third build of the day |
We are not committing to this version scheme indefinitely, and may change it in future if it proves unworkable. A new Fedora major release (X bump) would be a good time to make such a change. We don't intend Fedora CoreOS version numbers to be parsed by machine; they're meant to help humans quickly determine the salient properties of a release.
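To make the scheme concrete, the following sketch composes a version string from its four parts. It deliberately builds versions rather than parsing them, in keeping with the note above. The helper name and its parameters are hypothetical; the Z codes are copied from the stream table.

```python
# Z codes for official builds, copied from the stream table above.
STREAM_CODES = {
    "next": 1,
    "testing": 2,
    "stable": 3,
    "next-devel": 10,
    "testing-devel": 20,
    "rawhide": 91,
    "branched": 92,
    "bodhi-updates-testing": 93,
    "bodhi-updates": 94,
}


def format_version(fedora_major, datestamp, stream, revision, official=True):
    """Compose an X.Y.Z.A version string (illustrative helper)."""
    # Developer builds always use "dev" for Z, whatever the stream.
    z = str(STREAM_CODES[stream]) if official else "dev"
    return f"{fedora_major}.{datestamp}.{z}.{revision}"


print(format_version(32, "20191018", "next", 0))         # 32.20191018.1.0
print(format_version(31, "20191018", "next-devel", 10))  # 31.20191018.10.10
```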