Skip to content

Commit

Permalink
sysroot: Rework /var handling to act like Docker VOLUME /var
Browse files Browse the repository at this point in the history
We've long struggled with semantics for `/var`.  Our stance of
"/var should start out empty and be managed by the OS" is a strict
one, that pushes things closer to the original systemd upstream
ideal of the "OS state is in /usr".

However...well, a few things.  First, we had some legacy bits
here which were always populating the deployment `/var`.  I don't
think we need that if systemd is in use, so detect if the tree
has `usr/lib/tmpfiles.d`, and don't create that stuff at
`ostree admin stateroot-init` time if so.

Building on that then, we have the stateroot `var` starting out
actually empty.

When we do a deployment, if the stateroot `var` is empty,
make a copy (reflink if possible of course) of the commit's `/var`
into it.

This matches the semantics that Docker created with volumes,
and this is sufficiently simple and easy to explain that I think
it's closer to the right thing to do.

Crucially...it's just really handy to have some pre-existing
directories in `/var` in container images, because Docker (and podman/kube/etc)
don't run systemd and hence don't run `tmpfiles.d` on startup.

I really hit on the fact that we need `/var/tmp` in our container
images by default for example.

So there's still some overlap here with e.g. `/usr/lib/tmpfiles.d/var.conf`
as shipped by systemd, but that's fine - they don't actually conflict
per se.
  • Loading branch information
cgwalters committed Feb 9, 2024
1 parent 1c18bd2 commit f81b9fa
Show file tree
Hide file tree
Showing 9 changed files with 238 additions and 54 deletions.
1 change: 1 addition & 0 deletions Makefile-tests.am
Original file line number Diff line number Diff line change
Expand Up @@ -105,6 +105,7 @@ _installed_or_uninstalled_test_scripts = \
tests/test-admin-deploy-syslinux.sh \
tests/test-admin-deploy-bootprefix.sh \
tests/test-admin-deploy-composefs.sh \
tests/test-admin-deploy-var.sh \
tests/test-admin-deploy-2.sh \
tests/test-admin-deploy-karg.sh \
tests/test-admin-deploy-switch.sh \
Expand Down
3 changes: 2 additions & 1 deletion configure.ac
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,8 @@ LT_INIT([disable-static])
dnl We have an always-on feature now to signify the fix for
dnl https://github.com/ostreedev/ostree/pull/2874/commits/de6fddc6adee09a93901243dc7074090828a1912
dnl "commit: fix ostree deployment on 64-bit inode fs"
OSTREE_FEATURES="inode64"
dnl initial-var signifies this version of ostree propagates /var
OSTREE_FEATURES="inode64 initial-var"
AC_SUBST([OSTREE_FEATURES])

GLIB_TESTS
Expand Down
27 changes: 26 additions & 1 deletion docs/var.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,31 @@ nav_order: 6
1. TOC
{:toc}

## Default commit/image /var handling

As of OSTree 2024.3, when a commit is "deployed" (queued to boot),
the initial content of `/var` in a commit will be placed into the
"stateroot" (default `var`) if the stateroot `var` is empty.

The semantics of this are intended to match that of Docker "volumes";
consider that ostree systems have the equivalent of
`VOLUME /var`
by default.

It is still strongly recommended to use systemd `tmpfiles.d` snippets
to populate directory structure and the like in `/var` on firstboot,
because this is more resilent.

Even better, use `StateDirectory=` for systemd units.

### ostree container /var

Some earlier versions of the ostree-container stack migrated content in `/var`
in container images into `/usr/share/factory/var` (per below). This has
been reverted, and the semantics defer to the above ostree semantic.

## Previous /var handling via /usr/share/factory/var

As of OSTree 2023.8, the `/usr/lib/tmpfiles.d/ostree-tmpfiles.conf` file gained this snippet:

```text
Expand All @@ -18,7 +43,7 @@ As of OSTree 2023.8, the `/usr/lib/tmpfiles.d/ostree-tmpfiles.conf` file gained
C+! /var - - - - -
```

This is inert by default. However, there is a pending change in the ostree-container stack which will move all files in `/var` from fetched container images into `/usr/share/factory/var`. And other projects in the ostree ecosystem are now recommended do this by default.
This is inert by default. As of version 0.13 of the ostree-ext project, content in `/var` in fetched container images is moved to `/usr/share/factory/var`. This is no longer recommended.

Together, this will have the semantic that on OS updates, on the next boot (early in boot), any new files/directories will be copied. For more information on this, see [`man tmpfiles.d`](https://man7.org/linux/man-pages/man5/tmpfiles.d.5.html).

Expand Down
4 changes: 2 additions & 2 deletions src/boot/ostree-tmpfiles.conf
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,6 @@ d /run/ostree 0755 root root -
# https://github.com/ostreedev/ostree/issues/393
R! /var/tmp/ostree-unlock-ovl.*
# Automatically propagate all /var content from /usr/share/factory/var;
# the ostree-container stack is being changed to do this, and we want to
# encourage ostree use cases in general to follow this pattern.
# NOTE: This is now considered a mistake, and will likely be reverted.
# As of OSTree 2024.3, content from the initial deployment is used.
C+! /var - - - - -
96 changes: 74 additions & 22 deletions src/libostree/ostree-sysroot-deploy.c
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,9 @@
#include "ostree.h"
#include "otcore.h"

// The path to the systemd tmpfiles.d directory.
#define USRLIB_TMPFILES "usr/lib/tmpfiles.d"

#ifdef HAVE_LIBSYSTEMD
#define OSTREE_VARRELABEL_ID \
SD_ID128_MAKE (da, 67, 9b, 08, ac, d3, 45, 04, b7, 89, d9, 6f, 81, 8e, a7, 81)
Expand Down Expand Up @@ -657,6 +660,7 @@ checkout_deployment_tree (OstreeSysroot *sysroot, OstreeRepo *repo, OstreeDeploy

/* Generate hardlink farm, then opendir it */
OstreeRepoCheckoutAtOptions checkout_opts = { .process_passthrough_whiteouts = TRUE };

if (!ostree_repo_checkout_at (repo, &checkout_opts, osdeploy_dfd, checkout_target_name, csum,
cancellable, error))
return FALSE;
Expand Down Expand Up @@ -3088,33 +3092,81 @@ _ostree_deployment_set_bootconfig_from_kargs (OstreeDeployment *deployment,
}
}

// Perform some basic static analysis and emit warnings for things
// that are likely to fail later. This function only returns
// a hard error if something unexpected (e.g. I/O error) occurs.
// If the stateroot /var is uninitialized, copy the /var content from the deployment.
// This is intended to mirror the semantics of Docker volumes.
static gboolean
lint_deployment_fs (OstreeSysroot *self, OstreeDeployment *deployment, int deployment_dfd,
GCancellable *cancellable, GError **error)
prepare_deployment_var (OstreeSysroot *self, OstreeDeployment *deployment, int deployment_dfd,
GCancellable *cancellable, GError **error)
{
g_auto (GLnxDirFdIterator) dfd_iter = {
0,
};
gboolean exists;
GLNX_AUTO_PREFIX_ERROR ("Preparing deployment /var", error);

// Does the deployment have a var? If not, we're done. (Though this will probably
// cause problems at boot time)
{
g_auto (GLnxDirFdIterator) dfd_iter = {
0,
};
gboolean exists;

if (!ot_dfd_iter_init_allow_noent (deployment_dfd, "var", &dfd_iter, &exists, error))
if (!ot_dfd_iter_init_allow_noent (deployment_dfd, "var", &dfd_iter, &exists, error))
return FALSE;
if (!exists)
{
g_debug ("deployment has no /var");
return TRUE;
}
}

// Open the stateroot which holds the shared /var
const char *stateroot = ostree_deployment_get_osname (deployment);
g_autofree char *stateroot_path = g_build_filename ("ostree/deploy/", stateroot, "var", NULL);
glnx_autofd int stateroot_dfd = -1;
if (!glnx_opendirat (self->sysroot_fd, stateroot_path, FALSE, &stateroot_dfd, error))
return glnx_prefix_error (error, "Opening stateroot");

// Check if the stateroot is empty
{
g_auto (GLnxDirFdIterator) dfd_iter = {
0,
};

if (!glnx_dirfd_iterator_init_at (stateroot_dfd, ".", FALSE, &dfd_iter, error))
return FALSE;

struct dirent *dent;

if (!glnx_dirfd_iterator_next_dent (&dfd_iter, &dent, cancellable, error))
return FALSE;

if (dent != NULL)
{
// We found existing content in the stateroot var, so we're done.
// There is no merge semantics.
g_debug ("Stateroot %s is non-empty", stateroot);
return TRUE;
}
}

g_debug ("Copying initial deployment /var");
// At this point we should initialize the stateroot var with the content from
// the commit/image. Note we need to force a copy; hopefully reflinks are available.
OstreeRepoCheckoutAtOptions co_opts
= { .force_copy = TRUE,
.subpath = "/var",
.overwrite_mode = OSTREE_REPO_CHECKOUT_OVERWRITE_UNION_FILES };
if (!ostree_repo_checkout_at (self->repo, &co_opts, stateroot_dfd, ".",
ostree_deployment_get_csum (deployment), cancellable, error))
return FALSE;
while (exists)
{
struct dirent *dent;

if (!glnx_dirfd_iterator_next_dent (&dfd_iter, &dent, cancellable, error))
if (!glnx_fstatat_allow_noent (deployment_dfd, USRLIB_TMPFILES, NULL, AT_SYMLINK_NOFOLLOW, error))
return glnx_prefix_error (error, "Querying %s", USRLIB_TMPFILES);
if (errno == ENOENT)
{
g_debug ("deployment has no %s", USRLIB_TMPFILES);
// OK, this OS doesn't appear to use systemd (or tmpfiles.d at least). For full
// backwards compatibility we create some standard things in the stateroot var.
if (!_ostree_sysroot_stateroot_legacy_var_init (stateroot_dfd, error))
return FALSE;
if (dent == NULL)
break;

fprintf (stderr,
"note: Deploying commit %s which contains content in /var/%s that should be in "
"/usr/share/factory/var\n",
ostree_deployment_get_csum (deployment), dent->d_name);
}

return TRUE;
Expand Down Expand Up @@ -3182,7 +3234,7 @@ sysroot_initialize_deployment (OstreeSysroot *self, const char *osname, const ch
if (!prepare_deployment_etc (self, repo, new_deployment, deployment_dfd, cancellable, error))
return FALSE;

if (!lint_deployment_fs (self, new_deployment, deployment_dfd, cancellable, error))
if (!prepare_deployment_var (self, new_deployment, deployment_dfd, cancellable, error))
return FALSE;

ot_transfer_out_value (out_new_deployment, &new_deployment);
Expand Down
2 changes: 2 additions & 0 deletions src/libostree/ostree-sysroot-private.h
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,8 @@ char *_ostree_sysroot_get_deployment_backing_relpath (OstreeDeployment *deployme
gboolean _ostree_sysroot_rmrf_deployment (OstreeSysroot *sysroot, OstreeDeployment *deployment,
GCancellable *cancellable, GError **error);

gboolean _ostree_sysroot_stateroot_legacy_var_init (int dfd, GError **error);

char *_ostree_sysroot_get_runstate_path (OstreeDeployment *deployment, const char *key);

gboolean _ostree_sysroot_run_in_deployment (int deployment_dfd, const char *const *bwrap_argv,
Expand Down
65 changes: 40 additions & 25 deletions src/libostree/ostree-sysroot.c
Original file line number Diff line number Diff line change
Expand Up @@ -1792,6 +1792,46 @@ ostree_sysroot_lock_finish (OstreeSysroot *self, GAsyncResult *result, GError **
return g_task_propagate_boolean ((GTask *)result, error);
}

// This is a legacy subset of what happens normally via systemd tmpfiles.d;
// it is only run in the case that the deployment it self comes without
// usr/lib/tmpfiles.d
gboolean
_ostree_sysroot_stateroot_legacy_var_init (int dfd, GError **error)
{
GLNX_AUTO_PREFIX_ERROR ("Legacy mode stateroot var initialization", error);

/* This is a bit of a legacy hack...but we have to keep it around
* now. We're ensuring core subdirectories of /var exist.
*/
if (!glnx_ensure_dir (dfd, "tmp", 0777, error))
return FALSE;

if (fchmodat (dfd, "tmp", 01777, 0) < 0)
return glnx_throw_errno_prefix (error, "fchmod %s", "var/tmp");

if (!glnx_ensure_dir (dfd, "lib", 0777, error))
return FALSE;

/* This needs to be available and properly labeled early during the boot
* process (before tmpfiles.d kicks in), so that journald can flush logs from
* the first boot there. https://bugzilla.redhat.com/show_bug.cgi?id=1265295
* */
if (!glnx_ensure_dir (dfd, "log", 0755, error))
return FALSE;

if (!glnx_fstatat_allow_noent (dfd, "run", NULL, AT_SYMLINK_NOFOLLOW, error))
return FALSE;
if (errno == ENOENT && symlinkat ("../run", dfd, "run") < 0)
return glnx_throw_errno_prefix (error, "Symlinking %s", "var/run");

if (!glnx_fstatat_allow_noent (dfd, "lock", NULL, AT_SYMLINK_NOFOLLOW, error))
return FALSE;
if (errno == ENOENT && symlinkat ("../run/lock", dfd, "lock") < 0)
return glnx_throw_errno_prefix (error, "Symlinking %s", "var/lock");

return TRUE;
}

/**
* ostree_sysroot_init_osname:
* @self: Sysroot
Expand Down Expand Up @@ -1823,31 +1863,6 @@ ostree_sysroot_init_osname (OstreeSysroot *self, const char *osname, GCancellabl
if (mkdirat (dfd, "var", 0777) < 0)
return glnx_throw_errno_prefix (error, "Creating %s", "var");

/* This is a bit of a legacy hack...but we have to keep it around
* now. We're ensuring core subdirectories of /var exist.
*/
if (mkdirat (dfd, "var/tmp", 0777) < 0)
return glnx_throw_errno_prefix (error, "Creating %s", "var/tmp");

if (fchmodat (dfd, "var/tmp", 01777, 0) < 0)
return glnx_throw_errno_prefix (error, "fchmod %s", "var/tmp");

if (mkdirat (dfd, "var/lib", 0777) < 0)
return glnx_throw_errno_prefix (error, "Creating %s", "var/lib");

/* This needs to be available and properly labeled early during the boot
* process (before tmpfiles.d kicks in), so that journald can flush logs from
* the first boot there. https://bugzilla.redhat.com/show_bug.cgi?id=1265295
* */
if (mkdirat (dfd, "var/log", 0755) < 0)
return glnx_throw_errno_prefix (error, "Creating %s", "var/log");

if (symlinkat ("../run", dfd, "var/run") < 0)
return glnx_throw_errno_prefix (error, "Symlinking %s", "var/run");

if (symlinkat ("../run/lock", dfd, "var/lock") < 0)
return glnx_throw_errno_prefix (error, "Symlinking %s", "var/lock");

if (!_ostree_sysroot_bump_mtime (self, error))
return FALSE;

Expand Down
14 changes: 11 additions & 3 deletions tests/kolainst/destructive/deployment-lint
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,16 @@ set -xeuo pipefail
require_writable_sysroot
prepare_tmpdir

mkdir -p rootfs/var/shouldntdothis/subdir
mkdir -p rootfs/var/testcontent
ostree commit -b testlint --no-bindings --selinux-policy-from-base --tree=ref="${host_refspec}" --consume --tree=dir=rootfs
ostree admin deploy testlint 2>err.txt
assert_file_has_content err.txt 'Deploying commit.*which contains content in /var/shouldntdothis'
echo "ok content in /var"
assert_not_file_has_content err.txt 'Deploying commit.*which contains content in /var/testcontent'
test '!' -d /var/testcontent
echo "ok deploy var"

ostree admin stateroot-init newstatedir
ostree admin deploy --stateroot=newstatedir testlint
ls -al /sysroot/ostree/deploy/newstatedir/var
test -d /sysroot/ostree/deploy/newstatedir/var/testcontent

echo "ok deploy var new stateroot"
80 changes: 80 additions & 0 deletions tests/test-admin-deploy-var.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
#!/bin/bash
#
# Copyright (C) 2024 Red Hat, Inc.
#
# SPDX-License-Identifier: LGPL-2.0+
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library. If not, see <https://www.gnu.org/licenses/>.

set -euox pipefail

. $(dirname $0)/libtest.sh

if ! echo "$OSTREE_FEATURES" | grep --quiet --no-messages "initial-var"; then
fatal missing initial-var
fi

# Exports OSTREE_SYSROOT so --sysroot not needed.
setup_os_repository "archive" "syslinux"

echo "initial ls"
ls -R sysroot/ostree/deploy/testos/var

cd osdata
mkdir -p var/lib/
echo somedata > var/lib/somefile
${CMD_PREFIX} ostree --repo=${test_tmpdir}/testos-repo commit -b testos/buildmain/x86_64-runtime
cd -
${CMD_PREFIX} ostree --repo=sysroot/ostree/repo pull-local --remote=testos testos-repo testos/buildmain/x86_64-runtime

${CMD_PREFIX} ostree admin deploy --os=testos testos:testos/buildmain/x86_64-runtime
ls -R sysroot/ostree/deploy/testos/var
assert_file_has_content sysroot/ostree/deploy/testos/var/lib/somefile somedata
# We don't have tmpfiles here yet
assert_not_has_dir sysroot/ostree/deploy/*.0/usr/lib/tmpfiles.d
if ${CMD_PREFIX} ostree --repo=sysroot/ostree/repo ls testos/buildmain/x86_64-runtime /var/log; then
fatal "var/log in commit"
fi
# This one is created via legacy init w/o tmpfiles.d
assert_has_dir sysroot/ostree/deploy/testos/var/log

tap_ok deployment var init

cd osdata
echo someotherdata > var/lib/someotherfile
${CMD_PREFIX} ostree --repo=${test_tmpdir}/testos-repo commit -b testos/buildmain/x86_64-runtime
cd -
${CMD_PREFIX} ostree --repo=sysroot/ostree/repo pull-local --remote=testos testos-repo testos/buildmain/x86_64-runtime
${CMD_PREFIX} ostree admin deploy --os=testos testos:testos/buildmain/x86_64-runtime
assert_not_has_file sysroot/ostree/deploy/testos/var/lib/someotherfile

tap_ok deployment var not updated

${CMD_PREFIX} ostree admin undeploy 0
${CMD_PREFIX} ostree admin undeploy 0
rm sysroot/ostree/deploy/testos/var/* -rf

cd osdata
mkdir -p usr/lib/tmpfiles.d
${CMD_PREFIX} ostree --repo=${test_tmpdir}/testos-repo commit -b testos/buildmain/x86_64-runtime
cd -
${CMD_PREFIX} ostree --repo=sysroot/ostree/repo pull-local --remote=testos testos-repo testos/buildmain/x86_64-runtime
${CMD_PREFIX} ostree admin deploy --os=testos testos:testos/buildmain/x86_64-runtime

# Not in the commit, so not created via legacy init because we have tmpfiles.d
assert_not_has_dir sysroot/ostree/deploy/testos/var/log

tap_ok deployment var w/o legacy

tap_end

0 comments on commit f81b9fa

Please sign in to comment.