centos-bootc/bootc-image-builder:latest build fedora-bootc:40 aarch64 ami image on x86_64 machine failed #619

Open
henrywang opened this issue Aug 24, 2024 · 6 comments

Comments

@henrywang
Member

henrywang commented Aug 24, 2024

Building a fedora-bootc:40 aarch64 AMI image on an x86_64 machine with centos-bootc/bootc-image-builder:latest failed.

Running the following command failed with the error shown below:

sudo podman run --rm -it --privileged --pull=newer --tls-verify=false --security-opt label=type:unconfined_t -v /var/lib/containers/storage:/var/lib/containers/storage --env AWS_ACCESS_KEY_ID=***** --env AWS_SECRET_ACCESS_KEY=***** quay.io/centos-bootc/bootc-image-builder:latest --type ami --target-arch aarch64 --aws-ami-name bootc-bib-fedora-40-aarch64-6pjc --aws-bucket bootc-bib-images-test --aws-region us-west-2 --rootfs xfs quay.io/bootc-test/*****:6pjc

org.osbuild.ostree.deploy.container: df91eb12effd7700c49a6285f6f8f991d0e4949bcfcb68fedf80de475fa37a40 {
  "osname": "default",
  "kernel_opts": [
    "rw",
    "console=tty0",
    "console=ttyS0"
  ],
  "target_imgref": "ostree-unverified-registry:quay.io/bootc-test/*****:6pjc",
  "rootfs": {
    "label": "root"
  },
  "mounts": [
    "/boot",
    "/boot/efi"
  ]
}
ostree container image deploy --imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]5e4fa291f946c587a8372bdeb862a2c3041ee8253824da387c7ef4ca9c8b1452 --stateroot=default --target-imgref=ostree-unverified-registry:quay.io/bootc-test/*****:6pjc --karg=rw --karg=console=tty0 --karg=console=ttyS0 --karg=root=LABEL=root --sysroot=/run/osbuild/tree
error: Performing deployment: Creating importer: Function not implemented (os error 38)
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 72, in <module>
    r = main(stage_args["tree"],
        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 67, in main
    ostree_container_deploy(tree, inputs, osname, target_imgref, kopts)
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 41, in ostree_container_deploy
    ostree.cli("container", "image", "deploy",
  File "/run/osbuild/lib/osbuild/util/ostree.py", line 205, in cli
    return subprocess.run(["ostree"] + args,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ostree', 'container', 'image', 'deploy', '--imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]5e4fa291f946c587a8372bdeb862a2c3041ee8253824da387c7ef4ca9c8b1452', '--stateroot=default', '--target-imgref=ostree-unverified-registry:quay.io/bootc-test/*****:6pjc', '--karg=rw', '--karg=console=tty0', '--karg=console=ttyS0', '--karg=root=LABEL=root', '--sysroot=/run/osbuild/tree']' returned non-zero exit status 1.

⏱  Duration: 26s
manifest - failed

bib image digest: sha256:86d0088e161db6a189d89065a233c83f8c4e63414f1ab50b4b36075ad60966db

Detailed log: https://artifacts.osci.redhat.com/testing-farm/6ba684cb-76ba-4034-b817-74990d9bcbd7/

@mvo5
Collaborator

mvo5 commented Aug 27, 2024

I looked at this today and it turns out to be another missing syscall in qemu-user:

$ git diff
diff --git a/osbuild/buildroot.py b/osbuild/buildroot.py
index 02b1b9f2..7dad22e8 100644
--- a/osbuild/buildroot.py
+++ b/osbuild/buildroot.py
@@ -306,6 +306,7 @@ class BuildRoot(contextlib.AbstractContextManager):
 
         # Setup a new environment for the container.
         env = {
+            "QEMU_LOG": "unimp",
             "container": "bwrap-osbuild",
             "LC_CTYPE": "C.UTF-8",
             "PATH": "/usr/sbin:/usr/bin",
$ sudo python3 -m osbuild --libdir . /tmp/c9-arm64.manifest  --output-directory /tmp/out --export image
...
Unsupported syscall: 437
error: Performing deployment: Creating importer: Function not implemented (os error 38)
...
$ scmp_sys_resolver -a aarch64 437
openat2

I will look into providing an upstream fix to qemu.
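As a rough aid for the debugging step above, here is a minimal Python sketch that collects the syscall numbers from `Unsupported syscall: N` lines in captured build output. The line format is taken from the log excerpt above; the function name `unsupported_syscalls` and the `sample` text are made up for illustration and are not part of osbuild or qemu.

```python
import re

# Line format as it appears in the log excerpt above when QEMU_LOG=unimp is set.
UNSUPPORTED_RE = re.compile(r"^Unsupported syscall: (\d+)$")


def unsupported_syscalls(log_text):
    """Return the sorted, de-duplicated syscall numbers reported as unsupported."""
    numbers = set()
    for line in log_text.splitlines():
        match = UNSUPPORTED_RE.match(line.strip())
        if match:
            numbers.add(int(match.group(1)))
    return sorted(numbers)


# Hypothetical captured output, modelled on the excerpt in this comment.
sample = """\
...
Unsupported syscall: 437
error: Performing deployment: Creating importer: Function not implemented (os error 38)
...
"""

print(unsupported_syscalls(sample))  # prints [437]
```

Each number can then be resolved to a name with `scmp_sys_resolver -a aarch64 <number>`, as done above.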

@mvo5
Collaborator

mvo5 commented Aug 28, 2024

I made some progress on qemu-user and https://github.com/qemu/qemu/compare/master...mvo5:support-openat2-clean?expand=1 is my current WIP branch. With that I can do a cross arch build again and the test works:

$ sudo pytest -s -vv './test/test_build.py::test_image_boots[quay.io/centos-bootc/centos-bootc:stream9,raw,arm64]'
...
CentOS Stream 9
Kernel 5.14.0-503.el9.aarch64 on an aarch64

enp0s1: 10.0.2.15 fe80::c432:7d1f:4347:2a52
localhost login: 
...
PASSED

[edit: also sent to the qemu-devel list now link]

mvo5 added a commit to mvo5/qemu that referenced this issue Aug 28, 2024
This commit adds support for the `openat2()` syscall in the
`linux-user` userspace emulator.

It is implemented by extracting a new helper `maybe_do_fake_open()`
out of the existing `do_guest_openat()` and sharing that with the
new `do_guest_openat2()`. Unfortunately we cannot just make
do_guest_openat2() a superset of do_guest_openat() because the
openat2() syscall is stricter with the argument checking and
will return an error for invalid flags or mode combinations (which
open()/openat() will ignore).

Note that in this commit using openat2() for a "faked" file in
/proc will ignore the "resolve" flags. This is not great but it
seems similar to the existing behavior when openat() is called
with a dirfd to "/proc". Here too the fake file lookup will
not catch the special file. Alternatively we could simply
fail with `-TARGET_ENOSYS` (or similar) if `resolve` flags
are passed and we found something that looks like a file that
needs faking.

Signed-off-by: Michael Vogt <[email protected]>

Buglink: osbuild/bootc-image-builder#619
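The difference in argument checking that the commit message mentions can be illustrated with a small Python model. This is only a simplified sketch under stated assumptions, not QEMU's or the kernel's code: `KNOWN_FLAGS` is a stand-in for the kernel's mask of valid open flags, built from whatever `O_*` constants the local `os` module exposes, and the helper names `lenient_open_args` and `strict_open_args` are hypothetical. Special cases such as `O_PATH` and the `resolve` flags are left out.

```python
import errno
import os

S_IALLUGO = 0o7777  # permission bits plus setuid/setgid/sticky

# Stand-in for the kernel's mask of valid open flags (assumption: approximated
# by the O_* constants this Python build knows about).
_FLAG_NAMES = (
    "O_RDONLY", "O_WRONLY", "O_RDWR", "O_CREAT", "O_EXCL", "O_NOCTTY",
    "O_TRUNC", "O_APPEND", "O_NONBLOCK", "O_DSYNC", "O_SYNC", "O_DIRECTORY",
    "O_NOFOLLOW", "O_CLOEXEC", "O_PATH", "O_TMPFILE", "O_NOATIME", "O_DIRECT",
)
KNOWN_FLAGS = 0
for _name in _FLAG_NAMES:
    KNOWN_FLAGS |= getattr(os, _name, 0)

_O_TMPFILE = getattr(os, "O_TMPFILE", 0)


def _will_create(flags):
    """True if the flags ask for a new file, so a mode argument is meaningful."""
    if flags & os.O_CREAT:
        return True
    return bool(_O_TMPFILE) and (flags & _O_TMPFILE) == _O_TMPFILE


def lenient_open_args(flags, mode):
    """open()/openat() style: silently drop unknown flags and unusable mode bits."""
    clean_mode = (mode & S_IALLUGO) if _will_create(flags) else 0
    return flags & KNOWN_FLAGS, clean_mode


def strict_open_args(flags, mode):
    """openat2() style: reject what the lenient variant would have ignored."""
    if flags & ~KNOWN_FLAGS:
        raise OSError(errno.EINVAL, "unknown open flags")
    if _will_create(flags):
        if mode & ~S_IALLUGO:
            raise OSError(errno.EINVAL, "invalid mode bits")
    elif mode != 0:
        raise OSError(errno.EINVAL, "mode given without O_CREAT/O_TMPFILE")
    return flags, mode


BOGUS_FLAG = 1 << 30  # a bit assumed to be unassigned, for illustration only

print(lenient_open_args(os.O_RDONLY | BOGUS_FLAG, 0o644))
try:
    strict_open_args(os.O_RDONLY | BOGUS_FLAG, 0)
except OSError as exc:
    print("strict variant rejected the call:", errno.errorcode[exc.errno])
```

In this model, the same arguments that the lenient path quietly normalises make the strict path fail with `EINVAL`, which is why one handler cannot simply be a superset of the other.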
mvo5 added a commit to mvo5/qemu that referenced this issue Aug 28, 2024
This commit adds support for the `openat2()` syscall in the
`linux-user` userspace emulator.

It is implemented by extracting a new helper `maybe_do_fake_open()`
out of the existing `do_guest_openat()` and sharing that with the
new `do_guest_openat2()`. Unfortunately we cannot just make
do_guest_openat2() a superset of do_guest_openat() because the
openat2() syscall is stricter with the argument checking and
will return an error for invalid flags or mode combinations (which
open()/openat() will ignore).

Note that in this commit using openat2() for a "faked" file in
/proc will ignore the "resolve" flags. This is not great but it
seems similar to the existing behavior when openat() is called
with a dirfd to "/proc". Here too the fake file lookup may
not catch the special file because "realpath()" is used to
determine if the path is in /proc. Alternatively to ignoring
we could simply fail with `-TARGET_ENOSYS` (or similar) if
`resolve` flags are passed and we found something that looks
like a file in /proc that needs faking.

Signed-off-by: Michael Vogt <[email protected]>

Buglink: osbuild/bootc-image-builder#619
mvo5 added a commit to mvo5/qemu that referenced this issue Aug 29, 2024
This commit adds support for the `openat2()` syscall in the
`linux-user` userspace emulator.

It is implemented by extracting a new helper `maybe_do_fake_open()`
out of the existing `do_guest_openat()` and sharing that with the
new `do_guest_openat2()`. Unfortunately we cannot just make
do_guest_openat2() a superset of do_guest_openat() because the
openat2() syscall is stricter with the argument checking and
will return an error for invalid flags or mode combinations (which
open()/openat() will ignore).

Note that in this commit using openat2() for a "faked" file in
/proc will ignore the "resolve" flags. This is not great but it
seems similar to the existing behavior when openat() is called
with a dirfd to "/proc". Here too the fake file lookup may
not catch the special file because "realpath()" is used to
determine if the path is in /proc. Alternatively to ignoring
we could simply fail with `-TARGET_ENOSYS` (or similar) if
`resolve` flags are passed and we found something that looks
like a file in /proc that needs faking.

Signed-off-by: Michael Vogt <[email protected]>
Buglink: osbuild/bootc-image-builder#619

Thanks to Richard Henderson and Florian Schueller for their
feedback.
@henrywang
Member Author

henrywang commented Aug 30, 2024

The fedora-bootc:41 cross-arch build failed with this error. It should have the same cause. Thanks!

org.osbuild.ostree.deploy.container: c3b6f76397267352af8b8372dfced2464874109e7a5f7897f5a2fe8118686fc8 {
  "osname": "default",
  "kernel_opts": [
    "rw",
    "console=tty0",
    "console=ttyS0"
  ],
  "target_imgref": "ostree-unverified-registry:quay.io/bootc-test/*****:5jij",
  "rootfs": {
    "label": "root"
  },
  "mounts": [
    "/boot",
    "/boot/efi"
  ]
}
ostree container image deploy --imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]505e426390213d8f8b2cf4578e86c8f413b242a241bb3dfa6103de76a13bd820 --stateroot=default --target-imgref=ostree-unverified-registry:quay.io/bootc-test/*****:5jij --karg=rw --karg=console=tty0 --karg=console=ttyS0 --karg=root=LABEL=root --sysroot=/run/osbuild/tree
error: Performing deployment: Creating importer: Function not implemented (os error 38)
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 72, in <module>
    r = main(stage_args["tree"],
             stage_args["inputs"],
             stage_args["options"])
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 67, in main
    ostree_container_deploy(tree, inputs, osname, target_imgref, kopts)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 41, in ostree_container_deploy
    ostree.cli("container", "image", "deploy",
    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               *extra_args, sysroot=tree, *kargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/osbuild/lib/osbuild/util/ostree.py", line 205, in cli
    return subprocess.run(["ostree"] + args,
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
                          encoding="utf8",
                          ^^^^^^^^^^^^^^^^
                          stdout=subprocess.PIPE,
                          ^^^^^^^^^^^^^^^^^^^^^^^
                          input=_input,
                          ^^^^^^^^^^^^^
                          check=True)
                          ^^^^^^^^^^^
  File "/usr/lib64/python3.13/subprocess.py", line 577, in run
    raise CalledProcessError(retcode, process.args,
                             output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ostree', 'container', 'image', 'deploy', '--imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]505e426390213d8f8b2cf4578e86c8f413b242a241bb3dfa6103de76a13bd820', '--stateroot=default', '--target-imgref=ostree-unverified-registry:quay.io/bootc-test/*****:5jij', '--karg=rw', '--karg=console=tty0', '--karg=console=ttyS0', '--karg=root=LABEL=root', '--sysroot=/run/osbuild/tree']' returned non-zero exit status 1.

mvo5 added a commit to mvo5/qemu that referenced this issue Aug 30, 2024
This commit adds support for the `openat2()` syscall in the
`linux-user` userspace emulator.

It is implemented by extracting a new helper `maybe_do_fake_open()`
out of the existing `do_guest_openat()` and sharing that with the
new `do_guest_openat2()`. Unfortunately we cannot just make
do_guest_openat2() a superset of do_guest_openat() because the
openat2() syscall is stricter with the argument checking and
will return an error for invalid flags or mode combinations (which
open()/openat() will ignore).

The implementation is similar to SYSCALL_DEFINE(openat2), i.e.
a new `copy_struct_from_user()` is used that works the same
as the kernel's version to support backwards-compatibility
for struct syscall arguments.

Instead of including openat2.h we create a copy of `open_how`
as `open_how_ver0` to ensure that if the structure grows we
can log a LOG_UNIMP warning.

Note that in this commit using openat2() for a "faked" file in
/proc will ignore the "resolve" flags. This is not great but it
seems similar to the existing behavior when openat() is called
with a dirfd to "/proc". Here too the fake file lookup may
not catch the special file because "realpath()" is used to
determine if the path is in /proc. Alternatively to ignoring
we could simply fail with `-TARGET_ENOSYS` (or similar) if
`resolve` flags are passed and we found something that looks
like a file in /proc that needs faking.

Signed-off-by: Michael Vogt <[email protected]>
Buglink: osbuild/bootc-image-builder#619
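The backwards-compatibility rule behind `copy_struct_from_user()` mentioned in the commit message can be sketched in a few lines of Python. This models the semantics as I understand them from the kernel helper (zero-fill when the caller's struct is shorter, accept a longer struct only if the extra bytes are all zero, otherwise fail with `E2BIG`). It operates on plain `bytes` and is not the QEMU implementation. The `OPEN_HOW_VER0` layout (three 64-bit fields: flags, mode, resolve) is an assumption about what the commit calls `open_how_ver0`.

```python
import errno
import struct

# Assumed layout of the first version of struct open_how: flags, mode, resolve.
OPEN_HOW_VER0 = struct.Struct("=QQQ")


def copy_struct_from_user(ksize, user_bytes):
    """Return exactly ksize bytes built from a caller-supplied struct image.

    - shorter input: the missing tail is zero-filled (older caller, newer callee)
    - longer input: allowed only if every extra byte is zero, else E2BIG
      (newer caller using fields the callee does not know about)
    """
    usize = len(user_bytes)
    size = min(ksize, usize)
    if usize > ksize and any(user_bytes[size:]):
        raise OSError(errno.E2BIG, "unknown trailing fields are in use")
    return bytes(user_bytes[:size]) + bytes(ksize - size)


ksize = OPEN_HOW_VER0.size  # 24 bytes

# An older caller that only knows about the first two fields.
short = struct.pack("=QQ", 0o2, 0)
print(OPEN_HOW_VER0.unpack(copy_struct_from_user(ksize, short)))  # (2, 0, 0)

# A newer caller with one extra, unused (zero) field is still accepted.
longer_zero = OPEN_HOW_VER0.pack(0, 0, 0) + bytes(8)
print(len(copy_struct_from_user(ksize, longer_zero)))  # 24

# A newer caller that actually uses the extra field is rejected.
longer_used = OPEN_HOW_VER0.pack(0, 0, 0) + struct.pack("=Q", 1)
try:
    copy_struct_from_user(ksize, longer_used)
except OSError as exc:
    print(errno.errorcode[exc.errno])  # E2BIG
```

The point of the rule is that a struct can grow new fields over time without breaking either older callers or older implementations, as long as unused fields stay zero.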
@chunfuwen

This is easily reproduced with a cross-arch build on Fedora 40 using the command below:

sudo podman run --rm -it --privileged --pull=newer --security-opt label=type:unconfined_t -v /var/lib/libvirt/images/output:/output -v /var/lib/libvirt/images/config.json:/config.json   -v /var/lib/libvirt/images/auth.json:/run/containers/0/auth.json  quay.io/centos-bootc/bootc-image-builder:latest  --type qcow2 --tls-verify=true  --config /config.json  --target-arch=aarch64  quay.io/centos-bootc/centos-bootc:stream10
...
ostree container image deploy --imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]f03bba6c34db7fe7454371f32230f12349358da7872bbb461ad72da5048cb01d --stateroot=default --target-imgref=ostree-unverified-registry:quay.io/centos-bootc/centos-bootc:stream10 --karg=rw --karg=console=tty0 --karg=console=ttyS0 --karg=root=LABEL=root --sysroot=/run/osbuild/tree
error: Performing deployment: Creating importer: Function not implemented (os error 38)
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 72, in <module>
    r = main(stage_args["tree"],
        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 67, in main
    ostree_container_deploy(tree, inputs, osname, target_imgref, kopts)
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 41, in ostree_container_deploy
    ostree.cli("container", "image", "deploy",
  File "/run/osbuild/lib/osbuild/util/ostree.py", line 205, in cli
    return subprocess.run(["ostree"] + args,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ostree', 'container', 'image', 'deploy', '--imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]f03bba6c34db7fe7454371f32230f12349358da7872bbb461ad72da5048cb01d', '--stateroot=default', '--target-imgref=ostree-unverified-registry:quay.io/centos-bootc/centos-bootc:stream10', '--karg=rw', '--karg=console=tty0', '--karg=console=ttyS0', '--karg=root=LABEL=root', '--sysroot=/run/osbuild/tree']' returned non-zero exit status 1.

⏱  Duration: 68s
manifest - failed
Failed
2024/08/31 09:29:06 error: cannot run osbuild: running osbuild failed: exit status 1

@mvo5
Collaborator

mvo5 commented Sep 9, 2024

Thanks. I'm 75-80% confident that https://lists.nongnu.org/archive/html/qemu-devel/2024-09/msg00976.html will fix this. To be sure, I would have to run it again with "QEMU_LOG": "unimp". Unfortunately we cannot make that the default: it also complains about some "harmless" unimplemented ioctls/syscalls, and that is enough to taint the output.
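A possible alternative to a full rerun with QEMU_LOG enabled is to probe the syscall directly from inside the emulated environment. The following is a minimal sketch, assuming a Linux guest with Python available and that `openat2` has syscall number 437 (the number resolved for aarch64 earlier in this thread). The function name `probe_openat2` is hypothetical. It returns 0 if the call succeeds and the errno value otherwise; under an emulator that lacks the syscall, the expected result would be `ENOSYS`, matching the "os error 38" in the logs.

```python
import ctypes
import errno
import os

SYS_OPENAT2 = 437  # assumption: number taken from the resolver output in this thread
AT_FDCWD = -100    # Linux: resolve relative paths against the current directory


class OpenHow(ctypes.Structure):
    """First version of struct open_how: three 64-bit fields."""
    _fields_ = [
        ("flags", ctypes.c_uint64),
        ("mode", ctypes.c_uint64),
        ("resolve", ctypes.c_uint64),
    ]


def probe_openat2(path=b"/"):
    """Try openat2() on path; return 0 on success, else the errno value."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    how = OpenHow(flags=os.O_RDONLY | os.O_CLOEXEC, mode=0, resolve=0)
    fd = libc.syscall(
        ctypes.c_long(SYS_OPENAT2),
        ctypes.c_int(AT_FDCWD),
        ctypes.c_char_p(path),
        ctypes.byref(how),
        ctypes.c_size_t(ctypes.sizeof(how)),
    )
    if fd >= 0:
        os.close(fd)
        return 0
    return ctypes.get_errno()


rc = probe_openat2()
if rc == 0:
    print("openat2: available")
else:
    print("openat2: failed with", errno.errorcode.get(rc, rc))
```

A result other than `ENOSYS` does not prove the original build will pass, since other syscalls could still be missing; it only answers the question for `openat2`.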

@cdrage
Contributor

cdrage commented Sep 9, 2024

Getting the same error here, even when not doing a cross-arch build: #641

@cgwalters unsure if this is also related?

Would it be 2 fixes: one for fixing bootc in the centos image, the other the qemu update?

EDIT: You are right. Cross-arch is the only part that is failing. Building for the arch native to the system works fine. Only when it's building amd64 does it run into the qemu issue.
