fix: update binfmt version to 0.8 to resolve bugs #931

Closed
wants to merge 1 commit into from

@jessebye jessebye commented Sep 27, 2021

Update binfmt version to resolve bugs when using docker buildx for cross-platform builds. See for example: https://gitlab.alpinelinux.org/alpine/aports/-/issues/12406

@keithduncan
Contributor

Thanks for opening this pull request @jessebye 😄

I think I’ve run into similar issues trying to adopt multi-arch builds for our buildkite/agent images in buildkite/agent#1503

I have it working for linux/arm64 and linux/amd64 platforms in buildkite/agent#1502 but when I add linux/arm/v7 I get errors like this (from the alpine arm/v7 build):

Dockerfile:6
--------------------
   5 |
   6 | >>> RUN apk add --no-cache \
   7 | >>>       bash \
   8 | >>>       curl \
   9 | >>>       docker-cli \
  10 | >>>       docker-compose \
  11 | >>>       git \
  12 | >>>       jq \
  13 | >>>       libc6-compat \
  14 | >>>       openssh-client \
  15 | >>>       perl \
  16 | >>>       py-pip \
  17 | >>>       rsync \
  18 | >>>       run-parts \
  19 | >>>       su-exec \
  20 | >>>       tini \
  21 | >>>       tzdata
  22 |
--------------------
error: failed to solve: rpc error: code = Unknown desc = process "/dev/.buildkit_qemu_emulator /bin/sh -c apk add --no-cache       bash       curl       docker-cli       docker-compose       git       jq       libc6-compat       openssh-client       perl       py-pip       rsync       run-parts       su-exec       tini       tzdata" did not complete successfully: exit code: 1
🚨 Error: The command exited with status 1
user command error: exit status 1

Could you shed some light on what this pull request changes that resolves the cross-platform bugs you’re seeing? I’m also wondering if it’s possible to fix it with the existing qemu script we invoke just above, either with modified args or a newer version?

@jessebye
Author

jessebye commented Sep 28, 2021

@keithduncan Ah, yeah that looks just like the errors we're seeing too!

For example, here's one we were seeing:

Dockerfile:3
--------------------
   2 |
   3 | >>> RUN apk --update --no-cache add \
   4 | >>>         curl \
   5 | >>>     && install -d -o node -g node -p /app \
   6 | >>>     && curl -Ss https://raw.githubusercontent.com/eficode/wait-for/master/wait-for > /usr/local/bin/wait-for \
   7 | >>>     && chmod +x /usr/local/bin/wait-for
   8 |
--------------------
error: failed to solve: rpc error: code = Unknown desc = process "/dev/.buildkit_qemu_emulator /bin/sh -c apk --update --no-cache add         curl     && install -d -o node -g node -p /app     && curl -Ss https://raw.githubusercontent.com/eficode/wait-for/master/wait-for > /usr/local/bin/wait-for     && chmod +x /usr/local/bin/wait-for" did not complete successfully: exit code: 77
🚨 Error: The command exited with status 1
user command error: exit status 1

Looks pretty much identical.

This Alpine issue comment indicates that the error happens when using Docker's built-in, outdated qemu version.

By running docker run --rm --privileged linuxkit/binfmt:v0.8 (in conjunction with using the --driver docker-container flag with buildx create), a newer qemu image is pulled and used for the build.

HOWEVER, after testing the change in my PR, I found it wasn't actually effective. Running the docker run command once on each AMI doesn't resolve the problem.

Instead, I now have a step in our build script before the docker buildx create command that runs docker run --rm --privileged linuxkit/binfmt:v0.8. That fixes the error.
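
For reference, the workaround described above could be sketched as a build-script step like the following. The image tag and the --driver docker-container flag come from this thread; the run helper, the DRY_RUN variable, the builder name "multiarch", and the platform list in the usage comment are assumptions for illustration only.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Register the qemu handlers on every build, then create a buildx builder.
setup_multiarch_builder() {
    # Image and tag as mentioned in this thread.
    run docker run --rm --privileged linuxkit/binfmt:v0.8
    # "multiarch" is a placeholder builder name, not from the thread.
    run docker buildx create --name multiarch --driver docker-container --use
}

# Example usage (the platform list is an assumption):
#   setup_multiarch_builder
#   docker buildx build --platform linux/amd64,linux/arm64,linux/arm/v7 .
```

Setting DRY_RUN=1 before calling setup_multiarch_builder prints the two commands instead of running them, which is handy for checking the order of the steps.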

I'd love to find a better solution that doesn't require running that every time though.
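
One possible way to avoid re-running the container on every build might be to check whether the kernel already has a qemu handler registered, and only run the binfmt image when it is missing. This is an untested sketch: it assumes the handlers show up as files named qemu-<arch> under /proc/sys/fs/binfmt_misc (the exact names depend on the binfmt image), and both function names are made up. An existing registration may still point at an old qemu, so this check does not prove the handler is current.

```shell
#!/bin/sh
set -eu

# Return 0 if a handler file for the given arch exists in the binfmt_misc dir.
# $1: arch suffix (e.g. aarch64); $2: binfmt_misc directory (optional).
# The qemu-<arch> naming is an assumption about how the image registers handlers.
qemu_handler_registered() {
    arch="$1"
    dir="${2:-/proc/sys/fs/binfmt_misc}"
    [ -e "$dir/qemu-$arch" ]
}

# Run the binfmt image from this thread only when the handler is missing.
ensure_qemu_handler() {
    if qemu_handler_registered "$1" "${2:-/proc/sys/fs/binfmt_misc}"; then
        echo "qemu-$1 already registered"
    else
        echo "registering qemu handlers"
        docker run --rm --privileged linuxkit/binfmt:v0.8
    fi
}

# Example usage:
#   ensure_qemu_handler aarch64
```

The optional second argument exists only so the check can be exercised against a scratch directory instead of the real /proc path.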

@jessebye
Author

jessebye commented Sep 28, 2021

Here's one other issue I've found on Debian images that include libc-bin in their dependencies:

#13 95.56 Processing triggers for libc-bin (2.31-13) ...
#13 95.75 qemu: uncaught target signal 11 (Segmentation fault) - core dumped
#13 95.76 Segmentation fault
#13 95.82 qemu: uncaught target signal 11 (Segmentation fault) - core dumped
#13 95.82 Segmentation fault
#13 95.82 dpkg: error processing package libc-bin (--configure):
#13 95.82  installed libc-bin package post-installation script subprocess returned error exit status 139
#13 95.86 Errors were encountered while processing:
#13 95.86  libc-bin
#13 96.15 E: Sub-process /usr/bin/dpkg returned an error code (1)
#13 ERROR: process "/bin/sh -c apt-get update     && apt-get install -y curl     lp-solve" did not complete successfully: exit code: 100

#16 [linux/arm64 lpsolve 2/4] RUN apt-get update     && apt-get install -y     wget     unzip     lp-solve
#16 sha256:d4034f9afcf368a2486901b2641f88c198a78b3e63ce87755c48ee36f16154af
#16 CANCELED
------
> [linux/arm64 stage-1 2/8] RUN apt-get update     && apt-get install -y curl     lp-solve:
------
Dockerfile:19
--------------------
  18 |
  19 | >>> RUN apt-get update \
  20 | >>>     && apt-get install -y curl \
  21 | >>>     lp-solve
  22 |
--------------------
error: failed to solve: rpc error: code = Unknown desc = process "/bin/sh -c apt-get update     && apt-get install -y curl     lp-solve" did not complete successfully: exit code: 100
🚨 Error: The command exited with status 1
user command error: exit status 1

I found this thread, which seems to mention old qemu versions being part of the problem. It looks like the linuxkit/binfmt image still includes qemu 4.x, so that could be part of the trouble.

I tried an alternate binfmt image but didn't see any improvement.

@jessebye jessebye closed this Oct 18, 2021