error syncing 'minio/...': Tar file extraction failed for file index: 2, with: EOF when updating minio #2229

Closed
pschichtel opened this issue Jul 21, 2024 · 21 comments · Fixed by minio/minio#20270 or minio/minio#20282
Labels: bug fix, bug (Something isn't working), community


@pschichtel
Contributor

pschichtel commented Jul 21, 2024

This is a follow-up to minio/minio#19510.

While the rolling-upgrade problem that partially broke minio has been solved in minio by properly waiting for all nodes to have a consistent version, the actual root cause still seems to exist, even with the new operator 6.0.0.

Expected Behavior

  1. operator downloads the new minio binary
  2. operator replaces the binary in all minio pods
  3. operator simultaneously restarts all minio processes
  4. operator triggers a rolling restart of the StatefulSet

Current Behavior

What seems to be happening according to the discussion with @harshavardhana:

  1. operator downloads the new minio binary, but fails to unpack the tar archive.
  2. operator replaces the binary in all minio pods (skipped)
  3. operator simultaneously restarts all minio processes (skipped)
  4. operator triggers a rolling restart of the StatefulSet

Possible Solution

no idea

Steps to Reproduce (for bugs)

  1. Set up k0s
  2. Set up the operator
  3. Set up a minio cluster of any size via the operator
  4. Change the minio version of the tenant

Context

I have two affected setups:

  1. k0s 1.30, single node (k8s + minio), archlinux, containerd/runc
  2. k0s 1.29, 5 nodes (k8s + minio), debian bookworm, containerd/runc

This issue has been around for a while (part of it can be seen in the issue reference at the top and the issues that are referenced by that) across a bunch of k8s/k0s, minio and operator 5.x versions.

My setup is a single node k0s

Your Environment

  • Version used (minio-operator): 6.0.0
  • Environment name and version (e.g. kubernetes v1.17.2): kubernetes 1.30.1
  • Server type and version:
  • Operating System and version (uname -a): Linux server 6.9.10-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 18 Jul 2024 18:06:13 +0000 x86_64 GNU/Linux
  • Link to your deployment file:
@pschichtel
Contributor Author

I'm currently extracting the files from the operator container. These are their md5sums:

$ md5sum /tmp/webhook/v1/update/*
59f91db02665c67c65f126437deab88c  /tmp/webhook/v1/update/207e26731498cf8563a53387d874c6e55cda1834beddeb3fdf1be5c1834ea9bb.tar.gz
fbb6b01b6845bee9d8b8d8e24f557930  /tmp/webhook/v1/update/image.tar
201e622522ea221967b8e63165fe1794  /tmp/webhook/v1/update/minio

@harshavardhana
Member

Can you share the size of the files @pschichtel ?

@pschichtel
Contributor Author

$ ls -l /tmp/webhook/v1/update/
total 192760
-rwxr-xr-x 1 1000 1000  37036866 Jul 20 14:41 207e26731498cf8563a53387d874c6e55cda1834beddeb3fdf1be5c1834ea9bb.tar.gz
-rwxr-xr-x 1 1000 1000  56932864 Jul 20 14:41 image.tar
-rwxr-xr-x 1 1000 1000 103407768 Jul 20 14:41 minio

@pschichtel
Contributor Author

btw this was the upgrade from RELEASE.2024-07-04T14-25-45Z to RELEASE.2024-07-16T23-46-41Z

This is what the file command on the host has to say about the files:

# file .../tmp/webhook/v1/update/207e26731498cf8563a53387d874c6e55cda1834beddeb3fdf1be5c1834ea9bb.tar.gz 
.../tmp/webhook/v1/update/207e26731498cf8563a53387d874c6e55cda1834beddeb3fdf1be5c1834ea9bb.tar.gz: gzip compressed data, original size modulo 2^32 103410688
# file .../tmp/webhook/v1/update/image.tar 
.../tmp/webhook/v1/update/image.tar: POSIX tar archive
# file .../tmp/webhook/v1/update/minio 
.../tmp/webhook/v1/update/minio: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=8_-5P-JG3l3M51Pz8zGN/jabXWsuYvxbAw4s24AsS/YwA7QU_twJjoYuIUzNoz/fIqoAz_adNHdvW-en9RN, stripped

Should I attach the actual files?

@harshavardhana
Member

You can, thanks.

@pschichtel
Contributor Author

It was too large for github, so: https://cubyte.cloud/s/CMyXTMKFBirSbLY (the share will expire in 2 days)

@ramondeklein
Contributor

Mirror here (will be available for 6 weeks): https://ramon.transferxl.com/details/08jXskbVkwJw02/

@cesnietor cesnietor added the bug Something isn't working label Jul 22, 2024
@cesnietor cesnietor self-assigned this Jul 22, 2024
@cesnietor
Contributor

we'll try to reproduce and share our findings

@pschichtel
Contributor Author

pschichtel commented Jul 23, 2024

I've reproduced this again on my multi-node production cluster. Uploading its file as well doesn't seem to provide much value. As this might be related to k0s: Would it be helpful to provide a minimal k0sctl configuration to quickly setup a single-node cluster for testing?

@harshavardhana
Member

> I've reproduced this again on my multi-node production cluster. Uploading its file as well doesn't seem to provide much value. As this might be related to k0s: Would it be helpful to provide a minimal k0sctl configuration to quickly setup a single-node cluster for testing?

yes can you provide that?

@pschichtel
Contributor Author

@harshavardhana https://gist.github.com/pschichtel/c1f45409a797d71eeb588e434947721b

That's a bash script that creates a minimal setup with the operator and one tenant. The patch command printed by the script at the end will trigger the problem. The script embeds all YAML files for customizations, so you can override images however you need.

@pschichtel
Contributor Author

The setup doesn't use k0sctl; it executes k0s directly with the same options k0sctl would render into the systemd service. It's derived from one of my own setups.

@ramondeklein
Contributor

ramondeklein commented Aug 15, 2024

I encountered the same issue when upgrading from 2024-07-29T22-14-52Z to 2024-08-03T04-33-23Z. I could also reproduce it when running the operator on my local development machine, where it saved the tar file to /tmp/webhook/v1/update/image.tar. There are two issues:

  • Incorrect permissions when creating this folder. This probably goes unnoticed on most systems, where MinIO is running as root; when it's not, opening the file fails due to bad permissions. I guess the mode was meant to be 0o777 instead (see Use proper permission for the update directory #2277).
  • It seems our latest distributions are missing the minio.sha256sum and minio.minisig files from the image.
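The thread does not show the offending line, so the following is only an assumption about one common way a directory mode ends up wrong: writing the decimal literal 777 where the octal 0o777 was intended. The helper name `describe_dir_mode` is made up for this illustration.

```python
import stat


def describe_dir_mode(mode: int) -> str:
    """Render permission bits the way `ls -l` would show them for a directory."""
    return stat.filemode(stat.S_IFDIR | mode)


INTENDED = 0o777  # octal literal: rwx for owner, group and others
MISTYPED = 777    # decimal literal (hypothetical slip): same digits, different bits

if __name__ == "__main__":
    for label, mode in (("intended", INTENDED), ("mistyped", MISTYPED)):
        owner_can_write = bool(mode & stat.S_IWUSR)
        owner_can_enter = bool(mode & stat.S_IXUSR)
        print(label, oct(mode), describe_dir_mode(mode),
              "owner write:", owner_can_write, "owner search:", owner_can_enter)
```

With the decimal value the owner has neither write nor search permission on the directory, so a non-root process cannot create or open files inside it, whereas root bypasses these checks. That would match the observation that the problem only shows up when the process is not running as root.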

Some more details

The image.tar has the following contents:

❯ tar tvf image.tar
-rw-r--r-- 0/0            8193 1970-01-01 01:00 sha256:07520491faf15698a4853bcd6dd10fd05935f64ee5ed10a9fcfdfeefc6ae2d3a
-rw-r--r-- 0/0         7261669 1970-01-01 01:00 5f328c14e09dcf260227fb3a7eb4d0ef531ad567bd2868d0c5cef9f0190d1f5f.tar.gz
-rw-r--r-- 0/0          126709 1970-01-01 01:00 1f53073e01a069a457b592eaa38eddd12a5ff8df351a27085d5875b356cbe3c5.tar.gz
-rw-r--r-- 0/0        36999202 1970-01-01 01:00 44cdc501a5a3f3c2ee4b03b01f4ad06e472f766d5e10ca1b95ce16a2058df041.tar.gz
-rw-r--r-- 0/0         9920474 1970-01-01 01:00 a5140749f35fe0b43df9a368c47f422a879e4bc2afddf0f3a362ade19e3fb5ad.tar.gz
-rw-r--r-- 0/0         2368479 1970-01-01 01:00 5977bc78f8400626b02d7c8ee659e9f22777abfc20ba480612bb76046e2eadf1.tar.gz
-rw-r--r-- 0/0          188999 1970-01-01 01:00 a7c0c330019377d8ba92526bec31fa1e23221e24b25570ad812193dcce8a90af.tar.gz
-rw-r--r-- 0/0           11877 1970-01-01 01:00 4086fb25cb3aecb8c62916a2063233d68d8f66df8372dde579338d24347c2cf1.tar.gz
-rw-r--r-- 0/0             499 1970-01-01 01:00 9ebf38c2ce8c2b2b3698f0e8d7ba7356835275ded5e253d8ca9c7080be05e5a5.tar.gz
-rw-r--r-- 0/0             745 1970-01-01 01:00 manifest.json

The manifest contains (formatted):

[
  {
    "Config": "sha256:07520491faf15698a4853bcd6dd10fd05935f64ee5ed10a9fcfdfeefc6ae2d3a",
    "RepoTags": [
      "minio/minio:RELEASE.2024-08-03T04-33-23Z"
    ],
    "Layers": [
      "5f328c14e09dcf260227fb3a7eb4d0ef531ad567bd2868d0c5cef9f0190d1f5f.tar.gz",
      "1f53073e01a069a457b592eaa38eddd12a5ff8df351a27085d5875b356cbe3c5.tar.gz",
      "44cdc501a5a3f3c2ee4b03b01f4ad06e472f766d5e10ca1b95ce16a2058df041.tar.gz",
      "a5140749f35fe0b43df9a368c47f422a879e4bc2afddf0f3a362ade19e3fb5ad.tar.gz",
      "5977bc78f8400626b02d7c8ee659e9f22777abfc20ba480612bb76046e2eadf1.tar.gz",
      "a7c0c330019377d8ba92526bec31fa1e23221e24b25570ad812193dcce8a90af.tar.gz",
      "4086fb25cb3aecb8c62916a2063233d68d8f66df8372dde579338d24347c2cf1.tar.gz",
      "9ebf38c2ce8c2b2b3698f0e8d7ba7356835275ded5e253d8ca9c7080be05e5a5.tar.gz"
    ]
  }
]
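For readers who want to inspect such a tarball themselves, here is a minimal Python sketch that reads the ordered layer list out of manifest.json. It is not the operator's code; the names `build_tar` and `read_layers` and the tiny in-memory demo archive are invented for illustration.

```python
import io
import json
import tarfile
from typing import Dict, List


def build_tar(files: Dict[str, bytes]) -> bytes:
    """Create an uncompressed tar archive in memory (demo helper only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_layers(image_tar: bytes) -> List[str]:
    """Return the layer archive names from manifest.json, in manifest order."""
    with tarfile.open(fileobj=io.BytesIO(image_tar), mode="r") as tar:
        member = tar.extractfile("manifest.json")  # raises KeyError if absent
        if member is None:
            raise ValueError("manifest.json is not a regular file")
        manifest = json.load(member)
    return manifest[0]["Layers"]


if __name__ == "__main__":
    # Hypothetical manifest with two made-up layer names.
    manifest = [{"Config": "sha256:cfg", "RepoTags": ["example:tag"],
                 "Layers": ["base.tar.gz", "minio.tar.gz"]}]
    demo = build_tar({"manifest.json": json.dumps(manifest).encode()})
    print(read_layers(demo))
```

To run it against a real file, pass the bytes of image.tar to `read_layers`.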

It did fetch the 44cdc501a5a3f3c2ee4b03b01f4ad06e472f766d5e10ca1b95ce16a2058df041.tar.gz file from the tar-ball and it started looking for the following three files (source):

  • latest assets: opt/bin/minio, opt/bin/minio.sha256sum, opt/bin/minio.minisig, or
  • legacy assets: usr/bin/minio, usr/bin/minio.sha256sum, usr/bin/minio.minisig

This is the contents of this tar-file:

❯ tar tvzf 44cdc501a5a3f3c2ee4b03b01f4ad06e472f766d5e10ca1b95ce16a2058df041.tar.gz
drwxr-xr-x 0/0               0 2024-07-18 19:25 usr/
dr-xr-xr-x 0/0               0 2024-08-03 10:49 usr/bin/
-rwxr-xr-x 0/0       103174296 2024-08-03 10:49 usr/bin/minio

This tar-ball only contains usr/bin/minio; it contains neither usr/bin/minio.sha256sum nor usr/bin/minio.minisig. The other layers seem to be the base image and the /usr/bin/mc layer. It looks like the SHA and signature files are missing, so I guess this is actually a bug in the CI/CD pipeline. I'm also confused why our latest image seems to use legacy assets instead of latest assets.
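The check described here can be sketched in a few lines of Python. This is an illustration under assumptions, not the operator's implementation: the two asset tuples are copied from the lists in this comment, while the function name `missing_assets` is made up.

```python
from typing import Iterable, List, Tuple

LATEST_ASSETS = ("opt/bin/minio", "opt/bin/minio.sha256sum", "opt/bin/minio.minisig")
LEGACY_ASSETS = ("usr/bin/minio", "usr/bin/minio.sha256sum", "usr/bin/minio.minisig")


def missing_assets(layer_entries: Iterable[str], wanted: Tuple[str, ...]) -> List[str]:
    """Return the wanted paths that do not appear among the layer's entries."""
    # Directory entries are listed with a trailing slash; strip it for comparison.
    present = {entry.rstrip("/") for entry in layer_entries}
    return [name for name in wanted if name not in present]


if __name__ == "__main__":
    # Entries as listed by `tar tvzf` for the 44cdc501... layer.
    layer = ["usr/", "usr/bin/", "usr/bin/minio"]
    print("latest missing:", missing_assets(layer, LATEST_ASSETS))
    print("legacy missing:", missing_assets(layer, LEGACY_ASSETS))
```

For the layer shown, neither set is complete: all three latest assets are absent, and the legacy set lacks the checksum and signature files.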

@harshavardhana
Member

Yeah looks like we removed them when we moved to micro UBI8 base image.


# Install curl and minisign
RUN apk add -U --no-cache ca-certificates && \
    apk add -U --no-cache curl && \
    go install aead.dev/minisign/cmd/[email protected]

# Download minio binary and signature file
RUN curl -s -q https://dl.min.io/server/minio/release/linux-${TARGETARCH}/archive/minio.${RELEASE} -o /go/bin/minio && \
    curl -s -q https://dl.min.io/server/minio/release/linux-${TARGETARCH}/archive/minio.${RELEASE}.minisig -o /go/bin/minio.minisig && \
    chmod +x /go/bin/minio

# Download mc binary and signature file
RUN curl -s -q https://dl.min.io/client/mc/release/linux-${TARGETARCH}/mc -o /go/bin/mc && \
    curl -s -q https://dl.min.io/client/mc/release/linux-${TARGETARCH}/mc.minisig -o /go/bin/mc.minisig && \
    chmod +x /go/bin/mc

RUN if [ "$TARGETARCH" = "amd64" ]; then \
       curl -L -s -q https://github.com/moparisthebest/static-curl/releases/latest/download/curl-${TARGETARCH} -o /go/bin/curl; \
       chmod +x /go/bin/curl; \
    fi

# Verify binary signature using public key "RWTx5Zr1tiHQLwG9keckT0c45M3AGeHD6IvimQHpyRywVWGbP1aVSGavRUN"
RUN minisign -Vqm /go/bin/minio -x /go/bin/minio.minisig -P RWTx5Zr1tiHQLwG9keckT0c45M3AGeHD6IvimQHpyRywVWGbP1aVSGav && \
    minisign -Vqm /go/bin/mc -x /go/bin/mc.minisig -P RWTx5Zr1tiHQLwG9keckT0c45M3AGeHD6IvimQHpyRywVWGbP1aVSGav

We can fix this though by copying the relevant files.
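As background on why the extra files matter, a checksum file like minio.sha256sum lets a consumer confirm that the binary it extracted is intact. The sketch below assumes the file uses the usual `sha256sum` output format (hex digest, whitespace, file name); the thread does not confirm the exact format or how the operator consumes it, and the function names are invented. Signature verification with minisign is a separate step and is not shown.

```python
import hashlib


def parse_sha256sum(text: str) -> str:
    """Extract the hex digest from text in `sha256sum` output format (assumed)."""
    fields = text.split()
    if not fields:
        raise ValueError("empty checksum file")
    digest = fields[0].lower()
    if len(digest) != 64 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("first field is not a SHA-256 hex digest")
    return digest


def checksum_matches(binary: bytes, sha256sum_text: str) -> bool:
    """Compare the SHA-256 of the binary with the digest in the checksum text."""
    return hashlib.sha256(binary).hexdigest() == parse_sha256sum(sha256sum_text)


if __name__ == "__main__":
    payload = b"pretend this is the minio binary"
    checksum_file = hashlib.sha256(payload).hexdigest() + "  minio\n"
    print(checksum_matches(payload, checksum_file))
    print(checksum_matches(payload + b"!", checksum_file))
```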

@ramondeklein
Contributor

PR #2278 will improve the error message shown when it can't find all files.

@ramondeklein
Contributor

ramondeklein commented Aug 15, 2024

@harshavardhana Okay, I'll submit a PR to fix the four Dockerfiles, see minio/minio#20270.

@pschichtel
Contributor Author

@harshavardhana I just tested this on my single-node home-lab by upgrading to RELEASE.2024-08-17T01-24-54Z and I still got the "Tar file extraction failed for file index: 2, with: EOF" message. So I don't think this is really fixed.

@ramondeklein ramondeklein reopened this Aug 17, 2024
@ramondeklein
Contributor

ramondeklein commented Aug 17, 2024

@pschichtel I checked by running the following:

docker pull quay.io/minio/minio:RELEASE.2024-08-17T01-24-54Z
docker image save quay.io/minio/minio:RELEASE.2024-08-17T01-24-54Z | tar xvf -
tar tvf blobs/sha256/edf9024ba4443ed6006f811ef470a062555457eae36463558c6bdce28c799425

It seems the extra files are not in there:

$ tar tvf blobs/sha256/edf9024ba4443ed6006f811ef470a062555457eae36463558c6bdce28c799425
drwxr-xr-x 0/0               0 2024-07-18 19:25 usr/
dr-xr-xr-x 0/0               0 2024-08-17 14:26 usr/bin/
-rwxr-xr-x 0/0       103674008 2024-08-17 14:26 usr/bin/minio

I'll investigate further. I expected the CI/CD build to use the Dockerfile in this patch. It looks like the build procedure uses make docker that loads the old MinIO image and just copies the minio binary. I was expecting that a release would use Dockerfile.release instead.

@harshavardhana
Member

> I'll investigate further. I expected the CI/CD build to use the Dockerfile in this patch. It looks like the build procedure uses make docker that loads the old MinIO image and just copies the minio binary. I was expecting that a release would use Dockerfile.release instead.

That is not true, @ramondeklein; it does use Dockerfile.release:

docker buildx build --push --no-cache \
        --build-arg RELEASE="${release}" \
        -t "minio/minio:latest" \
        -t "quay.io/minio/minio:latest" \
        -t "minio/minio:${release}" \
        -t "quay.io/minio/minio:${release}" \
        --platform=linux/arm64,linux/amd64,linux/ppc64le,linux/s390x \
        -f Dockerfile.release .

We do not do any make docker

@harshavardhana
Member

docker run -it --entrypoint=/bin/bash --rm quay.io/minio/minio:RELEASE.2024-08-17T01-24-54Z
bash-5.1# ls /usr/bin/minio*
/usr/bin/minio	/usr/bin/minio.minisig	/usr/bin/minio.sha256sum

The container has everything that is needed.

@harshavardhana
Member

The file is available in the second "layer" of the image. The operator needs to be updated to handle this, or, if you want to merge them into the same layer, you have to change Dockerfile.release to:

diff --git a/Dockerfile.release b/Dockerfile.release
index ca0c3e688..fed3fa926 100644
--- a/Dockerfile.release
+++ b/Dockerfile.release
@@ -54,9 +54,7 @@ ENV MINIO_ACCESS_KEY_FILE=access_key \
     MC_CONFIG_DIR=/tmp/.mc
 
 COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
-COPY --from=build /go/bin/minio /usr/bin/minio
-COPY --from=build /go/bin/minio.minisig /usr/bin/minio.minisig
-COPY --from=build /go/bin/minio.sha256sum /usr/bin/minio.sha256sum
+COPY --from=build /go/bin/minio* /usr/bin/
 COPY --from=build /go/bin/mc /usr/bin/mc
 COPY --from=build /go/bin/mc.minisig /usr/bin/mc.minisig
 COPY --from=build /go/bin/mc.sha256sum /usr/bin/mc.sha256sum
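Each COPY instruction normally produces its own image layer, which is why three separate COPY lines scatter the three files across three layers while the wildcard COPY keeps them together. The alternative mentioned above, handling this on the consumer side, could look roughly like the following Python sketch, which scans every layer named in manifest.json instead of only one. It is not the operator's code; `build_tar`, `locate_assets` and the layer names in the demo are hypothetical, and whiteout entries are ignored.

```python
import io
import json
import tarfile
from typing import Dict, Iterable


def build_tar(files: Dict[str, bytes], compress: bool = False) -> bytes:
    """Create a tar (optionally gzip-compressed) archive in memory (demo helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz" if compress else "w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def locate_assets(image_tar: bytes, wanted: Iterable[str]) -> Dict[str, str]:
    """Map each wanted path to the last layer (in manifest order) containing it."""
    wanted_set = set(wanted)
    found: Dict[str, str] = {}
    with tarfile.open(fileobj=io.BytesIO(image_tar), mode="r") as image:
        manifest = json.load(image.extractfile("manifest.json"))
        for layer_name in manifest[0]["Layers"]:
            layer_bytes = image.extractfile(layer_name).read()
            # "r:*" lets tarfile detect gzip compression transparently.
            with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:*") as layer:
                for name in layer.getnames():
                    if name in wanted_set:
                        found[name] = layer_name  # later layers win
    return found


if __name__ == "__main__":
    assets = ["usr/bin/minio", "usr/bin/minio.minisig", "usr/bin/minio.sha256sum"]
    # One file per layer, as three separate COPY instructions would produce.
    layers = {f"layer{i}.tar.gz": build_tar({path: b"data"}, compress=True)
              for i, path in enumerate(assets)}
    manifest = [{"Config": "cfg", "RepoTags": [], "Layers": list(layers)}]
    image = build_tar({**layers, "manifest.json": json.dumps(manifest).encode()})
    for path, layer in locate_assets(image, assets).items():
        print(path, "->", layer)
```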
