Kopia backup failed on 5k pods in a single namespace, exiting with "error to connect to repository: repository not initialized in the provided storage" #6810

Closed
TzahiAshkenazi opened this issue Sep 12, 2023 · 10 comments

TzahiAshkenazi commented Sep 12, 2023

What steps did you take and what happened:
Ran a Kopia backup.
Created a single namespace (SN) with 5000 pods and 32MB PVCs, storage class = cephrbd.
The Backup CR failed and exited with phase "PartiallyFailed".

Steps & Commands:

  1. Create a single namespace with 5000 Pods and PVs. (Each PV is 32MB, without data, created using BusyBox.) sc=cephfs

  2. Create the Backup CR:
    cr_name: backup-kopia-busybox-perf-single-5000-pods-cephfs
    ./velero backup create backup-kopia-busybox-perf-single-5000-pods-cephrbd --include-namespaces busybox-perf-single-ns-5000-pods --default-volumes-to-fs-backup=true --snapshot-volumes=false -n velero-1-12

  3. The Backup CR failed and exited with "PartiallyFailed".

Errors :

             name: /busybox-perf-single-ns-5000-pods-922 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-923 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-924 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-925 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-926 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-927 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-928 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-929 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-93 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-930 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-931 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-932 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-933 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-934 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-935 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-936 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-937 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-938 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-939 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-94 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-940 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-941 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-942 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-943 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-944 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-945 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-946 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-947 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-948 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-949 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-95 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-950 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-951 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-952 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             name: /busybox-perf-single-ns-5000-pods-953 error: /pod volume backup failed: error to initialize data path: error to boost backup repository connection default-busybox-perf-single-ns-5000-pods-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
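
When a backup reports thousands of per-pod errors like the ones above, it helps to confirm that they all end in the same root cause. The following is a minimal Python sketch, not part of Velero: it assumes the error lines have been captured as text in the `name: /<pod> error: <message>` shape shown above, and the helper name `summarize_errors` is hypothetical.

```python
import re
from collections import Counter

# Matches lines shaped like "name: /<pod> error: <chained message>".
LINE_RE = re.compile(r"^\s*name:\s*/(?P<pod>\S+)\s+error:\s*(?P<msg>.+?)\s*$")


def summarize_errors(text):
    """Return (root-cause counts, failing pod names) for the error lines in text."""
    causes = Counter()
    pods = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip headers, blank lines, and unrelated output
        pods.append(match.group("pod"))
        # The chain is joined with ": ", so the last segment is the innermost cause.
        causes[match.group("msg").rsplit(": ", 1)[-1]] += 1
    return causes, pods


# Error chain copied from the report above; two pod names are used as sample input.
CHAIN = (
    "/pod volume backup failed: error to initialize data path: "
    "error to boost backup repository connection "
    "default-busybox-perf-single-ns-5000-pods-kopia: "
    "error to connect backup repo: error to connect repo with storage: "
    "error to connect to repository: "
    "repository not initialized in the provided storage"
)
sample = "\n".join(
    f"     name: /busybox-perf-single-ns-5000-pods-{n} error: {CHAIN}"
    for n in (922, 923)
)

causes, pods = summarize_errors(sample)
for cause, count in causes.most_common():
    print(f"{count} pod(s): {cause}")
```

If every line collapses to a single cause, as it does for the excerpt in this report, the failure is more likely a repository-level problem than something specific to individual pods.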

BSL :

NAME      PHASE       LAST VALIDATED   AGE   DEFAULT
default   Available   37s              22h   true
Name:         default
Namespace:    velero-1-12
Labels:       component=velero
Annotations:  <none>
API Version:  velero.io/v1
Kind:         BackupStorageLocation
Metadata:
  Creation Timestamp:  2023-09-11T10:51:40Z
  Generation:          2717
  Managed Fields:
    API Version:  velero.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .:
          f:component:
      f:spec:
        .:
        f:config:
          .:
          f:insecureSkipTLSVerify:
          f:profile:
          f:region:
          f:s3ForcePathStyle:
          f:s3Url:
        f:default:
        f:objectStorage:
          .:
          f:bucket:
          f:prefix:
        f:provider:
    Manager:      velero
    Operation:    Update
    Time:         2023-09-11T10:51:40Z
    API Version:  velero.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:lastSyncedTime:
        f:lastValidationTime:
        f:phase:
    Manager:         velero-server
    Operation:       Update
    Time:            2023-09-11T10:52:10Z
  Resource Version:  230263337
  UID:               74e72a3a-0b96-48cf-b931-f71f89a34dcd
Spec:
  Config:
    Insecure Skip TLS Verify:  true
    Profile:                   default
    Region:                    minio=minio
    s3ForcePathStyle:          true
    s3Url:                     http://minio-minio-bucket.apps.vlan608.rdu2.scalelab.redhat.com
  Default:                     true
  Object Storage:
    Bucket:  minio-oadp-bucket
    Prefix:  velero
  Provider:  aws
Status:
  Last Synced Time:      2023-09-12T09:40:17Z
  Last Validation Time:  2023-09-12T09:39:32Z
  Phase:                 Available
Events:                  <none>

backuprepositories :

NAME                                                   AGE   REPOSITORY TYPE
busybox-perf-single-ns-5000-pods-default-kopia-s98wl   21h   kopia
[root@f07-h27-000-r640 velero]# oc describe  backuprepositories -nvelero-1-12
Name:         busybox-perf-single-ns-5000-pods-default-kopia-s98wl
Namespace:    velero-1-12
Labels:       velero.io/repository-type=kopia
              velero.io/storage-location=default
              velero.io/volume-namespace=busybox-perf-single-ns-5000-pods
Annotations:  <none>
API Version:  velero.io/v1
Kind:         BackupRepository
Metadata:
  Creation Timestamp:  2023-09-11T11:45:56Z
  Generate Name:       busybox-perf-single-ns-5000-pods-default-kopia-
  Generation:          6
  Managed Fields:
    API Version:  velero.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName:
        f:labels:
          .:
          f:velero.io/repository-type:
          f:velero.io/storage-location:
          f:velero.io/volume-namespace:
      f:spec:
        .:
        f:backupStorageLocation:
        f:maintenanceFrequency:
        f:repositoryType:
        f:resticIdentifier:
        f:volumeNamespace:
      f:status:
        .:
        f:lastMaintenanceTime:
        f:message:
        f:phase:
    Manager:         velero-server
    Operation:       Update
    Time:            2023-09-11T14:47:16Z
  Resource Version:  228808059
  UID:               a4bfff60-e167-4790-b7a6-0bc3bc5e9b0e
Spec:
  Backup Storage Location:  default
  Maintenance Frequency:    1h0m0s
  Repository Type:          kopia
  Restic Identifier:        s3:http://minio-minio-bucket.apps.vlan608.rdu2.scalelab.redhat.com/minio-oadp-bucket/velero/restic/busybox-perf-single-ns-5000-pods
  Volume Namespace:         busybox-perf-single-ns-5000-pods
Status:
  Last Maintenance Time:  2023-09-11T13:47:16Z
  Message:                error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
  Phase:                  Ready
Events:                   <none>
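
Note that the BackupRepository above reports `Phase: Ready` while its `Message` still carries the connection error. A minimal Python sketch for spotting that combination in saved `oc describe` output follows. It assumes the two-space indented layout shown above, and `parse_status` and `ready_but_erroring` are hypothetical helper names, not Velero or `oc` functionality.

```python
def parse_status(describe_text):
    """Return the fields under the top-level 'Status:' section as a dict."""
    status = {}
    in_status = False
    for line in describe_text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # A new top-level key starts; only 'Status:' opens the section we want.
            in_status = line.strip() == "Status:"
            continue
        if in_status and ":" in line:
            # Split at the first colon only, so values may contain colons.
            key, _, value = line.strip().partition(":")
            status[key.strip()] = value.strip()
    return status


def ready_but_erroring(status):
    """True when the phase is Ready but the message still mentions an error."""
    return status.get("Phase") == "Ready" and "error" in status.get("Message", "").lower()


# Excerpt of the describe output from this report.
sample = """\
Spec:
  Repository Type:          kopia
Status:
  Last Maintenance Time:  2023-09-11T13:47:16Z
  Message:                error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
  Phase:                  Ready
Events:                   <none>
"""

status = parse_status(sample)
print("needs attention:", ready_but_erroring(status))
```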

What did you expect to happen:
The Backup CR completes successfully.

Anything else you would like to add:

Environment:
Version: release-1.12-dev

features:

Kubernetes version : Kustomize Version: v4.5.4

Kubernetes installer & version:

Cloud provider or hardware configuration:
OCP running on bare-metal (BM) servers
3 master and 12 worker nodes

OS (e.g. from /etc/os-release):

NAME="Red Hat Enterprise Linux CoreOS"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION="411.86.202306021408-0"
VERSION_ID="4.11"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 411.86.202306021408-0 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.11/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.11"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.11"
OPENSHIFT_VERSION="4.11"
RHEL_VERSION="8.6"
OSTREE_VERSION="411.86.202306021408-0"

logs :

bundle-2023-09-12-09-04-49.tar.gz
cycles_logs.tar.gz
velero_outputs.tar.gz

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
@weshayutin
Contributor

@shubham-pampattiwar @sseago I'm looking into this error, and I suspect a Kopia or storage issue here rather than a Velero issue. Can you please also see what you can find and comment on this issue? Thank you.

@Lyndon-Li
Contributor

@TzahiAshkenazi
Has anything changed in the object store backing the Kopia repo?
For example, the steps below will cause the same problem:

  • Create a BSL from Velero, so the Kopia repo will be initialized
  • Run a backup (everything should be OK)
  • Make changes to the object store associated with the BSL
  • Run another backup (the backup will fail with the same problem)

@TzahiAshkenazi
Author

@Lyndon-Li

  • We don't change anything related to the object store, or anything else, during the backup process.
  • We don't run Backup CRs in parallel.
  • Similar test cases using Kopia, with the same BackupRepositories and BackupStorageLocations objects but a lower pod count per namespace, completed successfully on the same bare-metal OCP cluster.

@Lyndon-Li
Contributor

Lyndon-Li commented Sep 13, 2023

@TzahiAshkenazi
Please check your environment and share the info below:

  • Check the object store that backs the Kopia repository and see if you can find an object named kopia.repository under the kopia/<source namespace> prefix
  • If you cannot find the kopia.repository object, what can you see under the kopia/<source namespace> prefix? Can you share a screenshot?
  • For the same 5000 pods and PVCs, does this problem happen every time?
  • If you delete the backuprepository CR for this backup and then rerun the backup, can you still reproduce the problem?
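
The first two checks above can be sketched in Python. This is only an illustration: it assumes you have already obtained a flat list of object keys from the bucket with whatever S3 listing tool you use, and that the BSL prefix (here `velero`) may precede `kopia/`. The helper `find_repo_marker` and the sample keys are hypothetical, not taken from the actual bucket.

```python
def find_repo_marker(keys, namespace):
    """Split a key listing into (kopia.repository keys, other keys) under kopia/<namespace>/."""
    prefix = f"kopia/{namespace}/"
    markers, others = [], []
    for key in keys:
        idx = key.find(prefix)
        # Accept "kopia/<ns>/" at the start of the key or right after a "/"
        # (for example, after the BSL prefix); ignore everything else.
        if idx == -1 or (idx > 0 and key[idx - 1] != "/"):
            continue
        rest = key[idx + len(prefix):]
        (markers if rest == "kopia.repository" else others).append(key)
    return markers, others


# Made-up listing for illustration only; real keys will differ.
namespace = "busybox-perf-single-ns-5000-pods"
keys = [
    f"velero/kopia/{namespace}/kopia.repository",
    f"velero/kopia/{namespace}/example-blob",
    "velero/backups/example-backup/example-object",
]

markers, others = find_repo_marker(keys, namespace)
if markers:
    print("repository marker found:", markers)
else:
    print("no kopia.repository object; other keys under the prefix:", others)
```

An empty `markers` list with a non-empty `others` list would correspond to the second check: the prefix exists but the repository marker object is missing.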

@TzahiAshkenazi
Author

TzahiAshkenazi commented Sep 14, 2023

@Lyndon-Li
We no longer have the object store content in the bucket, since we have run other cycles and we clean the bucket beforehand.
I grepped all the logs and the velero debug command output, which is attached to this ticket.
I tried to reproduce this issue, but I got a different error from the one in this ticket.
If a re-run is needed, please let me know.
Thanks

@Lyndon-Li
Contributor

@TzahiAshkenazi
Yes, I think a reproduction is required to troubleshoot this issue further. You can find me in the Velero Slack channel if you need anything from me when you are ready to reproduce it.
Before that, a quick question: for the same 5000 pods and PVCs case, how many times have you run it? Did this problem happen every time?

@TzahiAshkenazi
Author

@Lyndon-Li
It was on the first cycle; on the second cycle I got a different, new error.
I'll contact you on Slack to share the new error.

@Lyndon-Li
Contributor

@TzahiAshkenazi
OK. For the current issue, please let us know when you reproduce it, and let's see whether it is a configuration problem or a code bug.

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.

github-actions bot commented Dec 3, 2023

This issue was closed because it has been stalled for 14 days with no activity.

github-actions bot closed this as not planned on Dec 3, 2023