Filesystem backup fails with "error to initialize data path" #8767

Open
darnone opened this issue Mar 7, 2025 · 2 comments

darnone commented Mar 7, 2025

What steps did you take and what happened:
I have Velero deployed to a cluster using the helm chart and ArgoCD. I have a test example with 2 deployments - one for EBS and one for EFS. Backups and restores work with the following commands:

velero backup create backup-fs --include-namespaces snapshot --default-volumes-to-fs-backup=true
velero restore create restore-fs --from-backup backup-fs --include-namespaces snapshot

Backups and restores complete successfully. /backups, /kopia, and /restores appear in S3.

Then I tried to back up kube-prometheus-stack. It has 2 EBS volumes - one for Prometheus and one for Grafana. But the backup partially failed with:

Errors:
  Velero:    message: /pod volume backup failed: error to initialize data path: error to boost backup repository connection velero-backup-storage-location-monitoring-kopia: error to connect backup repo: error to connect repo with storage: error to connect to repository: repository not initialized in the provided storage
             (the same message is repeated for each of the remaining pod volume backups)
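The error suggests the Kopia repository for the monitoring namespace was never initialized in the object store. For reference, the state of the backup repositories can be inspected with the Velero CLI and the BackupRepository custom resources (assuming Velero is installed in the velero namespace):

# show the repositories Velero knows about and their status
velero repo get

# the underlying custom resources carry the same information
kubectl -n velero get backuprepositories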

What did you expect to happen:
I expected everything to work as in my test example, since the same storage class is used, etc.

bundle-2025-03-07-15-42-09.tar.gz

The following information will help us better understand what's going on:

If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle and attach it to this issue. For more options, please refer to velero debug --help.
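For example, with the backup and restore names used in the working test above (illustrative only; substitute the names of the failing backup as needed):

# bundle the Velero server logs plus logs and describe output for the named backup and restore
velero debug --backup backup-fs --restore restore-fs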

If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)

  • kubectl logs deployment/velero -n velero
  • velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
  • velero backup logs <backupname>
  • velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
  • velero restore logs <restorename>

Anything else you would like to add:
The cluster has multiple node groups. Everything is deployed to a specific node group, including kube-prometheus-stack and Velero (both into the same one). The node group has 4 nodes labeled:

nodeSelector:
  node: infra

The node agents do not have a node selector, so a Velero node agent is running on all 9 nodes of the cluster.

So what am I doing wrong?

My velero configuration is also listed here:

nodeSelector:
  node: infra
  
image:
  repository: velero/velero
  tag: v1.15.2
  pullPolicy: IfNotPresent

configuration:
  features: EnableCSI
  uploaderType: kopia
  backupStorageLocation:
  - name: velero-backup-storage-location
    #bucket: {{ .Values.velero_backups_bucket }}
    bucket: gts-argocd-ci-velero-dev
    default: true
    provider: aws
    config:
      region: us-east-1
  volumeSnapshotLocation:
  - name: velero-volume-storage-location
    provider: aws
    config:
      region: us-east-1

initContainers:
- name: velero-plugin-for-aws
  image: velero/velero-plugin-for-aws:v1.11.1
  volumeMounts:
  - mountPath: /target
    name: plugins

credentials:
  useSecret: false

resources:
  requests:
    cpu: 500m
    memory: 128Mi
  limits:
    cpu: 1000m
    memory: 512Mi

deployNodeAgent: true

nodeAgent:
  podVolumePath: /var/lib/kubelet/pods

  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: 1000m
      memory: 1024Mi

Environment:

  • Velero version (use velero version): 1.15.2
  • Velero features (use velero client config get features):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version: terraform 1.30.9
  • Cloud provider or hardware configuration: AWS EKS 1.30
  • OS (e.g. from /etc/os-release): Amazon Linux 2 optimized

Vote on this issue!

This is an invitation to the Velero community to vote on issues. You can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"

darnone commented Mar 7, 2025

If I describe one of the failed jobs I see:

Warning BackoffLimitExceeded 3m5s job-controller Job has reached the specified backoff limit

but I don't know what to do to fix it, or whether there is a chart configuration to change. What I don't understand is that I have another cluster with the same config (in the same AWS account) that is running without problems.
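For reference, Velero runs repository maintenance as Kubernetes Jobs in its own namespace; the jobs hitting the backoff limit can be listed and inspected like this (the job name below is a placeholder):

# list the jobs Velero created for repository maintenance
kubectl -n velero get jobs

# describe a failed job and its pods to see the failure reason
kubectl -n velero describe job <maintenance-job-name>
kubectl -n velero get pods --field-selector=status.phase=Failed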

@darnone
Copy link
Author

darnone commented Mar 7, 2025

So what I found and did to fix this was to set:

configuration:
  repositoryMaintenanceJob:
    requests:
      cpu: 1000m
      memory: 1024Mi
    limits:
      memory: 2048Mi

I don't know if these resources are overkill, but the backup now completes with no errors.
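A quick way to confirm the change took effect (assuming Velero runs in the velero namespace) is to check that subsequent repository maintenance jobs complete and the repository reports Ready:

# maintenance jobs should now show COMPLETIONS 1/1 instead of hitting the backoff limit
kubectl -n velero get jobs

# the backup repository status should be Ready
velero repo get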
