Restore fails due to Restic timeout #3250
Replies: 9 comments
-
Processing a new restore may not happen instantaneously, but I would have expected some entries in the velero deployment log from the restore controller. Could you provide us with the full logs from the velero deployment using |
-
I want to restore a backup in the same cluster, into a different namespace.
New attempt to deploy the backup
in 40 minutes
jenkins_home restored size
jenkins_home original size
Where is the Jenkins data from the backup?
-
Thanks for providing the additional information and logs. Looking through them, I can see that the restic restore of the volumes timed out after 4 hours: https://gist.github.com/svua/b611f107523c99e3d73d5fcb2223f70e#file-gistfile1-txt-L1733-L1737 and failed again on the second attempt: https://gist.github.com/svua/b611f107523c99e3d73d5fcb2223f70e#file-gistfile1-txt-L3060-L3065 Given that this is an issue with the restic restore, you will need to look at the logs for the restic daemonset (using |
-
-
Thanks again for providing the additional logs. Unfortunately, the command I gave you to get the restic logs only provided the logs for one of the restic pods, not all the pods in the DaemonSet. Can you check the logs for all the restic DaemonSet pods? I can't see anything that stands out in the other logs, and it will be the restic pods that process the PodVolumeRestores which aren't making progress.
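For anyone following along, here is a rough sketch of how the logs of every restic DaemonSet pod, and the PodVolumeRestores for one restore, could be collected. It assumes a default-style install: the `velero` namespace and the `name=restic` pod label are assumptions, and the function names are made up for illustration, so adjust them to your cluster.

```shell
#!/bin/sh
# Sketch only. Assumptions: Velero runs in the "velero" namespace and the
# restic DaemonSet pods carry the label name=restic.

# Print the logs of every restic pod, one pod after another.
restic_logs_all() {
  ns="${1:-velero}"
  for pod in $(kubectl get pods -n "$ns" -l name=restic -o name); do
    echo "===== $pod ====="
    kubectl logs -n "$ns" "$pod"
  done
}

# Dump the PodVolumeRestores that belong to a single restore.
pod_volume_restores() {
  restore="$1"
  ns="${2:-velero}"
  kubectl get podvolumerestores -n "$ns" -l "velero.io/restore-name=$restore" -o yaml
}
```

Usage would look like `restic_logs_all velero > restic-all.log` followed by `pod_volume_restores <restore-name>`, where the restore name is whatever `velero restore get` lists for the stuck restore.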
-
All pods have the same logs, except for one
-
@svua Can you please do a |
-
All pods have the same description.
Pod that is failing
Another pod
-
@svua How did you even get this Jenkins home volume backup working? Do you stop the Jenkins instance before running the backups? I have tried to back up Jenkins with velero+restic multiple times on a live system, and it failed every single time. Because Jenkins was never stopped before a backup, restic could not back up the jenkins home volume, which held about 170 GB of data that changed constantly while Jenkins jobs were running. Since we are not willing to stop the Jenkins instance at all, I think the best option would be to copy only the data we need to restore Jenkins properly to a separate volume and then run restic against that new volume.
-
How can I restore the backup?
velero backup create jenkins-test-selector --default-volumes-to-restic --selector app.kubernetes.io/name=jenkins
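A possible starting point for restoring that backup into a different namespace of the same cluster is sketched below. It relies on the `--from-backup` and `--namespace-mappings` options of `velero restore create`; the function name and the namespace names in the usage line are hypothetical placeholders, not values taken from this thread.

```shell
#!/bin/sh
# Sketch only. The source and target namespaces are placeholders that the
# caller supplies; nothing here is specific to the cluster in this thread.

# Restore a backup, remapping one namespace to another.
restore_into_namespace() {
  backup="$1"
  src_ns="$2"
  dst_ns="$3"
  velero restore create --from-backup "$backup" \
    --namespace-mappings "${src_ns}:${dst_ns}"
}
```

For example, `restore_into_namespace jenkins-test-selector jenkins jenkins-restore` would map a namespace called `jenkins` to one called `jenkins-restore` (both names are assumptions). Progress can then be followed with `velero restore describe <restore-name> --details`.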