Troubleshooting
First, create the new PVC under a different name from the old one (you can rename it later with some shenanigans or a kubectl plugin). Then run this Job to copy the data from the old PVC to the new one:
apiVersion: batch/v1
kind: Job
metadata:
  name: pvc-migration-job
  namespace: <INSERT-NAMESPACE-HERE>
spec:
  ttlSecondsAfterFinished: 100 # the Job is auto-deleted 100 seconds after it finishes
  completions: 1
  parallelism: 1
  backoffLimit: 3
  template:
    spec:
      containers:
        - name: volume-migration
          image: ubuntu
          tty: true
          command: ["/bin/sh"]
          # "/mnt/old/." also copies hidden files (a shell glob would skip them);
          # -a copies recursively and preserves ownership, permissions and timestamps
          args: ["-c", "cp -a -v /mnt/old/. /mnt/new/"]
          volumeMounts:
            - name: old-vol
              mountPath: /mnt/old
            - name: new-vol
              mountPath: /mnt/new
      volumes:
        - name: old-vol
          persistentVolumeClaim:
            claimName: <INSERT-OLD-PVC-NAME-HERE> # change to data source PVC
        - name: new-vol
          persistentVolumeClaim:
            claimName: <INSERT-NEW-PVC-NAME-HERE> # change to data target PVC
      restartPolicy: Never
Once it finishes, the Job deletes itself after about 100 seconds (see ttlSecondsAfterFinished), and you can then delete the old PVC. Please do this only once everything is working again, to avoid accidental data loss.
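The apply, wait and check steps might be scripted roughly as follows. This is only a sketch: the run_migration helper and the manifest file name are assumptions made for illustration, and the 10-minute timeout is an arbitrary choice. Since the Job is removed shortly after it finishes, the logs are fetched immediately.

```shell
#!/bin/sh
# Hypothetical helper: apply the migration Job, wait for it, and print the copy log.
# $1 = namespace the Job runs in, $2 = path to the Job manifest (both supplied by you).
# Usage: run_migration <namespace> pvc-migration-job.yaml
run_migration() {
    ns="$1"
    manifest="$2"
    kubectl apply -f "$manifest" || return 1
    # Block until the Job reports the Complete condition (timeout is an assumption).
    kubectl wait -n "$ns" --for=condition=complete --timeout=600s job/pvc-migration-job || return 1
    # Read the cp -v output before ttlSecondsAfterFinished removes the Job.
    kubectl logs -n "$ns" job/pvc-migration-job
}
```

If the Job fails instead of completing, kubectl wait only returns once the timeout expires, so it is worth checking kubectl get pods in the namespace if it seems stuck.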
At this point, you have two options:
If the PVC was created with a Retain reclaim policy (you can tell because the storageClass name contains the retain keyword), you are in luck: you can restore it with the following steps:
- First, figure out which PersistentVolume backed the deleted PVC by running kubectl get pv -A. The PV itself is retained and will typically show the status Released instead of Bound. You should get output like this:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
pvc-01874237-a84e-4017-bb1e-352eec4b345e 3Gi RWO Retain Bound ni-website/mongo-pvc longhorn-locality-retain <unset> 37d
pvc-0ab225ba-6660-4501-87c3-e097f6a52a0b 5Gi RWO Retain Bound sinf-website/redis-pvc longhorn-locality-retain <unset> 30d
pvc-15682c45-75d9-4f8f-96f7-7fa03ac1ff34 1Gi RWO Delete Bound sinf-website/grafana-pvc longhorn <unset> 30d
pvc-200e7271-4df3-45b0-a420-f43fb6f872b7 2Gi RWO Retain Bound plausible-ni/plausible-events-db longhorn-locality-retain <unset> 37d
pvc-3456c2a2-b7b8-496e-8c46-5cb559794e65 1Gi RWO Delete Bound sinf-website/data-meilisearch-0 longhorn <unset> 30d
pvc-3f380380-4682-4603-982e-8de4029c9f3e 10Gi RWO Retain Bound pg/cnpg-cluster-4 longhorn-strict-local-retain <unset> 37d
pvc-41c48b7e-db7c-42c1-9b42-5332aa2487ac 10Gi RWO Retain Bound pg/cnpg-cluster-3 longhorn-strict-local-retain <unset> 37d
pvc-4a6260e6-c3d8-4e78-8914-8427c155166b 10Gi RWO Retain Bound pg/cnpg-cluster-1 longhorn-strict-local-retain <unset> 37d
pvc-74e2757a-d58a-4710-96eb-d26212d231c5 1Gi RWO Delete Bound image-registry/harbor-jobservice longhorn-locality-no-backup <unset> 39d
pvc-768f82a5-c723-4316-8d72-472bf5c65cfb 10Gi RWO Retain Bound sinf-website/website-pvc longhorn-locality-retain <unset> 30d
pvc-c16e4f92-b9fc-4af7-8d2f-d9ac6dfcd5a8 5Gi RWO Delete Bound image-registry/data-harbor-trivy-0 longhorn-locality-no-backup <unset> 39d
pvc-c50d487a-0484-4f31-abc4-a95b9c1122dc 5Gi RWO Delete Bound image-registry/harbor-registry longhorn-locality-no-backup <unset> 39d
pvc-d4b069f0-1666-4abb-9f59-c6ebc66bd661 2Gi RWO Retain Bound ni-website/public-pvc longhorn-locality-retain <unset> 37d
pvc-ea192a85-59bb-49da-a51d-96a8e053c57f 1Gi RWO Delete Bound image-registry/data-harbor-redis-0 longhorn-locality-no-backup <unset> 39d
pvc-f2aacf75-7d5d-4c2a-a4be-eaa83c8d5ca4 1Gi RWO Retain Bound image-registry/database-data-harbor-database-0 longhorn-locality-retain <unset> 39d
- Once you have the PV's name, recreate the PVC from its YAML (or equivalent), adding the volumeName field to the spec, for example:
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: <PVC-NAME>
  namespace: <PVC-NAMESPACE>
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  volumeName: "<PV-NAME>"
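One step that is easy to miss: a retained PV whose PVC was deleted usually still carries a claimRef pointing at the old PVC, so the recreated PVC may stay Pending until that reference is cleared. The sketch below lists Released volumes and removes the stale reference. The function names are made up for illustration, and you should double-check the PV name before patching anything.

```shell
#!/bin/sh
# Hypothetical helper: print "NAME NAMESPACE/CLAIM" for every PV in the Released phase.
list_released_pvs() {
    kubectl get pv --no-headers \
        -o custom-columns='NAME:.metadata.name,STATUS:.status.phase,NS:.spec.claimRef.namespace,CLAIM:.spec.claimRef.name' |
        awk '$2 == "Released" { print $1, $3 "/" $4 }'
}

# Hypothetical helper: drop the stale claimRef so the PV can be bound again.
# $1 = PV name, for example one printed by list_released_pvs.
clear_claim_ref() {
    kubectl patch pv "$1" --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
}
```

After the patch the PV should move to Available, and a PVC whose volumeName points at it should then bind.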
Caution
The next method will probably cause some DATA LOSS and should only be used if really needed.
If you are unlucky because someone didn't think the reclaim policy was important for that use case, you can try to restore from Longhorn's snapshots or, if those have already been deleted with time, from the Cloudflare backups. First, open the Longhorn dashboard. The following command should expose it on localhost:8080:
kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80
Find the PV's name as described above, restore the PV from the dashboard, and then recreate the PVC.
This should cover nearly every case and avoid total data loss like the incident of 23 @ SINF23.
This was done by the power of friendship, which you should also try to use. If that is to no avail, try to contact the founders of NIployments, who should (hopefully) help you. You can find their contact info in NIAEFEUP's drive:
- Luís Duarte
- José Costa
- André Lima
- Rubem Neto
- Nuno Pereira
- Bruno Oliveira
- João Silva
- Marco Vilas Boas
- Tomás Palma
- Diogo Goiana
Not everyone is in the photo, unfortunately.
𝒏𝒊𝒑𝒍𝒐𝒚𝒎𝒆𝒏𝒕𝒔 (⸝⸝⸝>﹏<⸝⸝⸝)