Troubleshooting

Luís Duarte edited this page Jun 12, 2024 · 3 revisions

How can I migrate a PVC into another PVC?

First, create the new PVC with a different name from the old one (you can rename it later with some shenanigans, or with a kubectl plugin), then run this Job:
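As a sketch, the new PVC could look like this (the name, namespace, storage class, and size are placeholders; match the access mode and size of the old PVC, or use a larger size):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <INSERT-NEW-PVC-NAME-HERE>
  namespace: <INSERT-NAMESPACE-HERE>
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi # match (or exceed) the old PVC's size
  storageClassName: longhorn # assumption: use the same class as the old PVC
```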

apiVersion: batch/v1
kind: Job
metadata:
  name: pvc-migration-job
  namespace: <INSERT-NAMESPACE-HERE>
spec:
  ttlSecondsAfterFinished: 100
  completions: 1
  parallelism: 1
  backoffLimit: 3
  template:
    spec:
      containers:
      - name: volume-migration
        image: ubuntu
        tty: true
        command: ["/bin/sh"]
        args: ["-c", "cp -r -v /mnt/old/** /mnt/new"]
        volumeMounts:
          - name: old-vol
            mountPath: /mnt/old
          - name: new-vol
            mountPath: /mnt/new
      volumes:
        - name: old-vol
          persistentVolumeClaim:
            claimName: <INSERT-OLD-PVC-NAME-HERE> # change to data source PVC
        - name: new-vol
          persistentVolumeClaim:
            claimName: <INSERT-NEW-PVC-NAME-HERE> # change to data target PVC
      restartPolicy: Never

After running, the Job will auto-delete itself after some time (see ttlSecondsAfterFinished), and you can then delete the old PVC (only do this once everything is working again, to avoid accidental data loss).
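One way to run the Job and check the copy log before touching the old PVC (pvc-migration-job.yaml is an assumed filename for the manifest above):

```shell
# Apply the Job and wait for it to finish
kubectl apply -f pvc-migration-job.yaml
kubectl -n <INSERT-NAMESPACE-HERE> wait --for=condition=complete job/pvc-migration-job --timeout=10m

# Inspect the verbose cp output before deleting the old PVC
kubectl -n <INSERT-NAMESPACE-HERE> logs job/pvc-migration-job
```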

I've accidentally deleted a PVC, how can I restore it?

At this point, you have two options:

If the PV was created with the Retain reclaim policy (you can tell because the storageClass name contains the retain keyword), you are in luck: you can restore it by following these steps:

  • First, figure out which PersistentVolume was left behind by running kubectl get pv, which should give you output like this:
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                            STORAGECLASS                   VOLUMEATTRIBUTESCLASS   REASON   AGE
pvc-01874237-a84e-4017-bb1e-352eec4b345e   3Gi        RWO            Retain           Bound    ni-website/mongo-pvc                             longhorn-locality-retain       <unset>                          37d
pvc-0ab225ba-6660-4501-87c3-e097f6a52a0b   5Gi        RWO            Retain           Bound    sinf-website/redis-pvc                           longhorn-locality-retain       <unset>                          30d
pvc-15682c45-75d9-4f8f-96f7-7fa03ac1ff34   1Gi        RWO            Delete           Bound    sinf-website/grafana-pvc                         longhorn                       <unset>                          30d
pvc-200e7271-4df3-45b0-a420-f43fb6f872b7   2Gi        RWO            Retain           Bound    plausible-ni/plausible-events-db                 longhorn-locality-retain       <unset>                          37d
pvc-3456c2a2-b7b8-496e-8c46-5cb559794e65   1Gi        RWO            Delete           Bound    sinf-website/data-meilisearch-0                  longhorn                       <unset>                          30d
pvc-3f380380-4682-4603-982e-8de4029c9f3e   10Gi       RWO            Retain           Bound    pg/cnpg-cluster-4                                longhorn-strict-local-retain   <unset>                          37d
pvc-41c48b7e-db7c-42c1-9b42-5332aa2487ac   10Gi       RWO            Retain           Bound    pg/cnpg-cluster-3                                longhorn-strict-local-retain   <unset>                          37d
pvc-4a6260e6-c3d8-4e78-8914-8427c155166b   10Gi       RWO            Retain           Bound    pg/cnpg-cluster-1                                longhorn-strict-local-retain   <unset>                          37d
pvc-74e2757a-d58a-4710-96eb-d26212d231c5   1Gi        RWO            Delete           Bound    image-registry/harbor-jobservice                 longhorn-locality-no-backup    <unset>                          39d
pvc-768f82a5-c723-4316-8d72-472bf5c65cfb   10Gi       RWO            Retain           Bound    sinf-website/website-pvc                         longhorn-locality-retain       <unset>                          30d
pvc-c16e4f92-b9fc-4af7-8d2f-d9ac6dfcd5a8   5Gi        RWO            Delete           Bound    image-registry/data-harbor-trivy-0               longhorn-locality-no-backup    <unset>                          39d
pvc-c50d487a-0484-4f31-abc4-a95b9c1122dc   5Gi        RWO            Delete           Bound    image-registry/harbor-registry                   longhorn-locality-no-backup    <unset>                          39d
pvc-d4b069f0-1666-4abb-9f59-c6ebc66bd661   2Gi        RWO            Retain           Bound    ni-website/public-pvc                            longhorn-locality-retain       <unset>                          37d
pvc-ea192a85-59bb-49da-a51d-96a8e053c57f   1Gi        RWO            Delete           Bound    image-registry/data-harbor-redis-0               longhorn-locality-no-backup    <unset>                          39d
pvc-f2aacf75-7d5d-4c2a-a4be-eaa83c8d5ca4   1Gi        RWO            Retain           Bound    image-registry/database-data-harbor-database-0   longhorn-locality-retain       <unset>                          39d
  • After figuring out the PV's name, recreate the PVC by adding the volumeName field to its spec, for example:
---

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: <PVC-NAME>
  namespace: <PVC-NAMESPACE>
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  volumeName: "<PV-NAME>"

Caution

The next method will probably cause some DATA LOSS and should only be used if really needed.

If you are unlucky because someone didn't think the reclaim policy was important for that use case, you can try to restore from Longhorn's snapshots or, if those have already been deleted due to age, from the Cloudflare backups. First, access the Longhorn dashboard; after port-forwarding it will be exposed at localhost:8080:

kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80
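To match a PV to the Longhorn volume that backs it, you can read the CSI volume handle (for Longhorn CSI volumes this is the Longhorn volume name, which normally matches the PV name):

```shell
kubectl get pv <PV-NAME> -o jsonpath='{.spec.csi.volumeHandle}'
```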

Find the PV's name as done above, then restore the corresponding volume from a snapshot or backup in the dashboard, and finally recreate the PVC.

This should cover nearly all cases and avoid total data loss like the incident of 23 @ SINF23.

If nothing else works and you are stuck

This was done by the power of friendship, which you should also try to use; if that's to no avail, try to contact the founders of NIployments, who should (hopefully) help you. You can find their contact info in NIAEFEUP's drive:

  • Luís Duarte
  • José Costa
  • André Lima
  • Rubem Neto
  • Nuno Pereira
  • Bruno Oliveira
  • João Silva
  • Marco Vilas Boas
  • Tomás Palma
  • Diogo Goiana

The Founders. I hope you don't have to reach them, but they will be happy to help you in any way. Not everyone is in the photo, unfortunately.