Issue with Velero Restore : Azure disk PV Created with Old SKU Despite Updated Storage Class #8401

Open
ankitsharma1133 opened this issue Nov 12, 2024 · 9 comments
Labels: Area/Cloud/Azure, Needs info (Waiting for information)

Comments

ankitsharma1133 commented Nov 12, 2024

What steps did you take and what happened:

I am trying to change the storage class of an Azure disk from LRS to ZRS (or, more generally, from one storage class to another) using the change-storage-class-config ConfigMap in Velero.

Specifically, I am trying to change the storage class from the default storage class (SKU StandardSSD_LRS) to azuredisk-standard-ssd-zrs (SKU StandardSSD_ZRS).

What did you expect to happen:

After applying change-storage-class-config and restoring the Velero backup, the application is up, a new PV and PVC are created, and the data persists in the new PV.
The storage class is updated in the PVC according to the Velero ConfigMap, but the SKU of the restored PV is still the same as that of the old PV from before the restore.

Below are my ConfigMap and the PV YAML after the Velero restore:

apiVersion: v1
data:
  default: azuredisk-standard-ssd-zrs
kind: ConfigMap
metadata:
  labels:
    velero.io/change-storage-class: RestoreItemAction
    velero.io/plugin-config: ""
  name: change-storage-class-config
  namespace: velero
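
For readers unfamiliar with this ConfigMap: each key under `data` is the name of a storage class as recorded in the backup, and the value is the storage class name to use on restore. The following is a minimal Python sketch of that lookup, not Velero's implementation. The function name `map_storage_class` and the class name `managed-csi` are hypothetical and used only for illustration.

```python
def map_storage_class(mapping: dict[str, str], old_name: str) -> str:
    """Return the storage class name to use on restore.

    `mapping` mirrors the ConfigMap's `data` section: keys are storage class
    names found in the backup, values are their replacements. Names without
    an entry are returned unchanged.
    """
    return mapping.get(old_name, old_name)


# The `data` section of the ConfigMap from this issue.
mapping = {"default": "azuredisk-standard-ssd-zrs"}

print(map_storage_class(mapping, "default"))      # mapped to the ZRS class
print(map_storage_class(mapping, "managed-csi"))  # hypothetical name, no entry, left as-is
```

Note that the mapping only deals with storage class names; it carries no SKU information of its own.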

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-###############
  labels:
    velero.io/backup-name: test-nginx-12
    velero.io/restore-name: restore12
  annotations:
    pv.kubernetes.io/provisioned-by: disk.csi.azure.com
    volume.kubernetes.io/provisioner-deletion-secret-name: ''
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ''
  finalizers:
    - external-provisioner.volume.kubernetes.io/finalizer
    - kubernetes.io/pv-protection
    - external-attacher/disk-csi-azure-com
spec:
  capacity:
    storage: 10Gi
  csi:
    driver: disk.csi.azure.com
    volumeHandle: >-
      /subscriptions/##########/providers/Microsoft.Compute/disks/restore-##################
    volumeAttributes:
      csi.storage.k8s.io/pv/name: pvc-#############
      csi.storage.k8s.io/pvc/name: PVC-nginx
      csi.storage.k8s.io/pvc/namespace: test-nginx-2
      requestedsizegib: '10'
      skuname: StandardSSD_LRS
      storage.kubernetes.io/csiProvisionerIdentity: ############-disk.csi.azure.com
  accessModes:
    - ReadWriteOnce
  claimRef:
    kind: PersistentVolumeClaim
    namespace: test-nginx-2
    name: pvc-nginx
  persistentVolumeReclaimPolicy: Delete
  storageClassName: azuredisk-standard-ssd-zrs
  volumeMode: Filesystem
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.disk.csi.azure.com/zone
              operator: In
              values:
                - westeurope-1

The following information will help us better understand what's going on:

Expected Behavior:
The restored PV should be created with the SKU specified in the new storage class (StandardSSD_ZRS).

Actual Behavior:
The PVC gets restored with the updated storage class, but the new PV is created with the old SKU (StandardSSD_LRS).
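
The mismatch described here can be checked mechanically. Below is a minimal Python sketch, assuming the PV and the StorageClass have already been loaded into dictionaries (for example from `kubectl get ... -o json`). The helper names `get_ci` and `sku_mismatch` are hypothetical. Parameter keys are compared case-insensitively because the manifests in this thread spell the key both as `skuname` and as `skuName`.

```python
def get_ci(params: dict, key: str):
    """Look up `key` in `params`, ignoring case. Returns None if absent."""
    for k, v in params.items():
        if k.lower() == key.lower():
            return v
    return None


def sku_mismatch(pv: dict, storage_class: dict):
    """Return (pv_sku, sc_sku) if the two SKUs differ, else None."""
    attrs = pv.get("spec", {}).get("csi", {}).get("volumeAttributes", {})
    pv_sku = get_ci(attrs, "skuname")
    sc_sku = get_ci(storage_class.get("parameters", {}), "skuname")
    if pv_sku is None or sc_sku is None:
        return None  # nothing to compare
    if pv_sku.lower() == sc_sku.lower():
        return None
    return (pv_sku, sc_sku)


# Trimmed-down versions of the objects reported in this issue.
restored_pv = {
    "spec": {
        "storageClassName": "azuredisk-standard-ssd-zrs",
        "csi": {"volumeAttributes": {"skuname": "StandardSSD_LRS"}},
    }
}
target_sc = {
    "metadata": {"name": "azuredisk-standard-ssd-zrs"},
    "parameters": {"skuName": "StandardSSD_ZRS"},
}

print(sku_mismatch(restored_pv, target_sc))
```

Keep in mind that `volumeAttributes` is metadata stored on the PV object, so the actual SKU of the managed disk would still need to be confirmed on the Azure side.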

@anshulahuja98
Collaborator

Can you share your PVC & StorageClass YAML as well?

Also what type of backup approach do you use? CSI? FSBackup? CSI Datamover? Snapshotting via AzurePlugin?

Can you also share the version of Velero you are using?

anshulahuja98 added the Needs info (Waiting for information) label on Nov 18, 2024
@ankitsharma1133
Author

ankitsharma1133 commented Nov 19, 2024

Hi @anshulahuja98, please find the details below.

StorageClass YAML

Default SC

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    kubernetes.io/cluster-service: 'true'
  annotations:
    storageclass.kubernetes.io/is-default-class: 'true'
provisioner: disk.csi.azure.com
parameters:
  skuname: StandardSSD_LRS
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer

SC azuredisk-standard-ssd-zrs

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azuredisk-standard-ssd-zrs
  labels:
    app.kubernetes.io/instance: azuredisk-csi-driver-v2
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: azuredisk-csi-driver
    app.kubernetes.io/version: v2.0.0-beta.6
    helm.sh/chart: azuredisk-csi-driver-v2.0.0-beta.6
  annotations:
    meta.helm.sh/release-name: azuredisk-csi-driver-v2
    meta.helm.sh/release-namespace: kube-system
provisioner: disk.csi.azure.com
parameters:
  skuName: StandardSSD_ZRS
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate

PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-nginx
  namespace: test-nginx-2
  labels:
    velero.io/backup-name: test-nginx-13
    velero.io/restore-name: restore13
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: pvc-####################
  storageClassName: azuredisk-standard-ssd-zrs
  volumeMode: Filesystem

The backup approach is snapshotting via the Azure plugin.

Velero version

Helm chart 8.0.0, image version v1.15.0

Velero config

configuration:
  backupStorageLocation:
  - bucket: velero
    config:
      resourceGroup: #RG#
      storageAccount: #SANAME#
    default: "true"
    defaultVolumesToFsBackup: false
    name: #Name#
    provider: azure
  uploaderType: restic
  volumeSnapshotLocation:
  - config:
      incremental: true
      resourceGroup: #RG#
    name: #name#
    provider: azure


@anshulahuja98
Collaborator

  1. Can you also share the version of the Azure plugin you use?

  2. Regarding these labels on your StorageClass:

    app.kubernetes.io/instance: azuredisk-csi-driver-v2
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: azuredisk-csi-driver
    app.kubernetes.io/version: v2.0.0-beta.6
    helm.sh/chart: azuredisk-csi-driver-v2.0.0-beta.6

    Is your setup a standard AKS setup?

    Do you sideload the Azure Disk CSI driver in AKS?

    What version of the disk driver are you on?

@ankitsharma1133
Author

ankitsharma1133 commented Dec 2, 2024

Hi @anshulahuja98

Sorry for the delayed response. Please find below the detailed YAML files and the instructions I am following.

FYI, the Velero restore does apply the updated storage class, and the PV is created with the updated storage class, but its skuname is that of the old storage class.

I am trying to change from SC default (skuname: StandardSSD_LRS) to SC managed-zrs-csi-premium (skuname: Premium_ZRS).

YAML used to deploy the demo app

1-demo-app.txt

Velero ConfigMap

velero-cm.txt

StorageClass

sc.txt

Backup and restore commands

velero create backup otx-replica-manual --include-namespaces=otx-replica

velero backup create pvc-ns-manual-backup --include-namespaces=pvc-poc

velero restore create pvc-restore-v1 --from-backup pvc-ns-manual-backup

Please find the YAML for the newly created PV and PVC:
new-pv.txt
new-pvc.txt

Below are the Velero backup and restore logs:

backup-logs.txt
restore-describe-details.txt
restore-logs.txt

Also, if I create a new application directly with the managed-zrs-csi-premium storage class, the PV is created with the correct SKU.

@anshulahuja98
Collaborator

I went through the code for the AzurePlugin-based restore flow:

volumeID, err := volumeSnapshotter.CreateVolumeFromSnapshot(snapshotInfo.providerSnapshotID, snapshotInfo.volumeType, snapshotInfo.volumeAZ, snapshotInfo.volumeIOPS)

It seems that Velero might not be fetching the SKU from the StorageClass and is instead relying on the original SKU it saved at backup time.

The change-storage-class action is probably meant for dynamic provisioning scenarios (CSI).

My suggestion for now would be to try your experiment with the CSI plugin for snapshotting instead of AzurePlugin. That should work as expected.

Once we confirm this behaviour, we can call it out in the docs.
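
To make this hypothesis concrete, here is a toy Python model. It is not Velero's actual code. It assumes that the snapshot-based restore recreates the disk from the volume type saved at backup time, while the storage class mapping only rewrites the `storageClassName` field. All names (`SnapshotInfo`, `restore_pv`, `diskSku`, `snap-1`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SnapshotInfo:
    """What the backup is assumed to remember about a volume snapshot."""
    provider_snapshot_id: str
    volume_type: str  # assumed to hold the disk SKU at backup time


def restore_pv(snapshot: SnapshotInfo, old_sc: str, sc_mapping: dict) -> dict:
    """Toy restore: the disk type comes from the snapshot record,
    the storage class name comes from the mapping."""
    return {
        "storageClassName": sc_mapping.get(old_sc, old_sc),
        "diskSku": snapshot.volume_type,  # the new StorageClass is never consulted
    }


snap = SnapshotInfo(provider_snapshot_id="snap-1", volume_type="StandardSSD_LRS")
pv = restore_pv(snap, "default", {"default": "azuredisk-standard-ssd-zrs"})
print(pv)
```

Under these assumptions the restored volume ends up with the new storage class name but the old SKU, which matches the behaviour reported in this issue.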

@ankitsharma1133
Author

Hi @anshulahuja98

EnableCSI is already set. Is there any other attribute I need to change or add to use the CSI plugin for snapshotting instead of AzurePlugin?

My setup instructions:

helm upgrade -i velero vmware-tanzu/velero -f .\values.yml -n velero

Below is my values.yml for Helm:

configuration:
  uploaderType: restic 
  backupStorageLocation:
    - name: velerobackupsdev
      bucket: velero 
      defaultVolumesToFsBackup: False
      provider: azure
      default: "true"
      config:
        resourceGroup: #RGNAME#
        storageAccount: #STORAFEACCOUNTNAME#
  volumeSnapshotLocation:
    - name: velerosnapshotsdev
      provider: azure
      config:
        resourceGroup: "#AKS_MC_RG"
initContainers:
  - name: velero-plugin-for-azure   
    image: velero/velero-plugin-for-microsoft-azure:v1.11.0
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins
deployNodeAgent: true
features: EnableCSI
credentials:
  existingSecret: velero-secrets

@kaovilai
Member

kaovilai commented Dec 5, 2024

You can follow https://velero.io/docs/v1.15/csi/#installing-velero-with-csi-support to install Velero and run a backup that triggers CSI snapshots instead of AzurePlugin.

@ankitsharma1133
Author

ankitsharma1133 commented Dec 6, 2024

Hi @kaovilai / @anshulahuja98

I tried with the CSI plugin. The issue still persists. The storage class is updated during the restore, but the SKU type of the disk is still the old one from the previous storage class.
Changing the storage class from default to managed-zrs-csi-premium:
SC: default (skuname: StandardSSD_LRS) -> SC: managed-zrs-csi-premium (skuname: Premium_ZRS)


apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-4fd30229-c95f-4e8e-bc8b-a03963597076
  uid: ae36c30a-281c-44a1-b8fe-9e4b3f70feee
  resourceVersion: '882577247'
  creationTimestamp: '2024-12-06T04:46:05Z'
  labels:
    velero.io/backup-name: pvc-poc-backup-0612
    velero.io/restore-name: pvc-poc-backup-0612-restore
  annotations:
    pv.kubernetes.io/provisioned-by: disk.csi.azure.com
    volume.kubernetes.io/provisioner-deletion-secret-name: ''
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ''
  finalizers:
    - external-provisioner.volume.kubernetes.io/finalizer
    - kubernetes.io/pv-protection
    - external-attacher/disk-csi-azure-com
  managedFields:
    - manager: csi-attacher
      operation: Update
      apiVersion: v1
      time: '2024-12-06T04:41:51Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            v:"external-attacher/disk-csi-azure-com": {}
    - manager: csi-provisioner
      operation: Update
      apiVersion: v1
      time: '2024-12-06T04:41:51Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:pv.kubernetes.io/provisioned-by: {}
            f:volume.kubernetes.io/provisioner-deletion-secret-name: {}
            f:volume.kubernetes.io/provisioner-deletion-secret-namespace: {}
          f:finalizers:
            .: {}
            v:"external-provisioner.volume.kubernetes.io/finalizer": {}
        f:spec:
          f:accessModes: {}
          f:capacity:
            .: {}
            f:storage: {}
          f:claimRef:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
            f:namespace: {}
          f:csi:
            .: {}
            f:driver: {}
            f:volumeAttributes:
              .: {}
              f:csi.storage.k8s.io/pv/name: {}
              f:csi.storage.k8s.io/pvc/name: {}
              f:csi.storage.k8s.io/pvc/namespace: {}
              f:requestedsizegib: {}
              f:skuname: {}
              f:storage.kubernetes.io/csiProvisionerIdentity: {}
            f:volumeHandle: {}
          f:nodeAffinity:
            .: {}
            f:required: {}
          f:persistentVolumeReclaimPolicy: {}
          f:storageClassName: {}
          f:volumeMode: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: v1
      time: '2024-12-06T04:46:05Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:claimRef:
            f:resourceVersion: {}
            f:uid: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: v1
      time: '2024-12-06T04:46:05Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:phase: {}
      subresource: status
  selfLink: /api/v1/persistentvolumes/pvc-4fd30229-c95f-4e8e-bc8b-a03963597076
status:
  phase: Bound
  lastPhaseTransitionTime: '2024-12-06T04:46:05Z'
spec:
  capacity:
    storage: 10Gi
  csi:
    driver: disk.csi.azure.com
    volumeHandle: >-
      /subscriptions/################/restore-3bcd325a-3870-42ad-aafa-f14d030c1063
    volumeAttributes:
      csi.storage.k8s.io/pv/name: pvc-4fd30229-c95f-4e8e-bc8b-a03963597076
      csi.storage.k8s.io/pvc/name: pvc-nginx
      csi.storage.k8s.io/pvc/namespace: pvc-poc
      requestedsizegib: '10'
      skuname: StandardSSD_LRS
      storage.kubernetes.io/csiProvisionerIdentity: 1732051072822-7900-disk.csi.azure.com
  accessModes:
    - ReadWriteOnce
  claimRef:
    kind: PersistentVolumeClaim
    namespace: pvc-poc
    name: pvc-nginx
    uid: b543a231-c800-4e06-9cd9-f8980932d718
    apiVersion: v1
    resourceVersion: '882577244'
  persistentVolumeReclaimPolicy: Delete
  storageClassName: managed-zrs-csi-premium
  volumeMode: Filesystem
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.disk.csi.azure.com/zone
              operator: In
              values:
                - westeurope-1

@anshulahuja98
Collaborator

anshulahuja98 commented Dec 6, 2024

Please share the backup/restore logs and the PVC YAML.
