[BUG] restore workload pod is pending #8763

Open
gnolong opened this issue Jan 8, 2025 · 0 comments
gnolong (Contributor) commented Jan 8, 2025

The Backup CR annotation `kubeblocks.io/cluster-snapshot` includes the component's `schedulingPolicy`. In this case the policy contains a required `podAntiAffinity` whose label selector still carries the old cluster's name (`app.kubernetes.io/instance: my1`). When restoring a new cluster from a local backup repo, the new cluster inherits that selector unchanged, so the restore workload and the new cluster's pods cannot be scheduled onto the same nodes as the old cluster's pods and stay Pending.

apiVersion: dataprotection.kubeblocks.io/v1alpha1
kind: Backup
metadata:
  annotations:
    kubeblocks.io/cluster-snapshot: '{"metadata":{"name":"my1","namespace":"default","creationTimestamp":null,"annotations":{"kubeblocks.io/crd-api-version":"apps.kubeblocks.io/v1","meta.helm.sh/release-name":"my1","meta.helm.sh/release-namespace":"default"}},"spec":{"clusterDef":"apecloud-mysql","topology":"apecloud-mysql","terminationPolicy":"Delete","componentSpecs":[{"name":"mysql","componentDef":"apecloud-mysql-1.0.0-alpha.0","serviceVersion":"8.0.30","replicas":1,"schedulingPolicy":{"affinity":{"podAntiAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":[{"labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"my1","apps.kubeblocks.io/component-name":"mysql"}},"topologyKey":"kubernetes.io/hostname"}],"preferredDuringSchedulingIgnoredDuringExecution":[{"weight":100,"podAffinityTerm":{"labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"my1","apps.kubeblocks.io/component-name":"mysql"}},"topologyKey":"kubernetes.io/hostname"}}]}}},"resources":{"limits":{"cpu":"500m","memory":"512Mi"},"requests":{"cpu":"500m","memory":"512Mi"}},"volumeClaimTemplates":[{"name":"data","spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"20Gi"}}}}],"disableExporter":true}]},"status":{}}'
    kubeblocks.io/encrypted-system-accounts: '{"mysql":{"kbadmin":"yCfELdwsiFXoJ4ySZIaD+4ZRcoRlyZw+NgchuD462PDKvlZFkmS4XFeCXiU=","kbdataprotection":"5xaaklZGOTVtuNTNuUZ5aKTqUo/tAXmAFnAhevF4ycB5PTqrVQtGAMNLUTc=","kbmonitoring":"VE5szC/sv4X/V4dKx1IFtP1gWRxOsFXW/6GT9DzQ9NmkopFXLI0puLNVOfA=","kbprobe":"Vv8Upg5VzPJBs6gj/wTuw/W+hz4Y1hJVy58OuLg3Yszl4DOis/4ojS4T/4M=","kbreplicator":"PdpK0MaFBWyY+XJl2Ov8XbbK49X3kkGxawjiylJzUzgyixB4rVetrXI/Sbs=","root":"2jzTXD+E6/on9y9jMd0g8p5GnR2xNX7lNLmC3ZrMW0ny44C+bk1Tb8b8OMU="}}'
  creationTimestamp: "2025-01-08T08:37:48Z"
  finalizers:
  - dataprotection.kubeblocks.io/finalizer
  generation: 1
  labels:
    app.kubernetes.io/instance: my1
    app.kubernetes.io/managed-by: kubeblocks-dataprotection
    apps.kubeblocks.io/component-name: mysql
    dataprotection.kubeblocks.io/backup-policy: my1-mysql-backup-policy
    dataprotection.kubeblocks.io/backup-repo-name: my-repo
    dataprotection.kubeblocks.io/backup-type: Full
    dataprotection.kubeblocks.io/cluster-uid: 44855908-6aa0-42e4-9749-90340a3e6c6e
    operations.kubeblocks.io/ops-name: bp1
    operations.kubeblocks.io/ops-type: Backup
  name: bp1
  namespace: default
  resourceVersion: "40716"
  uid: dc741dc9-093f-4de8-a8a5-e6c8729d7d7b
spec:
  backupMethod: xtrabackup
  backupPolicyName: my1-mysql-backup-policy
  deletionPolicy: Delete
status:
  actions:
  - actionType: Job
    completionTimestamp: "2025-01-08T08:38:36Z"
    name: dp-backup-0
    objectRef:
      apiVersion: batch/v1
      kind: Job
      name: dp-backup-0-bp1-dc741dc9
      namespace: default
      resourceVersion: "40711"
      uid: b2a2f545-b436-415f-b61f-ebb91636f87b
    phase: Completed
    startTimestamp: "2025-01-08T08:37:48Z"
    targetPodName: my1-mysql-0
  backupMethod:
    actionSetName: apecloud-mysql-xtrabackup
    name: xtrabackup
    snapshotVolumes: false
    targetVolumes:
      volumeMounts:
      - mountPath: /data/mysql
        name: data
  backupRepoName: my-repo
  completionTimestamp: "2025-01-08T08:38:36Z"
  duration: 49s
  formatVersion: 0.1.0
  path: /default/my1-44855908-6aa0-42e4-9749-90340a3e6c6e/mysql/bp1
  persistentVolumeClaimName: pvc-my-repo-ld9dgj
  phase: Completed
  startTimestamp: "2025-01-08T08:37:48Z"
  target:
    connectionCredential:
      passwordKey: password
      secretName: my1-mysql-account-root
      usernameKey: username
    podSelector:
      fallbackLabelSelector:
        matchLabels:
          app.kubernetes.io/instance: my1
          app.kubernetes.io/managed-by: kubeblocks
          apps.kubeblocks.io/component-name: mysql
      matchLabels:
        app.kubernetes.io/instance: my1
        app.kubernetes.io/managed-by: kubeblocks
        apps.kubeblocks.io/component-name: mysql
      strategy: Any
    selectedTargetPods:
    - my1-mysql-0
  timeRange:
    end: "2025-01-08T08:38:33Z"
    start: "2025-01-08T08:38:28Z"
  totalSize: "4400067"

restore pod (stuck Pending):

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2025-01-08T08:38:49Z"
  finalizers:
  - batch.kubernetes.io/job-tracking
  generateName: restore-preparedata-4addc68c-bp1-0-
  labels:
    app.kubernetes.io/instance: carrot67
    app.kubernetes.io/managed-by: kubeblocks-dataprotection
    apps.kubeblocks.io/component-name: mysql
    apps.kubeblocks.io/vct-name: data
    batch.kubernetes.io/controller-uid: 90bc7c5f-e199-4d9b-baae-829072532c26
    batch.kubernetes.io/job-name: restore-preparedata-4addc68c-bp1-0
    controller-uid: 90bc7c5f-e199-4d9b-baae-829072532c26
    dataprotection.kubeblocks.io/restore: carrot67-mysql-f8e57f49-preparedata
    job-name: restore-preparedata-4addc68c-bp1-0
  name: restore-preparedata-4addc68c-bp1-0-k5l65
  namespace: default
  ownerReferences:
  - apiVersion: batch/v1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: restore-preparedata-4addc68c-bp1-0
    uid: 90bc7c5f-e199-4d9b-baae-829072532c26
  resourceVersion: "40857"
  uid: 1219cb5d-6f09-45f8-b9a8-cc9e6818a3b8
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/instance: my1
              apps.kubeblocks.io/component-name: mysql
          topologyKey: kubernetes.io/hostname
        weight: 100
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/instance: my1
            apps.kubeblocks.io/component-name: mysql
        topologyKey: kubernetes.io/hostname
  containers:
  - command:
    - sh
    - -c
    - |
      #!/bin/bash
      set -e
      set -o pipefail
      export PATH="$PATH:$DP_DATASAFED_BIN_PATH"
      export DATASAFED_BACKEND_BASE_PATH="$DP_BACKUP_BASE_PATH"
      mkdir -p ${DATA_DIR}
      TMP_DIR=${DATA_MOUNT_DIR}/temp
      mkdir -p ${TMP_DIR} && cd ${TMP_DIR}

      old_signal="apecloud-mysql.old"
      log_bin=${LOG_BIN}
      if [ "$(datasafed list ${old_signal})" == "${old_signal}" ]; then
         log_bin="${DATA_DIR}/mysql-bin"
      fi

      datasafed pull "${DP_BACKUP_NAME}.xbstream" - | xbstream -x
      xtrabackup --decompress --remove-original --target-dir=${TMP_DIR}
      xtrabackup --prepare --target-dir=${TMP_DIR}
      xtrabackup --move-back --target-dir=${TMP_DIR} --datadir=${DATA_DIR}/ --log-bin=${log_bin}
      touch ${DATA_DIR}/${SIGNAL_FILE}
      rm -rf ${TMP_DIR}
      chmod -R 0777 ${DATA_DIR}
    env:
    - name: DP_BACKUP_NAME
      value: bp1
    - name: DP_TARGET_RELATIVE_PATH
    - name: DP_BACKUP_ROOT_PATH
      value: /default/my1-44855908-6aa0-42e4-9749-90340a3e6c6e/mysql
    - name: DP_BACKUP_BASE_PATH
      value: /default/my1-44855908-6aa0-42e4-9749-90340a3e6c6e/mysql/bp1
    - name: DP_BACKUP_STOP_TIME
      value: "2025-01-08T16:38:33+08:00"
    - name: DATA_DIR
      value: /data/mysql/data
    - name: LOG_BIN
      value: /data/mysql/binlog/mysql-bin
    - name: DP_DB_PORT
      value: "3306"
    - name: DATA_MOUNT_DIR
      value: /data/mysql
    - name: SIGNAL_FILE
      value: .xtrabackup_restore_new_cluster
    - name: DATASAFED_LOCAL_BACKEND_PATH
      value: /backupdata
    - name: DP_DATASAFED_BIN_PATH
      value: /bin/datasafed
    image: docker.io/apecloud/apecloud-xtrabackup:8.0
    imagePullPolicy: IfNotPresent
    name: restore
    resources:
      limits:
        cpu: "0"
        memory: "0"
      requests:
        cpu: "0"
        memory: "0"
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /data/mysql
      name: dp-claim-tpl-data-carrot67-mysql-0
    - mountPath: /backupdata
      name: dp-backup-data
    - mountPath: /bin/datasafed
      name: dp-datasafed-bin
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-jghx6
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - command:
    - /bin/sh
    - -c
    - /scripts/install-datasafed.sh /bin/datasafed
    image: docker.io/apecloud/datasafed:latest
    imagePullPolicy: Always
    name: dp-copy-datasafed
    resources:
      limits:
        cpu: "0"
        memory: "0"
      requests:
        cpu: "0"
        memory: "0"
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /bin/datasafed
      name: dp-datasafed-bin
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-jghx6
      readOnly: true
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext:
    runAsUser: 0
  serviceAccount: kubeblocks-dataprotection-worker
  serviceAccountName: kubeblocks-dataprotection-worker
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: dp-claim-tpl-data-carrot67-mysql-0
    persistentVolumeClaim:
      claimName: data-carrot67-mysql-0
  - name: dp-backup-data
    persistentVolumeClaim:
      claimName: pvc-my-repo-ld9dgj
  - emptyDir: {}
    name: dp-datasafed-bin
  - name: kube-api-access-jghx6
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2025-01-08T08:38:49Z"
    message: '0/4 nodes are available: 1 node(s) didn''t match pod anti-affinity rules.
      preemption: 0/4 nodes are available: 4 No preemption victims found for incoming
      pod..'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: BestEffort

new cluster:

apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  annotations:
    kubeblocks.io/crd-api-version: apps.kubeblocks.io/v1
    kubeblocks.io/ops-request: '[{"name":"carrot67","type":"Restore"}]'
    kubeblocks.io/restore-from-backup: '{"mysql":{"doReadyRestoreAfterClusterRunning":"false","encryptedSystemAccounts":"{\"kbadmin\":\"yCfELdwsiFXoJ4ySZIaD+4ZRcoRlyZw+NgchuD462PDKvlZFkmS4XFeCXiU=\",\"kbdataprotection\":\"5xaaklZGOTVtuNTNuUZ5aKTqUo/tAXmAFnAhevF4ycB5PTqrVQtGAMNLUTc=\",\"kbmonitoring\":\"VE5szC/sv4X/V4dKx1IFtP1gWRxOsFXW/6GT9DzQ9NmkopFXLI0puLNVOfA=\",\"kbprobe\":\"Vv8Upg5VzPJBs6gj/wTuw/W+hz4Y1hJVy58OuLg3Yszl4DOis/4ojS4T/4M=\",\"kbreplicator\":\"PdpK0MaFBWyY+XJl2Ov8XbbK49X3kkGxawjiylJzUzgyixB4rVetrXI/Sbs=\",\"root\":\"2jzTXD+E6/on9y9jMd0g8p5GnR2xNX7lNLmC3ZrMW0ny44C+bk1Tb8b8OMU=\"}","name":"bp1","namespace":"default","volumeRestorePolicy":"Parallel"}}'
    meta.helm.sh/release-name: my1
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2025-01-08T08:38:49Z"
  finalizers:
  - cluster.kubeblocks.io/finalizer
  generation: 1
  labels:
    clusterdefinition.kubeblocks.io/name: apecloud-mysql
  name: carrot67
  namespace: default
  resourceVersion: "40808"
  uid: f8e57f49-5b66-4b1b-b59f-868edae92e8d
spec:
  clusterDef: apecloud-mysql
  componentSpecs:
  - componentDef: apecloud-mysql-1.0.0-alpha.0
    disableExporter: true
    name: mysql
    replicas: 1
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 500m
        memory: 512Mi
    schedulingPolicy:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/instance: my1
                  apps.kubeblocks.io/component-name: mysql
              topologyKey: kubernetes.io/hostname
            weight: 100
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/instance: my1
                apps.kubeblocks.io/component-name: mysql
            topologyKey: kubernetes.io/hostname
    serviceVersion: 8.0.30
    volumeClaimTemplates:
    - name: data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi
  terminationPolicy: Delete
  topology: apecloud-mysql
status:
  components:
    mysql: {}
  conditions:
  - lastTransitionTime: "2025-01-08T08:38:49Z"
    message: 'The operator has started the provisioning of Cluster: carrot67'
    observedGeneration: 1
    reason: PreCheckSucceed
    status: "True"
    type: ProvisioningStarted
  - lastTransitionTime: "2025-01-08T08:38:49Z"
    message: Successfully applied for resources
    observedGeneration: 1
    reason: ApplyResourcesSucceed
    status: "True"
    type: ApplyResources
  observedGeneration: 1
  phase: Creating
@gnolong gnolong added the kind/bug Something isn't working label Jan 8, 2025
@wangyelei wangyelei self-assigned this Jan 8, 2025
@wangyelei wangyelei added this to the Release 1.0.0 milestone Jan 8, 2025
@github-actions github-actions bot modified the milestones: Release 1.0.0, Release 0.9.3 Jan 14, 2025