[BUG] Ops failed due to the use of an incorrect pod during the switchover #8955

Closed
tianyue86 opened this issue Feb 20, 2025 · 1 comment
@tianyue86
Describe the bug
Kubernetes: v1.31.1-aliyun.1
KubeBlocks: 1.0.0-beta.29
kbcli: 1.0.0-beta.14

To Reproduce
Steps to reproduce the behavior:

  1. Create a PostgreSQL cluster with the YAML below:
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: postgres-wbvhtu
  namespace: default
spec:
  clusterDef: postgresql
  topology: replication
  terminationPolicy: WipeOut
  componentSpecs:
    - name: postgresql
      serviceVersion: 12.14.0
      labels:
        apps.kubeblocks.postgres.patroni/scope: postgres-wbvhtu-postgresql
      replicas: 2
      disableExporter: true
      resources:
        limits:
          cpu: 100m
          memory: 0.5Gi
        requests:
          cpu: 100m
          memory: 0.5Gi
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
  2. Two PostgreSQL clusters are running at the same time:
k get cluster -A|grep post
default     postgres-tgnsec          postgresql           WipeOut              Running    33m
default     postgres-wbvhtu          postgresql           WipeOut              Running    25m
  3. Perform the following operations on the cluster postgres-wbvhtu one by one:
kubectl delete pod postgres-wbvhtu-postgresql-0 --namespace default
kbcli cluster scale-out postgres-wbvhtu --auto-approve --force=true --components postgresql --replicas 1 --namespace default
kbcli cluster vscale postgres-wbvhtu --auto-approve --force=true --components postgresql --cpu 200m --memory 0.6Gi --namespace default
kbcli cluster promote postgres-wbvhtu --auto-approve --force=true --candidate postgres-wbvhtu-postgresql-1 --namespace default
OpsRequest postgres-wbvhtu-switchover-74tjr created successfully, you can view the progress:
	kbcli cluster describe-ops postgres-wbvhtu-switchover-74tjr -n default
  4. Check the ops status:
kbcli cluster list-ops postgres-wbvhtu --status all  --namespace default
NAME                                      NAMESPACE   TYPE                CLUSTER           COMPONENT    STATUS    PROGRESS   CREATED-TIME                 
postgres-wbvhtu-horizontalscaling-6d7zx   default     HorizontalScaling   postgres-wbvhtu   postgresql   Succeed   1/1        Feb 20,2025 11:31 UTC+0800   
postgres-wbvhtu-verticalscaling-dchrr     default     VerticalScaling     postgres-wbvhtu   postgresql   Succeed   3/3        Feb 20,2025 11:36 UTC+0800   
postgres-wbvhtu-switchover-74tjr          default     Switchover          postgres-wbvhtu                Failed    -/-        Feb 20,2025 11:41 UTC+0800
  5. Describe the failed ops. The switchover failed because it used a pod from the other cluster (postgres-tgnsec) instead of a pod from postgres-wbvhtu:
k describe opsrequest postgres-wbvhtu-switchover-74tjr
Events:
  Type     Reason                    Age                From                    Message
  ----     ------                    ----               ----                    -------
  Normal   WaitForProgressing        95s                ops-request-controller  wait for the controller to process the OpsRequest: postgres-wbvhtu-switchover-74tjr in Cluster: postgres-wbvhtu
  Normal   ValidateOpsRequestPassed  95s (x2 over 95s)  ops-request-controller  OpsRequest: postgres-wbvhtu-switchover-74tjr is validated
  Normal   SwitchoverStarted         95s (x2 over 95s)  ops-request-controller  
  Warning  OpsRequestFailed          95s                ops-request-controller  the pod "postgres-tgnsec-postgresql-0" not belongs to the component "postgres-wbvhtu-postgresql"
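
The failing check in the last event can be reproduced by hand before submitting a switchover. The following is a minimal sketch, not KubeBlocks' own validation logic. It assumes pods are named `<component full name>-<ordinal>`, which matches the pod names in this report but may not hold for every cluster. The helper names `pod_belongs_to_component` and `list_cluster_pods` are hypothetical, and the label key `app.kubernetes.io/instance` used for the lookup is an assumption that should be confirmed with `kubectl get pods --show-labels`.

```shell
# Sketch only: assumes pods are named "<component full name>-<ordinal>",
# as the pod names in this report are. Not KubeBlocks' own validation.

# Succeeds (exit 0) if pod name $1 is "<component full name $2>-<ordinal>".
pod_belongs_to_component() {
  _pod="$1"
  _comp="$2"
  case "$_pod" in
    "$_comp"-*) _ordinal="${_pod#"$_comp"-}" ;;  # strip "<component>-" prefix
    *) return 1 ;;                               # wrong prefix: other component
  esac
  case "$_ordinal" in
    ''|*[!0-9]*) return 1 ;;                     # suffix must be digits only
    *) return 0 ;;
  esac
}

# Hypothetical helper: lists the pods of one cluster by label.
# The label key is an assumption; it is not taken from this report.
list_cluster_pods() {
  kubectl get pods -n "${2:-default}" -l "app.kubernetes.io/instance=$1" -o name
}

component="postgres-wbvhtu-postgresql"
for pod in postgres-wbvhtu-postgresql-1 postgres-tgnsec-postgresql-0; do
  if pod_belongs_to_component "$pod" "$component"; then
    echo "$pod: belongs to $component"
  else
    echo "$pod: does NOT belong to $component"
  fi
done
```

With the two pod names from this report, the candidate `postgres-wbvhtu-postgresql-1` passes and `postgres-tgnsec-postgresql-0` is rejected, which is the same mismatch the ops-request-controller reported.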

@tianyue86 tianyue86 added the kind/bug Something isn't working label Feb 20, 2025
@tianyue86 tianyue86 added this to the Release 1.0.0 milestone Feb 20, 2025
@wangyelei
Contributor

Fixed in apecloud/kbcli#575.
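
To re-check the reproduction with a kbcli build that contains the fix, the "check ops status" step can be scripted. This is a rough sketch that assumes the table layout shown in this report; other kbcli versions may print different columns. The function name `ops_status` is hypothetical.

```shell
# Sketch only: parses the table layout shown in this report.

# Reads `kbcli cluster list-ops` output on stdin and prints the STATUS of the
# OpsRequest named $1. The COMPONENT column can be empty (as in the Switchover
# row), so STATUS is located as the field just before PROGRESS (values such
# as 1/1, 3/3 or -/-) rather than by a fixed column index.
ops_status() {
  awk -v name="$1" '
    $1 == name {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^([0-9]+|-)\/([0-9]+|-)$/) { print $(i - 1); exit }
      }
    }'
}

# Sample rows copied from the report.
sample='NAME                                      NAMESPACE   TYPE                CLUSTER           COMPONENT    STATUS    PROGRESS   CREATED-TIME
postgres-wbvhtu-horizontalscaling-6d7zx   default     HorizontalScaling   postgres-wbvhtu   postgresql   Succeed   1/1        Feb 20,2025 11:31 UTC+0800
postgres-wbvhtu-verticalscaling-dchrr     default     VerticalScaling     postgres-wbvhtu   postgresql   Succeed   3/3        Feb 20,2025 11:36 UTC+0800
postgres-wbvhtu-switchover-74tjr          default     Switchover          postgres-wbvhtu                Failed    -/-        Feb 20,2025 11:41 UTC+0800'

printf '%s\n' "$sample" | ops_status postgres-wbvhtu-switchover-74tjr   # prints "Failed"
```

Against a live cluster, the same function can read the output of `kbcli cluster list-ops postgres-wbvhtu --status all --namespace default` directly from a pipe.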
