operator does not scale down leader and follower #1148

Open
vexsx opened this issue Dec 3, 2024 · 2 comments
Labels
bug Something isn't working

Comments


vexsx commented Dec 3, 2024

What version of redis operator are you using?

kubectl logs <_redis-operator_pod_name> -n <namespace>

redis-operator version: 0.18.1

Does this issue reproduce with the latest release?

no idea

What operating system and processor architecture are you using (kubectl version)?

kubectl version Output
$ kubectl version
Client Version: v1.30.6
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.6

What did you expect to see?

A reduced number of pods (the cluster scaled down).

What did you see instead?
nothing

Log of the operator:
{"level":"error","ts":"2024-12-03T05:28:54Z","logger":"controllers.RedisCluster","msg":"Failed to ping Redis server","error":"dial tcp :6379: connect: connection refused","stacktrace":"github.com/OT-CONTAINER-KIT/redis-operator/pkg/k8sutils.getRedisNodeID\n\t/workspace/pkg/k8sutils/cluster-scaling.go:116\ngithub.com/OT-CONTAINER-KIT/redis-operator/pkg/k8sutils.ReshardRedisCluster\n\t/workspace/pkg/k8sutils/cluster-scaling.go:58\ngithub.com/OT-CONTAINER-KIT/redis-operator/pkg/controllers/rediscluster.(*RedisClusterReconciler).Reconcile\n\t/workspace/pkg/controllers/rediscluster/rediscluster_controller.go:93\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2024-12-03T05:28:54Z","logger":"controllers.RedisCluster","msg":"Redis cluster is downscaled... Rebalancing the cluster","Request.Namespace":"ot-operators","Request.Name":"redis-cluster"}

vexsx added the bug label Dec 3, 2024

xiaozhuang-a commented Dec 11, 2024

I have encountered the same problem here, and it reproduces consistently.
Reproduction process (a rough kubectl sketch follows the list):

  1. Create a new cluster with clusterSize=4.
  2. Delete leader-3 to trigger a failover; at this point follower-3 becomes the master.
  3. Scale down to clusterSize=3.
  4. An exception then occurs: follower-3 cannot be deleted, the StatefulSet replica count stays at 4, and the operator triggers a rebalance every time it reconciles.
     The problem seems to be that the cluster failover did not execute successfully, and the follower is still acting as the master.
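A rough command-line sketch of the reproduction, assuming the CR is named redis-cluster in the ot-operators namespace (taken from the log above) and the operator's usual <name>-leader-N / <name>-follower-N pod naming:

  # Step 2: delete leader-3 so that its follower is promoted.
  kubectl delete pod redis-cluster-leader-3 -n ot-operators

  # Verify that follower-3 now reports itself as a master.
  kubectl exec -n ot-operators redis-cluster-follower-3 -- redis-cli role

  # Step 3: scale the CR down to 3 shards.
  kubectl patch rediscluster redis-cluster -n ot-operators --type merge -p '{"spec":{"clusterSize":3}}'

  # Step 4: the StatefulSets stay at 4 replicas and the operator keeps rebalancing.
  kubectl get statefulset -n ot-operators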


vexsx commented Dec 11, 2024

Yes, if you try to manually scale down, it crashes the cluster.
I think the solution is to block any active connections using Redis first and then try to scale down.
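Not sure this is the right fix for the operator, but as a manual workaround one could try pausing writes with Redis's CLIENT PAUSE on the masters before applying the scale-down. A hypothetical sketch, using the same example names as above:

  # Pause writes on each master for 30 seconds (the WRITE mode needs Redis >= 6.2).
  kubectl exec -n ot-operators redis-cluster-leader-0 -- redis-cli client pause 30000 write

  # Then apply the scale-down while clients are paused.
  kubectl patch rediscluster redis-cluster -n ot-operators --type merge -p '{"spec":{"clusterSize":3}}'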
