[Edge case] MCM rollout stuck if rollout + scale-up triggered together #816

Open
1 of 2 tasks
himanshu-kun opened this issue May 23, 2023 · 1 comment
Labels
area/robustness Robustness, reliability, resilience related kind/bug Bug lifecycle/stale Nobody worked on this for 6 months (will further age) priority/4 Priority (lower number equals higher priority)

Comments

@himanshu-kun
Contributor

himanshu-kun commented May 23, 2023

How to categorize this issue?

/area robustness
/kind bug
/priority 2

What happened:

There is an edge case where, if

  • there is more than one active machineSet (a machineSet with .spec.replicas > 0), and
  • the machineDeployment is updated so that it refers to a new machineClass, and
  • machineDeployment.spec.replicas is increased (the issue does not occur on a decrease),

then MCM starts to panic.
Furthermore, if the panic does not happen, the rollout gets stuck: the scale-up logic runs before the rollout logic (where the new machineSet is created), so the new machineSet is never created.
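
To make the ordering problem concrete, here is a minimal Go sketch. It is not the actual MCM controller code: `MachineSet`, `MachineDeployment`, `reconcile`, and the `scaleFirst` flag are hypothetical, simplified stand-ins, and the scale-up branch is assumed to return early. It only illustrates why a single pass that takes the scale-up branch first ends without a machineSet for the new machineClass; it does not reproduce the panic.

```go
package main

import "fmt"

// MachineSet and MachineDeployment are hypothetical, simplified stand-ins
// for the real MCM types; only the fields needed for the sketch are modeled.
type MachineSet struct {
	Name     string
	Class    string
	Replicas int
}

type MachineDeployment struct {
	Class    string // machineClass the deployment currently refers to
	Replicas int    // desired total replicas
}

// totalReplicas sums the replicas of all given machineSets.
func totalReplicas(sets []MachineSet) int {
	total := 0
	for _, ms := range sets {
		total += ms.Replicas
	}
	return total
}

// hasSetForClass reports whether any machineSet uses the given machineClass.
func hasSetForClass(sets []MachineSet, class string) bool {
	for _, ms := range sets {
		if ms.Class == class {
			return true
		}
	}
	return false
}

// reconcile models one pass over a copy of the machineSets.
// With scaleFirst set, a scale-up is handled first and the pass returns early
// (an assumption of this sketch), so the rollout step is never reached.
func reconcile(d MachineDeployment, sets []MachineSet, scaleFirst bool) []MachineSet {
	out := append([]MachineSet(nil), sets...)
	if scaleFirst && len(out) > 0 && d.Replicas > totalReplicas(out) {
		// Scale-up branch: grow an existing machineSet and stop.
		out[len(out)-1].Replicas += d.Replicas - totalReplicas(out)
		return out
	}
	// Rollout branch: create a machineSet for the new machineClass if missing.
	if !hasSetForClass(out, d.Class) {
		out = append(out, MachineSet{Name: "ms-new", Class: d.Class})
	}
	return out
}

func main() {
	d := MachineDeployment{Class: "class-b", Replicas: 5}
	sets := []MachineSet{
		{Name: "ms1", Class: "class-a", Replicas: 2},
		{Name: "ms2", Class: "class-a", Replicas: 1},
	}
	// Scale-up handled first: no machineSet for class-b is created.
	fmt.Println(hasSetForClass(reconcile(d, sets, true), d.Class))
	// Rollout handled first: a machineSet for class-b is created.
	fmt.Println(hasSetForClass(reconcile(d, sets, false), d.Class))
}
```

In this sketch, running the rollout branch before the scale-up branch is what lets the new machineSet appear; whether that is the right fix for the real controller is a separate question.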

Need to solve this in two steps:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

This is a result of the changes introduced in #765.

Environment:
mcm v0.49.0

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • Others:
@gardener-robot gardener-robot added area/robustness Robustness, reliability, resilience related priority/2 Priority (lower number equals higher priority) labels May 23, 2023
@himanshu-kun himanshu-kun self-assigned this May 23, 2023
@himanshu-kun
Contributor Author

t=0 ms1
t=1 ms1 , ms2
t=2 ms1 , ms2 , ms3 (+scale-up)

@himanshu-kun himanshu-kun added priority/4 Priority (lower number equals higher priority) and removed priority/2 Priority (lower number equals higher priority) labels Jun 2, 2023
@himanshu-kun himanshu-kun removed their assignment Nov 2, 2023
@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Jul 11, 2024