Check orphan PVC before updating statefulSet #526
base: master
Codecov ReportAttention:
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #526      +/-   ##
==========================================
- Coverage   85.91%   85.41%   -0.51%
==========================================
  Files          12       12
  Lines        1633     1659      +26
==========================================
+ Hits         1403     1417      +14
- Misses        145      156      +11
- Partials       85       86       +1 |
@hoyhbx Could you please increase the code coverage? Also, the DCO check is failing. |
I will try to write a system test to cover these two branches |
@anishakj, I wrote an e2e test to reproduce the issue. I want to confirm my understanding of the usage of
I want to implement a fix to make the
But this doesn't work if the |
Please check the latest commit for the tentative fix. I moved the
The added e2e test reproduces the issue without the patch and passes with it. |
Signed-off-by: hoyhbx <[email protected]>
- Move Status.Replicas and Status.ReadyReplicas update to reconcileClusterStatus
- Use StatefulSet Replicas when checking orphan PVC
- Check need for PVC cleanup before updating STS
Signed-off-by: hoyhbx <[email protected]>
pvcList, err := r.getPVCList(instance)
if err != nil {
    return err
}
for _, pvcItem := range pvcList.Items {
    // delete only Orphan PVCs
    if utils.IsPVCOrphan(pvcItem.Name, instance.Spec.Replicas) {
@hoyhbx Why are we deleting PVCs based on the StatefulSet replicas? The operator is looking at the ZooKeeper cluster resource. Are you seeing any issue if we delete based on instance.Spec.Replicas?
This handles a race condition in which the operator has not yet deleted the orphan PVCs after a scale-down, but the user scales the cluster back up. In that case, if instance.Spec.Replicas is used to decide which PVCs to delete, the old PVCs will never get deleted.
The added e2e test shows this: when scaling down from 3 to 1, two pods are deleted and the StatefulSet's replica count drops to 1. It then takes some time for the operator to delete the orphan PVCs. If the user scales up from 1 to 3 before the operator has deleted them, instance.Spec.Replicas becomes 3, and the old PVCs will never be deleted.
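To make the race concrete, here is a minimal, self-contained sketch. It is not the operator's code: `isPVCOrphan` is a hypothetical stand-in for `utils.IsPVCOrphan`, assuming a PVC name ends in `-<ordinal>` and that a PVC is an orphan when its ordinal is at or above the replica count. The PVC names are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isPVCOrphan is a hypothetical stand-in for utils.IsPVCOrphan.
// Assumption: the PVC name ends in "-<ordinal>", and any ordinal
// greater than or equal to replicas belongs to a pod that no longer exists.
func isPVCOrphan(pvcName string, replicas int32) bool {
	idx := strings.LastIndex(pvcName, "-")
	if idx == -1 {
		return false
	}
	ordinal, err := strconv.Atoi(pvcName[idx+1:])
	if err != nil {
		return false
	}
	return int32(ordinal) >= replicas
}

// orphans returns the PVC names considered orphaned for the given replica count.
func orphans(pvcNames []string, replicas int32) []string {
	var out []string
	for _, name := range pvcNames {
		if isPVCOrphan(name, replicas) {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	// State after scaling 3 -> 1 and back to 3 before the cleanup ran:
	// the StatefulSet still has 1 replica, the spec already says 3.
	pvcs := []string{"data-zk-0", "data-zk-1", "data-zk-2"}
	specReplicas := int32(3)
	stsReplicas := int32(1)

	fmt.Println("by spec replicas:", orphans(pvcs, specReplicas))
	fmt.Println("by StatefulSet replicas:", orphans(pvcs, stsReplicas))
}
```

With the spec value, no PVC is flagged, so the stale volumes would be reattached to the new pods. With the StatefulSet value, `data-zk-1` and `data-zk-2` are still flagged and can be removed before the StatefulSet is scaled up.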
Hi @anishakj, does the above explanation make sense to you?
The problem is basically a race between the operator deleting the PVCs and the user scaling up.
@hoyhbx Could you please increase the code coverage for this? |
Signed-off-by: hoyhbx <[email protected]>
Signed-off-by: hoyhbx <[email protected]>
@anishakj I added unit tests for the PVC deletion logic and fixed the e2e test. |
Hi @anishakj , have you gotten a chance to look at the changes? We are happy to make improvements if you have some suggestions. |
will perform some more tests from my side and let you know |
Thanks @anishakj ! Just let us know if there is anything we could improve. We are also more than happy to fix the issue in the zookeeperStart.sh mentioned here: #513 (comment) |
Hi @anishakj, did you encounter any problems when testing this PR? Is there anything we can help with? |
Change log description
Purpose of the change
Fixes #513
What the code does
If the ZooKeeper cluster is using persistent storage and the reclaimPolicy is set to Delete, this change makes sure that the number of PVCs is not larger than the StatefulSet replicas before updating the StatefulSet.
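The ordering described above can be sketched roughly as follows. This is an illustrative assumption, not the code in this PR: `nextStep`, the `step` type, and its constants are hypothetical names, and the reclaim policy is modeled as a plain string.

```go
package main

import "fmt"

// step names the next action a reconcile pass would take in this sketch.
type step string

const (
	stepCleanupPVCs step = "cleanup-pvcs"
	stepUpdateSTS   step = "update-statefulset"
)

// nextStep is a hypothetical helper: when persistent storage is used with
// reclaimPolicy "Delete" and more PVCs exist than the StatefulSet currently
// has replicas, orphan PVCs must be cleaned up before the StatefulSet is updated.
func nextStep(persistent bool, reclaimPolicy string, pvcCount int, stsReplicas int32) step {
	if persistent && reclaimPolicy == "Delete" && pvcCount > int(stsReplicas) {
		return stepCleanupPVCs
	}
	return stepUpdateSTS
}

func main() {
	// Three PVCs left over, StatefulSet already scaled down to one replica.
	fmt.Println(nextStep(true, "Delete", 3, 1))
	// PVC count matches the StatefulSet, so the update can proceed.
	fmt.Println(nextStep(true, "Delete", 1, 1))
}
```

The point of the gate is only the ordering: the StatefulSet is not scaled up while stale PVCs that would be reattached to new pods still exist.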
How to verify it
The bug #513 cannot be reproduced after this commit. An e2e test is added to prevent regression.