Clean up existing ClusterRegistrations on Fleet Upgrade #1690
Comments
/backport release/v0.7 fleet-v0.7.1-v2.7.6 |
@manno, Not creating |
/backport fleet-v0.7.1-v2.7.6 release/v0.7 |
Issues #1651 and #1690 are fixes for cleaning up resources during and after a cluster upgrade. Issue #1651:
Issue #1690:
I followed the steps below to validate both issues: first that cleanup runs while the upgrade is in progress, and then that the cluster registrations and their associated resources have been removed. The following steps were performed to reproduce the issue.
Observations
After observing this situation over several days, I upgraded to the latest Rancher RC and Fleet RC versions, in which the fix is available.
P.S. In the above testing, P0 and regression tests were performed on the cluster after the upgrade. |
Can this be closed as fixed now, @sbulage ? 🤔 |
This is an extension to #1651.
Should also fix #1674
It needs a backport to 0.7.x.
Implemented by:
Fleet 0.7.0 creates multiple clusterregistration resources and does not clean them up. This adds a Helm hook that runs a clean up script when upgrading Fleet.
We assume agents are only using the latest clusterregistration and clean up the others. The script does not check if a registration was granted. It does try to delete the child resources, too. If the fleet-controller is running, its clean up handler would also delete the orphaned resources. The script works over all namespaces.
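The selection rule described above (keep the latest clusterregistration, clean up the others) can be sketched roughly as follows. This is not the actual migration script, only an illustration of the idea. The `Registration` type, the `client_id` grouping key, and the `select_stale` helper are hypothetical names chosen for this example; the real script may identify "the same agent" differently. Like the script, the sketch does not check whether a registration was granted.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Registration:
    """Hypothetical, simplified stand-in for a clusterregistration resource."""
    namespace: str
    name: str
    client_id: str      # assumed key that ties a registration to one agent
    created: datetime   # stand-in for metadata.creationTimestamp


def select_stale(registrations):
    """Return every registration except the newest one per (namespace, client_id)."""
    groups = defaultdict(list)
    for reg in registrations:
        groups[(reg.namespace, reg.client_id)].append(reg)

    stale = []
    for regs in groups.values():
        # Oldest first, so the last element is the one assumed to be in use.
        regs.sort(key=lambda r: r.created)
        stale.extend(regs[:-1])
    return stale
```

The returned list is what a clean up pass would delete, together with any child resources. Grouping across all namespaces in one pass mirrors the note that the script works over all namespaces.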
The migration job can be disabled via helm values.
Testing
Engineering Testing
Manual Testing
Upgraded Fleet standalone multiple times and watched the job spawn. Checked with helm template whether the new value works.
QA Testing Considerations
The clean up script might use a lot of resources and run for a long time if cleaning up lots of (20k+) resources.
It should be fine for smaller fleets (<20 clusters).
Regressions Considerations
Some fleets might have too many resources for an automatic clean up to be effective?