Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
docs: Eviction & Placement Disruption budget (#1007)
1 parent a7f9b82, commit 20de7b7
Showing 6 changed files with 266 additions and 5 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
# Eviction & Placement Disruption Budget | ||
|
||
This document explains the concepts of `Eviction` and `Placement Disruption Budget` in the context of the fleet. | ||
|
||
## Overview | ||
|
||
`Eviction` provides a way to forcibly remove resources from a target cluster after a `Placement` object has propagated them from the hub cluster. | ||
`Eviction` is considered a voluntary disruption triggered by the user. On its own, `Eviction` doesn't guarantee that the scheduler won't propagate resources to the target cluster again. | ||
Users need to use [taints](../../howtos/taint-toleration.md) in conjunction with `Eviction` to prevent the scheduler from picking the target cluster again. | ||
|
||
The `Placement Disruption Budget` object protects against voluntary disruptions. | ||
|
||
The only voluntary disruption that can occur in the fleet is the eviction of resources from a target cluster, which is triggered by creating a `ClusterResourcePlacementEviction` object. | ||
|
||
Examples of involuntary disruptions in the context of the fleet: | ||
- The removal of resources from a member cluster by the scheduler due to scheduling policy changes. | ||
- Users manually deleting workload resources running on a member cluster. | ||
- Users manually deleting the `ClusterResourceBinding` object, an internal resource that represents the placement of resources on a member cluster. | ||
- Workloads failing to run properly on a member cluster due to misconfiguration or cluster-related issues. | ||
|
||
The `Placement Disruption Budget` object does not protect against any of the involuntary disruptions described above. | ||
|
||
## ClusterResourcePlacementEviction | ||
|
||
An eviction object is used to remove resources from a member cluster once the resources have already been propagated from the hub cluster. | ||
|
||
The eviction object is reconciled only once, after which it reaches a terminal state. The terminal states for `ClusterResourcePlacementEviction` are: | ||
- `ClusterResourcePlacementEviction` is valid and was executed successfully. | ||
- `ClusterResourcePlacementEviction` is invalid. | ||
- `ClusterResourcePlacementEviction` is valid but was not executed. | ||
|
||
To successfully evict resources from a cluster, the user needs to specify: | ||
|
||
- The name of the `ClusterResourcePlacement` object which propagated resources to the target cluster. | ||
- The name of the target cluster from which we need to evict resources. | ||
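These two fields map directly onto the eviction's spec. A minimal sketch, reusing the `placementName` and `clusterName` fields and the `v1beta1` API version from the sample manifest in this commit; the object names are placeholders:

```yaml
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacementEviction
metadata:
  name: test-eviction          # placeholder name for the eviction object
spec:
  placementName: test-crp      # the ClusterResourcePlacement that propagated the resources
  clusterName: kind-cluster-1  # the member cluster to evict the resources from
```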
|
||
When specifying the `ClusterResourcePlacement` object in the eviction's spec, the user needs to consider the following cases: | ||
|
||
- For a `PickFixed` CRP, eviction is not allowed; instead, edit the list of target clusters on the CRP object directly. | ||
- For `PickAll` and `PickN` CRPs, eviction is allowed because users cannot deterministically pick or unpick a cluster under these placement strategies; the choice is up to the scheduler. | ||
|
||
> **Note:** After an eviction is executed, there is no guarantee that the scheduler won't pick the cluster again when propagating resources for a `ClusterResourcePlacement`. | ||
> To prevent this, the user needs to specify a [taint](../../howtos/taint-toleration.md) on the cluster. This is especially true for a `PickAll` `ClusterResourcePlacement` because | ||
> the scheduler will try to propagate resources to all the clusters in the fleet. | ||
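As a sketch of what such a taint looks like, the fragment below shows only the `taints` portion of a member cluster's spec, with other fields omitted. The key and value are illustrative, and whether a given placement still lands on the cluster also depends on any tolerations it declares (see the linked taint how-to).

```yaml
# Fragment of a member cluster spec; all other fields are omitted.
spec:
  taints:
    - key: test-key        # illustrative key
      value: test-value    # illustrative value
      effect: NoSchedule   # asks the scheduler not to pick this cluster for new placements
```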
## ClusterResourcePlacementDisruptionBudget | ||
|
||
The `ClusterResourcePlacementDisruptionBudget` is used to protect resources propagated by a `ClusterResourcePlacement` to a target cluster from voluntary disruption, i.e., `ClusterResourcePlacementEviction`. | ||
|
||
> **Note:** A `ClusterResourcePlacementDisruptionBudget` should have the same name as the `ClusterResourcePlacement` that it protects. | ||
Users can set only one of the following two fields in the `ClusterResourcePlacementDisruptionBudget` spec, since they are mutually exclusive: | ||
|
||
- `MaxUnavailable`: the maximum number of clusters in which a placement can be unavailable due to any form of disruption. | ||
- `MinAvailable`: the minimum number of clusters in which placements must remain available despite any form of disruption. | ||
|
||
For both `MaxUnavailable` and `MinAvailable`, the user can specify the number of clusters as an integer or as a percentage of the total number of clusters in the fleet. | ||
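To illustrate the percentage form, the sketch below protects a hypothetical `PickN` placement named `test-crp`. The quoted `"50%"` string follows the usual Kubernetes integer-or-string convention and is an assumption here; this document only states that percentages are accepted.

```yaml
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacementDisruptionBudget
metadata:
  name: test-crp          # same name as the ClusterResourcePlacement being protected
spec:
  # Set exactly one of maxUnavailable / minAvailable; they are mutually exclusive.
  maxUnavailable: "50%"   # assumed percentage syntax; an integer such as 1 is the other form
```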
|
||
> **Note:** For both MaxUnavailable and MinAvailable, involuntary disruptions are not subject to the disruption budget but will still count against it. | ||
When specifying a disruption budget for a particular `ClusterResourcePlacement`, the user needs to consider the following cases: | ||
|
||
| CRP type | `MinAvailable` DB with an integer | `MinAvailable` DB with a percentage | `MaxUnavailable` DB with an integer | `MaxUnavailable` DB with a percentage | | ||
|--------------|-----------------------------------|-------------------------------------|-------------------------------------|---------------------------------------| | ||
| `PickFixed` | ❌ | ❌ | ❌ | ❌ | | ||
| `PickAll` | ✅ | ❌ | ❌ | ❌ | | ||
| `PickN` | ✅ | ✅ | ✅ | ✅ | | ||
|
||
> **Note:** Eviction is not allowed for a `PickFixed` CRP, so specifying a `ClusterResourcePlacementDisruptionBudget` for a `PickFixed` CRP has no effect. | ||
> For a `PickAll` CRP, the user can only specify `MinAvailable` as an integer, because the total number of clusters selected by a `PickAll` CRP is non-deterministic. | ||
> If the user creates an invalid `ClusterResourcePlacementDisruptionBudget` object, any eviction subsequently created for that placement won't be executed. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,181 @@ | ||
# Using ClusterResourcePlacementEviction and ClusterResourcePlacementDisruptionBudget | ||
|
||
This how-to guide discusses how to create `ClusterResourcePlacementEviction` objects and `ClusterResourcePlacementDisruptionBudget` objects to evict resources from member clusters and protect resources on member clusters from voluntary disruption, respectively. | ||
|
||
## Evicting Resources from Member Clusters using ClusterResourcePlacementEviction | ||
|
||
The `ClusterResourcePlacementEviction` object is used to remove resources from a member cluster once the resources have already been propagated from the hub cluster. | ||
|
||
To successfully evict resources from a cluster, the user needs to specify: | ||
- The name of the `ClusterResourcePlacement` object which propagated resources to the target cluster. | ||
- The name of the target cluster from which we need to evict resources. | ||
|
||
In this example, we will create a `ClusterResourcePlacement` object with a `PickAll` placement policy to propagate resources to an existing `MemberCluster`, add a taint to the member cluster | ||
resource, and then create a `ClusterResourcePlacementEviction` object to evict resources from the `MemberCluster`. | ||
|
||
We will first create a namespace that we will propagate to the member cluster. | ||
|
||
``` | ||
kubectl create ns test-ns | ||
``` | ||
|
||
Then we will apply a `ClusterResourcePlacement` with the following spec: | ||
|
||
```yaml | ||
spec: | ||
  resourceSelectors: | ||
    - group: "" | ||
      kind: Namespace | ||
      version: v1 | ||
      name: test-ns | ||
  policy: | ||
    placementType: PickAll | ||
``` | ||
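The snippet above shows only the `spec`. A complete manifest might look like the sketch below, assuming the object is named `test-crp` (the name used by the `kubectl get crp test-crp` command in this guide) and uses the same `placement.kubernetes-fleet.io/v1beta1` API version as the other manifests here:

```yaml
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: test-crp             # assumed name, matching the kubectl commands in this guide
spec:
  resourceSelectors:
    - group: ""              # core API group
      kind: Namespace
      version: v1
      name: test-ns          # the namespace created in the previous step
  policy:
    placementType: PickAll   # let the scheduler pick every eligible member cluster
```

Saved to a file, it can be applied on the hub cluster with `kubectl apply -f <file>`.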
The `CRP` status after applying should look something like this: | ||
|
||
``` | ||
kubectl get crp test-crp | ||
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE | ||
test-crp 2 True 2 True 2 5m49s | ||
``` | ||
|
||
Let's now add a taint to the member cluster to ensure this cluster is not picked again by the scheduler once we evict resources from it. | ||
|
||
Modify the cluster object to add a taint: | ||
|
||
```yaml | ||
spec: | ||
  heartbeatPeriodSeconds: 60 | ||
  identity: | ||
    kind: ServiceAccount | ||
    name: fleet-member-agent-cluster-1 | ||
    namespace: fleet-system | ||
  taints: | ||
    - effect: NoSchedule | ||
      key: test-key | ||
      value: test-value | ||
``` | ||
|
||
Now we will create a `ClusterResourcePlacementEviction` object to evict resources from the member cluster: | ||
|
||
```yaml | ||
apiVersion: placement.kubernetes-fleet.io/v1beta1 | ||
kind: ClusterResourcePlacementEviction | ||
metadata: | ||
  name: test-eviction | ||
spec: | ||
  placementName: test-crp | ||
  clusterName: kind-cluster-1 | ||
``` | ||
|
||
If the eviction was successful, the eviction object should look like this: | ||
|
||
``` | ||
kubectl get crpe test-eviction | ||
NAME VALID EXECUTED | ||
test-eviction True True | ||
``` | ||
|
||
Since the eviction was successful, the resources should have been removed from the cluster. Let's take a look at the `CRP` object status to verify: | ||
|
||
``` | ||
kubectl get crp test-crp | ||
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE | ||
test-crp 2 True 2 15m | ||
``` | ||
|
||
From the output we can tell that the resources were evicted, since the `AVAILABLE` column is empty. For more information, check the `ClusterResourcePlacement` object's status. | ||
|
||
## Protecting resources from voluntary disruptions using ClusterResourcePlacementDisruptionBudget | ||
|
||
In this example, we will create a `ClusterResourcePlacement` object with a `PickN` placement policy to propagate resources to an existing `MemberCluster`, | ||
then create a `ClusterResourcePlacementDisruptionBudget` object to protect resources on the `MemberCluster` from voluntary disruption, and | ||
finally try to evict resources from the `MemberCluster` using a `ClusterResourcePlacementEviction`. | ||
|
||
We will first create a namespace that we will propagate to the member cluster. | ||
|
||
``` | ||
kubectl create ns test-ns | ||
``` | ||
Then we will apply a `ClusterResourcePlacement` with the following spec: | ||
```yaml | ||
spec: | ||
  resourceSelectors: | ||
    - group: "" | ||
      kind: Namespace | ||
      version: v1 | ||
      name: test-ns | ||
  policy: | ||
    placementType: PickN | ||
    numberOfClusters: 1 | ||
``` | ||
|
||
The `CRP` object after applying should look something like this: | ||
|
||
``` | ||
kubectl get crp test-crp | ||
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE | ||
test-crp 2 True 2 True 2 8s | ||
``` | ||
|
||
Now we will create a `ClusterResourcePlacementDisruptionBudget` object to protect resources on the member cluster from voluntary disruption: | ||
|
||
```yaml | ||
apiVersion: placement.kubernetes-fleet.io/v1beta1 | ||
kind: ClusterResourcePlacementDisruptionBudget | ||
metadata: | ||
  name: test-crp | ||
spec: | ||
  minAvailable: 1 | ||
``` | ||
> **Note:** An eviction object is reconciled only once, after which it reaches a terminal state. To run the same eviction again, the user needs to delete the existing eviction object and re-create it. | ||
Now we will create a `ClusterResourcePlacementEviction` object to evict resources from the member cluster: | ||
|
||
```yaml | ||
apiVersion: placement.kubernetes-fleet.io/v1beta1 | ||
kind: ClusterResourcePlacementEviction | ||
metadata: | ||
  name: test-eviction | ||
spec: | ||
  placementName: test-crp | ||
  clusterName: kind-cluster-1 | ||
``` | ||
|
||
> **Note:** When a `ClusterResourcePlacementEviction` object is reconciled, the eviction controller tries to get the corresponding `ClusterResourcePlacementDisruptionBudget` object to check whether the specified `MaxUnavailable` or `MinAvailable` allows the eviction to be executed. | ||
|
||
Let's take a look at the eviction object to see if the eviction was executed: | ||
|
||
``` | ||
kubectl get crpe test-eviction | ||
NAME VALID EXECUTED | ||
test-eviction True False | ||
``` | ||
|
||
From the eviction object we can see that the eviction was not executed. | ||
|
||
Let's take a look at the `ClusterResourcePlacementEviction` object status to see why the eviction was not executed: | ||
|
||
```yaml | ||
status: | ||
  conditions: | ||
  - lastTransitionTime: "2025-01-21T15:52:29Z" | ||
    message: Eviction is valid | ||
    observedGeneration: 1 | ||
    reason: ClusterResourcePlacementEvictionValid | ||
    status: "True" | ||
    type: Valid | ||
  - lastTransitionTime: "2025-01-21T15:52:29Z" | ||
    message: 'Eviction is blocked by specified ClusterResourcePlacementDisruptionBudget, | ||
      availablePlacements: 1, totalPlacements: 1' | ||
    observedGeneration: 1 | ||
    reason: ClusterResourcePlacementEvictionNotExecuted | ||
    status: "False" | ||
    type: Executed | ||
``` | ||
|
||
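The eviction status shows that the eviction was blocked by the specified `ClusterResourcePlacementDisruptionBudget`: with `minAvailable: 1` and only one available placement (`availablePlacements: 1, totalPlacements: 1`), evicting it would drop availability below the budget. |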
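For contrast, here is a sketch of a budget that would be expected to let the same eviction proceed, assuming the semantics described in the concept document: with `maxUnavailable: 1`, one placement may be unavailable, so evicting the single available placement stays within budget. This illustrates the field only and is not verified controller output.

```yaml
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacementDisruptionBudget
metadata:
  name: test-crp      # same name as the ClusterResourcePlacement being protected
spec:
  maxUnavailable: 1   # tolerate one unavailable placement; replaces minAvailable, never both
```

Because an eviction object is reconciled only once, the blocked `test-eviction` object would need to be deleted and re-created after changing the budget.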
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
-apiVersion: placement.kubernetes-fleet.io/v1alpha1 | ||
+apiVersion: placement.kubernetes-fleet.io/v1beta1 | ||
 kind: ClusterResourcePlacementDisruptionBudget | ||
 metadata: | ||
   name: test-crp | ||
 spec: | ||
-  maxUnavailable: 1 | ||
+  minAvailable: 1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
-apiVersion: placement.kubernetes-fleet.io/v1alpha1 | ||
+apiVersion: placement.kubernetes-fleet.io/v1beta1 | ||
 kind: ClusterResourcePlacementEviction | ||
 metadata: | ||
   name: test-eviction | ||
 spec: | ||
   placementName: test-crp | ||
-  clusterName: cluster-1 | ||
+  clusterName: kind-cluster-1 |