diff --git a/docs/proposals/000-template.md b/docs/proposals/000-template.md
new file mode 100644
index 00000000..07bf0969
--- /dev/null
+++ b/docs/proposals/000-template.md
@@ -0,0 +1,137 @@
+
+
+# Proposal information
+
+
+- **Index**: 000
+
+
+- **Status**:
+
+
+- **Name**: Feature name
+
+
+- **Owner**: FirstName LastName /
+
+# Proposal Details
+
+## Summary
+
+
+## Rationale
+
+
+## User facing changes
+
+
+none
+
+## Alternative solutions
+
+
+none
+
+## Out of scope
+
+
+none
+
+# Implementation Details
+
+## API Changes
+
+none
+
+## Bootstrap Provider Changes
+
+none
+
+## ControlPlane Provider Changes
+
+none
+
+## Configuration Changes
+
+none
+
+## Documentation Changes
+
+none
+
+## Testing
+
+
+## Considerations for backwards compatibility
+
+
+## Implementation notes and guidelines
+
diff --git a/docs/proposals/001-in-place-upgrades.md b/docs/proposals/001-in-place-upgrades.md
new file mode 100644
index 00000000..f0391e9c
--- /dev/null
+++ b/docs/proposals/001-in-place-upgrades.md
@@ -0,0 +1,303 @@
+# Proposal information
+
+
+- **Index**: 001
+
+
+- **Status**: **ACCEPTED**
+
+
+- **Name**: ClusterAPI In-Place Upgrades
+
+
+- **Owner**: Berkay Tekin Oz [@berkayoz](https://github.com/berkayoz)
+
+# Proposal Details
+
+## Summary
+
+
+Canonical Kubernetes CAPI providers should reconcile workload clusters and perform in-place upgrades based on the metadata in the cluster manifest.
+
+This can be used in environments where rolling upgrades are not a viable option, such as edge deployments and non-HA clusters.
+
+## Rationale
+
+
+The current Cluster API implementation does not provide a way of updating machines in-place and instead follows a rolling upgrade strategy.
+
+This means that a version upgrade would trigger a rolling upgrade, which is the process of creating new machines with the desired configuration and removing the older ones. This strategy is acceptable in most cases for clusters that are provisioned on public or private clouds, where having extra resources is not a concern.
+
+However, this strategy is not viable for smaller bare-metal or edge deployments where resources are limited. This makes Cluster API unsuitable out of the box for most of the use cases in industries like telco.
+
+We can enable the use of Cluster API in these use cases by updating our providers to perform in-place upgrades.
+
+
+## User facing changes
+
+
+Users will be able to perform in-place upgrades on a per-machine basis by running:
+```sh
+kubectl annotate machine <machine-name> v1beta2.k8sd.io/in-place-upgrade-to={upgrade-option}
+```
+
+`{upgrade-option}` can be one of:
+* `channel=<channel>` which would refresh the machine to the provided channel, e.g. `channel=1.31-classic/stable`
+* `revision=<revision>` which would refresh the machine to the provided revision, e.g. `revision=640`
+* `localPath=<path>` which would refresh the machine to the provided local `*.snap` file, e.g. `localPath=/path/to/k8s.snap`
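+
+For illustration, the sketch below shows how the value of the `v1beta2.k8sd.io/in-place-upgrade-to` annotation could be parsed into one of the options above. It is a minimal sketch and not part of the proposed API surface; the `UpgradeOption` type and `parseUpgradeOption` helper are hypothetical names.
+
+```go
+import (
+	"fmt"
+	"strings"
+)
+
+// UpgradeOption is a hypothetical representation of a parsed {upgrade-option} value.
+type UpgradeOption struct {
+	Kind  string // "channel", "revision" or "localPath"
+	Value string // e.g. "1.31-classic/stable", "640" or "/path/to/k8s.snap"
+}
+
+// parseUpgradeOption splits an annotation value such as "channel=1.31-classic/stable".
+func parseUpgradeOption(annotation string) (UpgradeOption, error) {
+	kind, value, found := strings.Cut(annotation, "=")
+	if !found || value == "" {
+		return UpgradeOption{}, fmt.Errorf("malformed upgrade option %q", annotation)
+	}
+	switch kind {
+	case "channel", "revision", "localPath":
+		return UpgradeOption{Kind: kind, Value: value}, nil
+	default:
+		return UpgradeOption{}, fmt.Errorf("unknown upgrade option %q", kind)
+	}
+}
+```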
+
+## Alternative solutions
+
+
+We could alternatively use the `version` fields defined in the `CK8sControlPlane` and `MachineDeployment` manifests instead of annotations, which could provide a better, more native user experience.
+
+However, at the time of writing, CAPI does not support changing upgrade strategies, which means that changes to the `version` fields trigger a rolling update.
+
+This behaviour can be adjusted on `ControlPlane` objects, since our provider has full control over them, but cannot easily be adjusted on `MachineDeployment` objects, which causes issues for worker nodes.
+
+Switching to using the `version` field should take place when upstream implements support for different upgrade strategies.
+
+## Out of scope
+
+
+### Cluster-wide Orchestration
+A cluster controller called `ClusterReconciler` would be added to perform the one-by-one in-place upgrade of the entire workload cluster.
+
+Users would perform in-place upgrades on the entire cluster by running:
+```sh
+kubectl annotate cluster <cluster-name> v1beta2.k8sd.io/in-place-upgrade-to={upgrade-option}
+```
+This would upgrade the machines belonging to `<cluster-name>` one by one.
+
+The controller would propagate the `v1beta2.k8sd.io/in-place-upgrade-to` annotation on the `Cluster` object by adding it, one by one, to all the machines that are owned by this cluster.
+
+The reconciler would perform upgrades in two separate phases, for control-plane and worker machines.
+
+A Kubernetes API call listing the objects of type `Machine` and filtering with `ownerRef` would produce the list of machines owned by the cluster. For each phase, the controller would iterate over this list, filtering by machine type, annotating the machines and waiting for the operation to complete on each iteration.
+
+The reconciler should not trigger the upgrade endpoint if `v1beta2.k8sd.io/in-place-upgrade-status` is already set to `in-progress` on the machine.
+
+Once upgrades of the underlying machines are finished:
+* `v1beta2.k8sd.io/in-place-upgrade-to` annotation on the `Cluster` would be removed
+* `v1beta2.k8sd.io/in-place-upgrade-release` annotation on the `Cluster` would be added/updated with the used `{upgrade-option}`.
+
+This process can be adapted to use `CK8sControlPlane` and `MachineDeployment` objects instead, to be able to upgrade control-plane and worker nodes separately. This will be introduced and explained more extensively in another proposal.
+
+### Upgrades of Underlying OS and Dependencies
+In-place upgrades only address upgrades of Canonical Kubernetes and its respective dependencies. Changes to the OS or the machine image would not be handled, since the underlying machine image stays the same; these would be handled by a rolling upgrade as usual.
+
+# Implementation Details
+
+## API Changes
+
+### `POST /snap/refresh`
+
+```go
+type SnapRefreshRequest struct {
+	// Channel is the channel to refresh the snap to.
+	Channel string `json:"channel"`
+	// Revision is the revision number to refresh the snap to.
+	Revision string `json:"revision"`
+	// LocalPath is the local path to use to refresh the snap.
+	LocalPath string `json:"localPath"`
+}
+
+// SnapRefreshResponse is the response message for the SnapRefresh RPC.
+type SnapRefreshResponse struct {
+	// The change id belonging to a snap refresh/install operation.
+	ChangeID string `json:"changeId"`
+}
+```
+
+`POST /snap/refresh` performs the in-place upgrade with the given options and returns the change id of the snap operation.
+
+The upgrade can be done with either a `Channel`, a `Revision` or a local `*.snap` file provided via `LocalPath`. The value of `LocalPath` should be an absolute path.
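+
+As an illustration of how the providers could consume this endpoint, the sketch below marshals a `SnapRefreshRequest` and reads back the change id. It is a minimal sketch, not the actual client code: the `baseURL` wiring, the use of a plain `net/http` client and the `node-token` header handling (see "Node Token Authentication" below) are assumptions.
+
+```go
+import (
+	"bytes"
+	"context"
+	"encoding/json"
+	"fmt"
+	"net/http"
+)
+
+// triggerRefresh is a hypothetical helper that calls POST /snap/refresh on a node
+// and returns the change id of the started snap operation.
+func triggerRefresh(ctx context.Context, client *http.Client, baseURL, nodeToken string, req SnapRefreshRequest) (string, error) {
+	body, err := json.Marshal(req)
+	if err != nil {
+		return "", fmt.Errorf("failed to marshal refresh request: %w", err)
+	}
+
+	httpReq, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/snap/refresh", bytes.NewReader(body))
+	if err != nil {
+		return "", fmt.Errorf("failed to build refresh request: %w", err)
+	}
+	// Authenticate with the per-node token described under "Node Token Authentication".
+	httpReq.Header.Set("node-token", nodeToken)
+
+	resp, err := client.Do(httpReq)
+	if err != nil {
+		return "", fmt.Errorf("failed to call /snap/refresh: %w", err)
+	}
+	defer resp.Body.Close()
+
+	var out SnapRefreshResponse
+	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
+		return "", fmt.Errorf("failed to decode refresh response: %w", err)
+	}
+	return out.ChangeID, nil
+}
+```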
+
+### `POST /snap/refresh-status`
+
+```go
+// SnapRefreshStatusRequest is the request message for the SnapRefreshStatus RPC.
+type SnapRefreshStatusRequest struct {
+	// The change id belonging to a snap refresh/install operation.
+	ChangeID string `json:"changeId"`
+}
+
+// SnapRefreshStatusResponse is the response message for the SnapRefreshStatus RPC.
+type SnapRefreshStatusResponse struct {
+	// Status is the status of the snap refresh/install operation.
+	Status string `json:"status"`
+	// Completed is a boolean indicating if the snap refresh/install operation has completed.
+	// The status should be considered final when this is true.
+	Completed bool `json:"completed"`
+	// ErrorMessage is the error message if the snap refresh/install operation failed.
+	ErrorMessage string `json:"errorMessage"`
+}
+```
+`POST /snap/refresh-status` returns the status of the refresh operation for the given change id.
+
+The operation is considered fully complete once `Completed=true`.
+
+The `Status` field contains the status of the operation, with `Done` and `Error` being the statuses of interest.
+
+The `ErrorMessage` field is populated if the operation could not be completed successfully.
+
+
+### Node Token Authentication
+
+A per-node token will be generated at bootstrap time and seeded into the node under the `/capi/etc/node-token` file. On bootstrap, the token under `/capi/etc/node-token` will be copied over to `/var/snap/k8s/common/node-token` with the help of the `k8s x-capi set-node-token <token>` command. The generated token will be stored on the management cluster in the `$clustername-token` secret, with keys formatted as `refresh-token::$nodename`.
+
+The endpoints will use `ValidateNodeTokenAccessHandler("node-token")` to check that the `node-token` header matches the token in the `/var/snap/k8s/common/node-token` file.
+
+
+## Bootstrap Provider Changes
+
+
+A machine controller called `MachineReconciler` would be added, which performs the in-place upgrade if the `v1beta2.k8sd.io/in-place-upgrade-to` annotation is set on the machine.
+
+The controller would use the value of this annotation to call the `/snap/refresh` endpoint through `k8sd-proxy`. The controller would then periodically query `/snap/refresh-status` with the change id of the operation until the operation is fully completed (`Completed=true`).
+
+A failed request to the `/snap/refresh` endpoint would requeue the requested upgrade without setting any annotations.
+
+The result of the refresh operation will be communicated back to the user via the `v1beta2.k8sd.io/in-place-upgrade-status` annotation, with the values being:
+
+* `in-progress` for an upgrade currently in progress
+* `done` for a successful upgrade
+* `failed` for a failed upgrade
+
+After an upgrade process begins:
+* `v1beta2.k8sd.io/in-place-upgrade-status` annotation on the `Machine` would be added/updated with `in-progress`
+* `v1beta2.k8sd.io/in-place-upgrade-change-id` annotation on the `Machine` would be updated with the change id returned from the refresh response.
+* An `InPlaceUpgradeInProgress` event is added to the `Machine` with the `Performing in place upgrade with {upgrade-option}` message.
+
+After a successful upgrade:
+* `v1beta2.k8sd.io/in-place-upgrade-to` annotation on the `Machine` would be removed
+* `v1beta2.k8sd.io/in-place-upgrade-change-id` annotation on the `Machine` would be removed
+* `v1beta2.k8sd.io/in-place-upgrade-release` annotation on the `Machine` would be added/updated with the used `{upgrade-option}`.
+* `v1beta2.k8sd.io/in-place-upgrade-status` annotation on the `Machine` would be added/updated with `done`
+* An `InPlaceUpgradeDone` event is added to the `Machine` with the `Successfully performed in place upgrade with {upgrade-option}` message.
+
+After a failed upgrade:
+* `v1beta2.k8sd.io/in-place-upgrade-status` annotation on the `Machine` would be added/updated with `failed`
+* `v1beta2.k8sd.io/in-place-upgrade-change-id` annotation on the `Machine` would be removed
+* An `InPlaceUpgradeFailed` event is added to the `Machine` with the `Failed to perform in place upgrade with option {upgrade-option}: {error}` message.
+
+A custom condition with type `InPlaceUpgradeStatus` can also be added to relay this information.
+
+The reconciler should not trigger the upgrade endpoint if `v1beta2.k8sd.io/in-place-upgrade-status` is already set to `in-progress` on the machine.
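+
+A minimal sketch of the polling behaviour described above is shown below. It is illustrative only: `getRefreshStatus` is a hypothetical helper that calls `/snap/refresh-status` through `k8sd-proxy` and returns a `SnapRefreshStatusResponse`, the polling interval is an assumption, and the actual controller would requeue the reconcile rather than block in a loop.
+
+```go
+import (
+	"context"
+	"fmt"
+	"time"
+)
+
+// waitForRefresh polls the refresh status for the given change id until the
+// operation completes, returning the value to set in the
+// v1beta2.k8sd.io/in-place-upgrade-status annotation.
+func waitForRefresh(ctx context.Context, changeID string) (string, error) {
+	for {
+		status, err := getRefreshStatus(ctx, changeID) // hypothetical call to /snap/refresh-status
+		if err != nil {
+			return "", fmt.Errorf("failed to query refresh status: %w", err)
+		}
+		if status.Completed {
+			if status.Status == "Done" {
+				return "done", nil
+			}
+			return "failed", fmt.Errorf("snap refresh failed: %s", status.ErrorMessage)
+		}
+		select {
+		case <-ctx.Done():
+			return "", ctx.Err()
+		case <-time.After(10 * time.Second):
+		}
+	}
+}
+```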
+
+#### Changes for Rolling Upgrades, Scaling Up and Creating New Machines
+In the case of a rolling upgrade, or when creating new machines, the `CK8sConfigReconciler` should check for the `v1beta2.k8sd.io/in-place-upgrade-release` annotation on the `Machine` object.
+
+The value of this annotation should be used instead of the `version` field when generating the cloud-init script for a machine.
+
+Using an annotation value requires changing the `install.sh` file to perform the relevant snap operation based on the option:
+* `snap install k8s --classic --channel <channel>` for `Channel`
+* `snap install k8s --classic --revision <revision>` for `Revision`
+* `snap install <path-to-snap> --classic --dangerous --name k8s` for `LocalPath`
+
+When a rolling upgrade is triggered, the `LocalPath` option requires the newly created machine to contain the local `*.snap` file. This usually means the machine image used by the infrastructure provider should be updated to contain this file. The file could possibly be sideloaded in the cloud-init script before installation.
+
+This operation should not be performed if the `install.sh` script is overridden by the user in the manifests.
+
+This would prevent adding nodes with an outdated version and possibly breaking the cluster due to a version mismatch.
+
+## ControlPlane Provider Changes
+
+none
+
+## Configuration Changes
+
+none
+
+## Documentation Changes
+
+A `How-To` page on performing in-place upgrades should be created.
+
+A `Reference` page listing the annotations and their possible values should be created/updated.
+
+## Testing
+
+The new feature can be tested manually by applying an annotation on the machine/node and checking that the value of the `v1beta2.k8sd.io/in-place-upgrade-status` annotation becomes `done`. A timeout should be set for waiting on the upgrade process.
+
+The tests can be integrated into the CI in the same way as the existing tests, using the CAPD infrastructure provider.
+
+The upgrade should be performed with the `localPath` option. Under Pebble, the process would replace the `kubernetes` binary with the binary provided in the annotation value.
+
+This means a Docker image containing the two versions should be created. The different/new version of the `kubernetes` binary would also be built and placed at a known path inside the image.
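+
+As a sketch of the wait-with-timeout logic the tests would need, the helper below polls the machine's status annotation until the upgrade finishes or the timeout expires. It is illustrative only: `getMachineAnnotations` is a hypothetical helper that reads the `Machine` annotations from the management cluster, and the polling interval is an assumption.
+
+```go
+import (
+	"context"
+	"fmt"
+	"time"
+)
+
+// waitForUpgradeDone waits until the in-place upgrade of the given machine
+// reports done, reports failed, or the timeout expires.
+func waitForUpgradeDone(ctx context.Context, machineName string, timeout time.Duration) error {
+	deadline := time.Now().Add(timeout)
+	for time.Now().Before(deadline) {
+		annotations, err := getMachineAnnotations(ctx, machineName) // hypothetical helper
+		if err != nil {
+			return fmt.Errorf("failed to get annotations for machine %s: %w", machineName, err)
+		}
+		switch annotations["v1beta2.k8sd.io/in-place-upgrade-status"] {
+		case "done":
+			return nil
+		case "failed":
+			return fmt.Errorf("in-place upgrade of machine %s failed", machineName)
+		}
+		time.Sleep(10 * time.Second)
+	}
+	return fmt.Errorf("timed out waiting for in-place upgrade of machine %s", machineName)
+}
+```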
+
+## Considerations for backwards compatibility
+
+
+## Implementation notes and guidelines
+
+
+The annotation method is chosen due to the "immutable infrastructure" assumption CAPI currently has, which means updates are always done by creating new machines and fields are immutable. This might also pose some challenges in displaying accurate Kubernetes version information through CAPI.
+
+We should be aware of the [metadata propagation](https://cluster-api.sigs.k8s.io/developer/architecture/controllers/metadata-propagation) performed by the upstream controllers. Some metadata is propagated in-place, which can ultimately propagate all the way down to the `Machine` objects. This could potentially flood the cluster with upgrades if many machines get annotated at the same time, which is why the cluster-wide upgrade is handled through the annotation on the actual `Cluster` object.
+
+Updating the `version` field would trigger rolling updates by default, with the only difference from upstream being the precedence of the version value provided in the annotations.
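+
+To make the precedence concrete, a sketch of how the bootstrap provider might pick the install option when rendering cloud-init is shown below. It is illustrative only: `resolveInstallOption` is a hypothetical name, and the mapping of the `version` field to a snap channel is an assumption rather than the provider's actual behaviour.
+
+```go
+// resolveInstallOption prefers the in-place upgrade release annotation over the
+// Machine's version field when deciding what install.sh should install.
+func resolveInstallOption(annotations map[string]string, specVersion string) string {
+	if release, ok := annotations["v1beta2.k8sd.io/in-place-upgrade-release"]; ok && release != "" {
+		// e.g. "channel=1.31-classic/stable", "revision=640" or "localPath=/path/to/k8s.snap"
+		return release
+	}
+	// Otherwise fall back to the version field; rendering it as a channel is an assumption.
+	return "channel=" + specVersion
+}
+```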