Multi-user additions for backup/restore
Wolfgang Kulhanek authored and Wolfgang Kulhanek committed Nov 12, 2024
1 parent e7c2577 commit 60e4340
Showing 1 changed file with 82 additions and 78 deletions.
160 changes: 82 additions & 78 deletions content/modules/ROOT/pages/module-03-backup-restore.adoc
@@ -13,16 +13,16 @@ First you will create a new VM in a new project to use for the exercise.
image::module-03-backup-restore/01.png[]

. Specify `kasten-lab-{user}` as the *_Name_* and click *_Create_*.
. Select *_Virtualization → Virtual Machines_* from the sidebar, and click *_Create VirtualMachine → From template_* in the `kasten-lab` Project.
. Select *_Virtualization → Virtual Machines_* from the sidebar, and click *_Create VirtualMachine → From template_* in the `kasten-lab-{user}` Project.
+
image::module-03-backup-restore/03.png[]
+
____
[!CAUTION]
====
[CAUTION]
Kasten can be used to protect VMs provisioned using an `InstanceType` but currently requires a manual transform be applied during the restore process, which is out of the scope of this lab exercise.
Full support for `InstanceTypes` will be added in an upcoming release.
____
====

. Under *_Template catalog_*, select the `fedora-server-small` template.
+
@@ -32,24 +32,24 @@ image::module-03-backup-restore/04.png[]
+
image::module-03-backup-restore/05.png[]
+
____
[!NOTE]
====
[NOTE]
This will provision the VM with preferred storage settings for the `ocs-external-storagecluster-ceph-rbd` StorageClass, specifically *_Block VolumeMode_* to provide the *_ReadWriteMany_* access required to enable live migration between OpenShift nodes.
____
====

. Validate the `fedora-k10-volume` PersistentVolumeClaim configuration from *_Storage → PersistentVolumeClaims_* in the sidebar.
+
image::module-03-backup-restore/06.png[]

== 3. Enabling Block Mode Exports

____
[!NOTE]
====
[NOTE]
As some storage provisioners may not fully support Block volume mode, StorageClasses should first be evaluated for compatibility https://docs.kasten.io/latest/operating/k10tools.html#k10-primer-block-mount-check[using the primer script].
This is skipped in the lab exercise as the `openshift-storage.rbd.csi.ceph.com` provisioner is known to be compatible.
____
====

. In the *_Web Terminal_*, run the following to allow the Kasten datamover to export raw Block volumes using the `ocs-external-storagecluster-ceph-rbd` StorageClass:
+
@@ -61,104 +61,108 @@ ____
+
By default, Kasten provides an out-of-the-box, direct API integration with ODF's Ceph Volume Manager for block data movement - including support for incremental backups to lower storage consumption requirements for protecting Virtual Machines.
+
____
[!IMPORTANT]
====
[IMPORTANT]
While many Kubernetes backup solutions can orchestrate local CSI snapshots of a `volumeMode: Block` PVC, it is important to remember that _snapshots do not equal backup_ - and that having a backup solution that can provide off-cluster data movement for `volumeMode: Block` PVCs is critical.
Support for this capability does *NOT* exist in Velero or OADP today.
____
====

== 4. Creating a Kasten Policy

. In the *_Kasten Dashboard_*, select *_Applications_* to view all discovered namespaces.
+
image::module-03-backup-restore/07.png[]
+
Your `kasten-lab` application should appear as *_Unmanaged_*, indicating it is not being protected by any policy.
+
<!-- > [!TIP]
Your `kasten-lab-{user}` application should appear as *_Unmanaged_*, indicating it is not being protected by any policy.
+
____
====
[TIP]
You will notice that you don't see any namespaces starting with `openshift` in the list of applications.
Namespaces, including the `+openshift-...+` system namespaces can be excluded from the *_Applications_* list (and compliance reporting) by adding a list of `excludedApps` to the K10 Operand `spec`, as shown:
image::module-03-backup-restore/08.png[]
The following command can be used to produce a properly formatted list of namespaces beginning with `openshift` that can be copied and pasted into the K10 Operand YAML tab:
`+bash oc get ns --no-headers=true | awk 'BEGIN { print " excludedApps:" } /^openshift/{print " -",$1}' +` -->
____
`+oc get ns --no-headers=true | awk 'BEGIN { print " excludedApps:" } /^openshift/{print " -",$1}'+`
Your lab environment has been configured already to exclude system namespaces.
====

. Click `kasten-lab` in the *_Applications_* list to view details about the workloads and additional resources discovered within the namespace.
. Click `kasten-lab-{user}` in the *_Applications_* list to view details about the workloads and additional resources discovered within the namespace.
+
image::module-03-backup-restore/09.png[]

. Close the *_Application Details_* window.
. Under `kasten-lab`, select *_...
. Under `kasten-lab-{user}`, select *_...
→ Create a Policy_*.
+
image::module-03-backup-restore/10.png[]

. Leave the default *_Name_* and *_Action_*.
. Use `kasten-lab-backup-{user}` for the *_Name_* and leave the default (`Snapshot`) for *_Action_*.
+
image::module-03-backup-restore/11.png[]
+
____
[!NOTE]
====
[NOTE]
Policy Presets provide the option of allowing administrators to define SLA-focused configurations to simplify self-service data protection for other users.
____
====

. Leave the default *_Hourly Backup Frequency_* and *_Snapshot Retention_* values.
+
image::module-03-backup-restore/12.png[]
+
____
[!NOTE]
====
[NOTE]
Toggling *_Advanced Frequency Options_* allows users to specify what time hourly snapshots occur, how many snapshots to take per hour, and which snapshots should be used for daily, weekly, monthly, and yearly promotion.
Toggling *_Backup Window_* allows users to specify the times during which Kasten is allowed to run the policy.
Enabling *_Use Staggering_* lets Kasten distribute policy start times across the specified window so that the desired frequency is maintained with the fewest policies running simultaneously, reducing peak load on the cluster.
These settings should be left unselected for this lab.
____
====

. Toggle *_Enable Backups via Snapshot Exports_* and select `ceph-rgw-immutable` as the *_Export Location Profile_*.
. Toggle *_Enable Backups via Snapshot Exports_* and select `kastenbackups-{user}` as the *_Export Location Profile_*.
+
image::module-03-backup-restore/13.png[]
+
____
[!NOTE]
====
[NOTE]
By default, Kasten will export all data associated with the snapshot to ensure you have a durable, off-cluster copy.
However, there are circumstances where you may only want to export references to the snapshot, such as migrating a workload in AWS from one availability zone to another.
This ability to only export snapshot metadata can dramatically improve performance in these specific instances.
This can be configured under *_Advanced Export Settings_*.
____
====

. Under *_Select Applications_*, verify the `kasten-lab` namespace has been selected.
. Under *_Select Applications_*, verify the `kasten-lab-{user}` namespace has been selected.
+
image::module-03-backup-restore/14.png[]
+
____
[!NOTE]
====
[NOTE]
Targeting application(s) based on namespace is generally the most straightforward method of defining a backup policy.
However, Kasten also allows you to identify applications based on native Kubernetes labels.
This is especially helpful if you have many VMs in a single namespace and only want to protect current and *_future_* VMs with a specific label on the `VirtualMachine` resource, such as `backup: gold` or `vm: prod`.
Kasten also provides rich filtering capabilities to include or exclude resources based on Kubernetes *_API Group_*, *_API Version_*, *_Resource Type_*, *_Resource Name_*, and *_Labels_*.
For example, you could exclude backup for *_Secrets_* resources where a label includes an indication that the secret is externally managed.
____
====

. Leave the remaining settings as default.
+
____
[!TIP]
====
[TIP]
When performing many tasks within the Kasten UI, you can press the *_</> YAML_* button to expose the native Kubernetes YAML that defines the resource created through the UI.
This can be useful for familiarizing yourself with the Kubernetes-native APIs defined by Kasten and for extracting snippets for use in GitOps or Infrastructure-as-Code tools.
____
====

. Click *_Create Policy_*.
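
The `excludedApps` tip earlier in this section relies on an `awk` filter. The sketch below runs that same filter against a small hard-coded sample instead of a live cluster, so the generated list can be inspected safely. The function names `format_excluded_apps` and `sample_namespaces`, and the sample rows themselves, are illustrative assumptions and not part of the lab.

```shell
#!/bin/sh
# Sketch: wrap the awk filter from the tip in a function so it can be
# exercised without a cluster. The function name is hypothetical.
format_excluded_apps() {
  awk 'BEGIN { print " excludedApps:" } /^openshift/{print " -",$1}'
}

# Made-up sample rows shaped like `oc get ns --no-headers=true` output.
sample_namespaces() {
  printf '%s\n' \
    'default               Active   10d' \
    'kasten-io             Active   2d' \
    'openshift-apiserver   Active   10d' \
    'openshift-etcd        Active   10d'
}

# Only rows whose name starts with "openshift" become list entries.
sample_namespaces | format_excluded_apps
```

Against a real cluster, the same function could be fed with `oc get ns --no-headers=true | format_excluded_apps`.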

@@ -168,31 +172,31 @@ Kasten can freeze the guest filesystem before the snapshot and unfreeze after th

. In the *_Web Terminal_*, enable filesystem freezing for `fedora-k10`:
+
[,bash]
[source,bash,role=execute,subs="attributes"]
----
oc annotate virtualmachine fedora-k10 \
-n kasten-lab \
k10.kasten.io/freezeVM=true
oc annotate virtualmachine fedora-k10 \
-n kasten-lab-{user} \
k10.kasten.io/freezeVM=true
----
+
____
[!NOTE]
====
[NOTE]
The freeze and unfreeze operations will only be attempted if the VirtualMachine is in *_Running_* state.
____
====
+
____
[!WARNING]
====
[WARNING]
Kasten defines a default timeout of 5 minutes for the snapshot operation; if the snapshot has not completed by then, Kasten aborts it and unfreezes the VM.
This can be overridden using the `kubeVirtVMs.snapshot.unfreezeTimeout` Helm/Operand parameter.
____
====
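
In the multi-user version of this lab, the `{user}` attribute in the annotate command is replaced with each participant's ID when the page is rendered. As a minimal sketch, the snippet below prints the command that would result for a hypothetical user `user1` rather than executing it, which makes it easy to confirm the target namespace first. The `LAB_USER` variable and the `freeze_command` function are assumptions introduced for illustration.

```shell
#!/bin/sh
# Hypothetical lab user ID; the real value is whatever the lab platform
# substitutes for the {user} attribute.
LAB_USER="${LAB_USER:-user1}"

# Print the annotate command for a given user instead of executing it,
# so the namespace can be checked before anything touches the cluster.
freeze_command() {
  printf 'oc annotate virtualmachine fedora-k10 -n kasten-lab-%s k10.kasten.io/freezeVM=true\n' "$1"
}

freeze_command "$LAB_USER"
```

After running the real command, the annotation should be visible under `metadata.annotations` in the output of `oc get virtualmachine fedora-k10 -n kasten-lab-{user} -o yaml`.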

== 6. Running the Policy

Rather than wait until the top of the hour for the policy to run, you can manually initiate a policy run programmatically or via the UI.

. In *_Kasten Dashboard → Policies → Policies_*, click *_Run Once_* for the `kasten-lab-backup` Policy.
. In *_Kasten Dashboard → Policies → Policies_*, click *_Run Once_* for the `kasten-lab-backup-{user}` Policy.
+
image::module-03-backup-restore/15.png[]

@@ -201,7 +205,7 @@ image::module-03-backup-restore/15.png[]
image::module-03-backup-restore/16.png[]

. Select *_Dashboard_* from the sidebar.
. Under *_Actions_*, select the `kasten-lab-backup` Policy Run to monitor status.
. Under *_Actions_*, select the `kasten-lab-backup-{user}` Policy Run to monitor status.
+
image::module-03-backup-restore/17.png[]
+
@@ -211,25 +215,25 @@ image::module-03-backup-restore/18.png[]

. Wait for the *_Policy Run_* to complete before proceeding.
+
____
[!WARNING]
====
[WARNING]
If your policy fails, review the provided error message for further details.
_Did you skip link:./backup-restore#_3-enabling-block-mode-exports[annotating the storage class to allow block mode exports] earlier in the lab?_
image::module-03-backup-restore/18b.png[]
____
====
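
This section notes that a policy run can also be initiated programmatically. One way to do that is with a Kasten `RunAction` resource; the sketch below only prints such a manifest and does not contact the cluster. It assumes the policy lives in the `kasten-io` namespace and that the `RunAction` fields match the Kasten API reference for your installed version, so verify both before use. The function name `run_action_manifest` and the user ID `user1` are hypothetical.

```shell
#!/bin/sh
# Emit a RunAction manifest for the given policy name.
# Assumptions: the policy lives in the `kasten-io` namespace, and the
# RunAction schema matches the Kasten API reference for your version.
run_action_manifest() {
  cat <<EOF
apiVersion: actions.kio.kasten.io/v1alpha1
kind: RunAction
metadata:
  generateName: run-backup-
spec:
  subject:
    kind: Policy
    name: $1
    namespace: kasten-io
EOF
}

# Hypothetical user ID `user1`; printed only, nothing is sent to the cluster.
run_action_manifest "kasten-lab-backup-user1"
```

Because the manifest uses `generateName`, it would be submitted with `oc create -f -` rather than `oc apply`.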

== 7. Performing a Local Restore

When performing an in-place restore on the application's original cluster, choosing the local RestorePoint provides the most rapid recovery as it uses the snapshot data already present on primary storage, rather than having to depend on data which must be transferred from the remote repository.

. In the *_Kasten Dashboard_*, select *_Applications_* from the sidebar.
+
You should observe that the `kasten-lab` *_Status_* has changed to *_Compliant_*, indicating that the application is compliant with the backup SLA defined in the policy (i.e.
You should observe that the `kasten-lab-{user}` *_Status_* has changed to *_Compliant_*, indicating that the application is compliant with the backup SLA defined in the policy (i.e.
there is a backup for the application created within the last hour to satisfy the hourly policy frequency).

. Under `kasten-lab`, select *_...
. Under `kasten-lab-{user}`, select *_...
→ Restore_*.
+
image::module-03-backup-restore/19.png[]
@@ -244,12 +248,12 @@ You should observe by default the selected RestorePoint includes all resources c
+
image::module-03-backup-restore/21.png[]
+
____
[!WARNING]
====
[WARNING]
Kasten will terminate the running VM and overwrite the existing resources.
However, any resources in the namespace that do not exist in the RestorePoint will not be altered (protecting against unintentional data loss).
____
====

. Return to the *_Dashboard_* to monitor the status of the *_Restore_* under *_Actions_*.
+
@@ -259,79 +263,79 @@ You should expect this operation to complete rapidly, as the VM volume is being
+
image::module-03-backup-restore/22.png[]
+
____
[!NOTE]
====
[NOTE]
You can also validate the source of the restored volume by running:
[,bash]
[source,bash,role=execute,subs="attributes"]
----
oc describe pvc fedora-k10 -n kasten-lab-{user}
----
You should observe the volume's *_DataSource_* is a `+k10-csi-snap-...+` VolumeSnapshot, confirming the volume was restored from a local snapshot.
____
====

== 8. Performing a Remote Restore

Often, local snapshot data may not be available, requiring that data be restored from the remote Kasten repository.

. In the *_Web Terminal_*, run the following to delete the `kasten-lab` namespace:
. In the *_Web Terminal_*, run the following to delete the `kasten-lab-{user}` namespace:
+
[,bash]
[source,bash,role=execute,subs="attributes"]
----
oc delete virtualmachine fedora-k10 -n kasten-lab
oc delete virtualmachine fedora-k10 -n kasten-lab-{user}
oc delete namespace kasten-lab
oc delete namespace kasten-lab-{user}
----
+
____
[!IMPORTANT]
====
[IMPORTANT]
_"Snapshots are not backup."_ ~ Mark Twain
VolumeSnapshots are namespaced resources.
Removing the `kasten-lab-{user}` namespace will delete the VolumeSnapshots associated with your local RestorePoints.
Additionally, the `ocs-storagecluster-rbdplugin-snapclass` VolumeSnapshotClass sets `deletionPolicy: Delete` by default, meaning that deletion of the VolumeSnapshot resource results in the removal of the snapshot within Ceph.
____
====

. In the *_Kasten Dashboard_*, select *_Applications_* from the sidebar.
+
You should observe that `kasten-lab` no longer appears in the list of applications as the namespace no longer exists on the cluster.
You should observe that `kasten-lab-{user}` no longer appears in the list of applications as the namespace no longer exists on the cluster.

. Click the *_All_* dropdown menu and select *_Removed_* to view the list of non-existent namespaces with available RestorePoints.
+
image::module-03-backup-restore/23.png[]

. Under `kasten-lab`, select *_...
. Under `kasten-lab-{user}`, select *_...
→ Restore_*.
. Select the most recent RestorePoint, and click the *_EXPORTED_* version as shown below.
+
image::module-03-backup-restore/24.png[]

. Under *_Application Name_*, click *_+ Create New Namespace_*.
. Specify `kasten-lab-clone` as the *_New Namespace_* and click *_Create_*.
. Specify `kasten-lab-clone-{user}` as the *_New Namespace_* and click *_Create_*.
+
image::module-03-backup-restore/25.png[]

. Click *_Restore_* and return to the *_Dashboard_* to monitor progress under *_Actions_*.
+
image::module-03-backup-restore/26.png[]

. Return to *_OpenShift Console → Virtualization → VirtualMachines_* and observe the `fedora-k10` VirtualMachine now running in the `kasten-lab-clone` namespace.
. Return to *_OpenShift Console → Virtualization → VirtualMachines_* and observe the `fedora-k10` VirtualMachine now running in the `kasten-lab-clone-{user}` namespace.
+
image::module-03-backup-restore/27.png[]
+
____
====
[NOTE]
Unlike the local restore, the PVC populated by the Kasten datamover will not contain a *_DataSource_* snapshot reference:
[,bash]
[source,bash,role=execute,subs="attributes"]
----
oc describe pvc fedora-k10 -n kasten-lab-clone
oc describe pvc fedora-k10 -n kasten-lab-clone-{user}
----
____
====
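
The local and remote restores can be told apart by the PVC's *_DataSource_*: a `+k10-csi-snap-...+` reference after a local restore, and no snapshot reference after a remote one. As a rough sketch of that check, the helper below classifies a data source name passed in as a string. The function name `classify_restore_source` and its output labels are hypothetical, and an empty value is only consistent with a datamover restore; it does not prove one.

```shell
#!/bin/sh
# Classify how a restored PVC was populated, given the name of its
# DataSource (empty string when the PVC has none).
# The function name and the output labels are hypothetical.
classify_restore_source() {
  case "$1" in
    k10-csi-snap-*) echo "local-snapshot" ;;
    "")             echo "exported-data" ;;
    *)              echo "unknown" ;;
  esac
}

classify_restore_source "k10-csi-snap-abc123"   # made-up snapshot name
classify_restore_source ""                      # PVC without a DataSource
```

On a live cluster, the name could be read with `oc get pvc <pvc-name> -n <namespace> -o jsonpath='{.spec.dataSource.name}'` and passed to the function.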

== 9. Takeaways
