Use Velero for taking a backup of non MySQL kubeflow PVCs #1203

Open
kimwnasptd opened this issue Feb 6, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@kimwnasptd
Contributor

Context

Our backup/restore story currently has these 2 parts:

  1. Backup/restore of kubeflow namespace state
  2. Use Velero to take a backup of CKF on top of EKS #1197

For the kubeflow namespace, all our current instructions require the admin to manually back up the PVCs we need (MinIO, MLMD).

We should explore using Velero for backing up the MinIO and MLMD PVCs, and not use kubectl+rsync. Note that for the MySQL databases we'll stick with the backup/restore juju actions.

With this we will have proven the E2E backup story, and we will use Velero to back up all state that we can't back up via Juju.

What needs to get done

We'll need to verify the following plan:

  1. Deploy CKF, and run the KFP/MLflow example (wine)
  2. Take a backup, with Velero, of the MinIO and MLMD PVCs
    1. For this PoC we can also try backing up MySQL with Velero, as it will be slightly faster
    2. For the final flow, we will not use Velero for the MySQL databases
  3. Take a backup of user namespaces, as per Use Velero to take a backup of CKF on top of EKS #1197
  4. Delete user namespaces and CKF installation altogether
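
As a starting point for steps 2 and 3, here is a minimal sketch that only builds the `velero backup create` invocations and prints them for review. The backup names, the label selectors used to pick the MinIO and MLMD PVCs, and the user namespace name are all hypothetical placeholders, not values confirmed in this issue. Whether the volume data itself is captured depends on how Velero is configured in the cluster (volume snapshots or file-system backup), which still needs to be verified as part of this task.

```python
from __future__ import annotations

import shlex

KUBEFLOW_NS = "kubeflow"


def pvc_backup_cmd(backup_name: str, selector: str) -> list[str]:
    """Build a Velero command that backs up only PVCs/PVs matching a label selector."""
    return [
        "velero", "backup", "create", backup_name,
        "--include-namespaces", KUBEFLOW_NS,
        "--include-resources", "persistentvolumeclaims,persistentvolumes",
        "--selector", selector,
        "--wait",  # block until the backup finishes
    ]


def namespace_backup_cmd(backup_name: str, namespaces: list[str]) -> list[str]:
    """Build a Velero command that backs up whole namespaces (user namespaces, as in #1197)."""
    return [
        "velero", "backup", "create", backup_name,
        "--include-namespaces", ",".join(namespaces),
        "--wait",
    ]


if __name__ == "__main__":
    # ASSUMPTION: the labels below are placeholders; check the real PVC labels first,
    # e.g. with `kubectl get pvc -n kubeflow --show-labels`.
    commands = [
        pvc_backup_cmd("kf-minio", "app.kubernetes.io/name=minio"),
        pvc_backup_cmd("kf-mlmd", "app.kubernetes.io/name=mlmd"),
        # ASSUMPTION: "user-a" is a hypothetical user namespace.
        namespace_backup_cmd("kf-user-ns", ["user-a"]),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands instead of running them keeps the sketch safe to try outside a cluster; each list can be passed to `subprocess.run(cmd, check=True)` once the names and selectors are confirmed.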

Then we'll test the restore:

  1. Create the kubeflow model
  2. Restore the PVCs in kubeflow via Velero
  3. Re-deploy the charms
    • Here we are assuming that once the charms see the existing PVCs, they will simply reuse them for the Pods
  4. Restore the user namespaces as per Use Velero to take a backup of CKF on top of EKS #1197
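
The restore order above can be sketched as an ordered list of commands. This is only an illustration under assumptions: the backup names are hypothetical, and `juju deploy kubeflow --trust` stands in for whatever deploy command (channel, overlays) the final instructions end up using. The key unverified assumption from the plan, that the charms reuse the restored PVCs, is not something this sketch can check.

```python
from __future__ import annotations

import shlex


def restore_cmd(restore_name: str, backup_name: str) -> list[str]:
    """Build a Velero command that restores everything from one named backup."""
    return [
        "velero", "restore", "create", restore_name,
        "--from-backup", backup_name,
        "--wait",  # block until the restore finishes
    ]


def restore_plan(pvc_backups: list[str], user_ns_backup: str) -> list[list[str]]:
    """Return the restore steps in the order proposed in this issue."""
    steps = [["juju", "add-model", "kubeflow"]]            # 1. create the kubeflow model
    steps += [restore_cmd(f"{name}-restore", name)         # 2. restore the PVCs
              for name in pvc_backups]
    steps.append(["juju", "deploy", "kubeflow", "--trust"])  # 3. re-deploy the charms (assumed command)
    steps.append(restore_cmd(f"{user_ns_backup}-restore",  # 4. restore the user namespaces
                             user_ns_backup))
    return steps


if __name__ == "__main__":
    # ASSUMPTION: these backup names are hypothetical placeholders.
    for step in restore_plan(["kf-minio", "kf-mlmd"], "kf-user-ns"):
        print(shlex.join(step))
```

Keeping the PVC restores before the charm deployment is the point of the ordering: the PVCs have to exist in the model's namespace before the charms create their Pods.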

Definition of Done

  1. We try out the above plan
  2. We write down any issues/blockers
  3. We write the commands needed for the above

kimwnasptd added the enhancement label Feb 6, 2025

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6885.

This message was autogenerated
