[DM-48286] Automate Chronograf and kapacitor backups #4042

Merged: 6 commits, Jan 6, 2025

@afausti commented Jan 6, 2025

Chronograf and Kapacitor don’t have an application-level backup tool.

An alternative is to back up their PVs, or more specifically the BoltDB file that holds all the application configuration, including dashboards in the case of Chronograf and alert rules in the case of Kapacitor.

One complication is that we cannot mount their PVs in the sasquatch-backup Pod because they are ReadWriteOnce.

We investigated using Kubernetes Volume Snapshots, which could be a good option for backups in all our environments, but they are not currently available at USDF or in our Telescope environments. Velero is perhaps the best option, but adopting it is a larger project.

A more pragmatic solution is to use the kubectl tool to copy the BoltDB file out of the application Pod. This approach works in all environments.
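
As a rough illustration of the copy step, the sketch below only builds a `kubectl cp` command line. It is not the actual backup script (that lives in lsst-sqre/sasquatch). The namespace, the pod names, and the BoltDB paths are assumptions made for the example; the dated destination mirrors the `/backup/<app>-YYYY-MM-DD` pattern in the Job output.

```python
from datetime import date

# Hypothetical pod names and BoltDB paths, for illustration only.
# The real values depend on how Chronograf and Kapacitor are deployed.
BOLTDB_SOURCES = {
    "chronograf": ("sasquatch-chronograf-0", "/var/lib/chronograf/chronograf-v1.db"),
    "kapacitor": ("sasquatch-kapacitor-0", "/var/lib/kapacitor/kapacitor.db"),
}


def build_copy_command(app, namespace="sasquatch", backup_root="/backup", today=None):
    """Return the kubectl cp argument list that copies one BoltDB file."""
    today = today or date.today()
    pod, db_path = BOLTDB_SOURCES[app]
    # Destination directory follows the <app>-YYYY-MM-DD naming seen in the Job output.
    dest_dir = f"{backup_root}/{app}-{today.isoformat()}"
    file_name = db_path.rsplit("/", 1)[-1]
    # kubectl cp accepts the source as <namespace>/<pod>:<path>.
    return ["kubectl", "cp", f"{namespace}/{pod}:{db_path}", f"{dest_dir}/{file_name}"]
```

The returned list could be handed to `subprocess.run(cmd, check=True)`. Note that `kubectl cp` relies on a `tar` binary being present in the source container.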

This PR adds Chronograf and Kapacitor backups to the Sasquatch backup CronJob, using the kubectl tool to copy the relevant files, and implements an optional backup retention parameter.

The output from the backup Job looks like:

Backing up Chronograf...
Backup completed successfully at /backup/chronograf-2025-01-06.
Cleaning up backups older than 3 day(s)...
Backing up Kapacitor...
Backup completed successfully at /backup/kapacitor-2025-01-06.
Cleaning up backups older than 3 day(s)...
Backing up InfluxDB Enterprise (incremental backup)...
Backup completed successfully at /backup/sasquatch-influxdb-enterprise-backup
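
The "Cleaning up backups older than 3 day(s)" step in the output above could be sketched as follows. This is a minimal illustration, not the script's actual logic: it assumes backups are entries named `<prefix>-YYYY-MM-DD` under the backup root and that "older than N days" is decided from the date in the name. `cleanup_old_backups` is a hypothetical helper.

```python
import shutil
from datetime import date, datetime, timedelta
from pathlib import Path


def cleanup_old_backups(backup_root, prefix, retention_days, today=None):
    """Delete <prefix>-YYYY-MM-DD entries older than retention_days.

    Returns the names of the removed entries, in sorted order.
    """
    today = today or date.today()
    cutoff = today - timedelta(days=retention_days)
    removed = []
    for entry in sorted(Path(backup_root).glob(f"{prefix}-*")):
        stamp = entry.name[len(prefix) + 1:]
        try:
            taken = datetime.strptime(stamp, "%Y-%m-%d").date()
        except ValueError:
            continue  # Name does not end in a date, so leave it alone.
        if taken < cutoff:
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry.name)
    return removed
```

Deriving the age from the name rather than from the file modification time is a design choice for this sketch; it keeps the behavior predictable if files are touched after the backup is taken.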

The PR that adds Chronograf and Kapacitor backups to the backup script is lsst-sqre/sasquatch#52

- Add a sasquatch-backup service account and a role with the necessary permissions
- Use the new service account in the sasquatch-backup Pod specification
- Pass the backupItems value to the Pod environment
- The retention_days parameter is optional; it doesn't apply to incremental backups, for example.
- The order of the entries in the backup items list is preserved, so backups that take less time can be listed first and run first.
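
To illustrate the last three points, here is a small sketch of how a backup items list might be read from the Pod environment. The JSON encoding and the `name` key are assumptions made for the example (the PR does not show the actual format), and `plan_backups` is a hypothetical helper; only `retention_days` and the preserved ordering come from the description above.

```python
import json


def plan_backups(backup_items_json):
    """Return (name, retention_days) pairs in the order the items are listed.

    retention_days is None when the item does not set it, for example
    for an incremental backup where retention does not apply.
    """
    plan = []
    for item in json.loads(backup_items_json):
        # A JSON array keeps its order, so listing quick backups first runs them first.
        plan.append((item["name"], item.get("retention_days")))
    return plan
```

A caller could iterate over the result, run each backup in turn, and invoke the cleanup step only when `retention_days` is not `None`.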

@afausti added this pull request to the merge queue Jan 6, 2025
Merged via the queue into main with commit 1665a6e Jan 6, 2025
7 checks passed
@afausti deleted the tickets/DM-48286 branch January 6, 2025 20:13