Restore backup folder #2100

Closed
jemrobinson opened this issue Aug 6, 2024 · 11 comments
Labels
enhancement New functionality that should be added to the Safe Haven

Comments

@jemrobinson
Member

✅ Checklist

  • I have searched open and closed issues for duplicates.
  • This is a request for a new feature in the Data Safe Haven or an upgrade to an existing feature.
  • The feature is still missing in the latest version.
  • I have read through the documentation.
  • This isn't an open-ended question (open a discussion if it is).

🍓 Suggested change

The v4 release series had a backup folder - we should do the same.

🚂 How could this be done?

@jemrobinson jemrobinson added the enhancement New functionality that should be added to the Safe Haven label Aug 6, 2024
@jemrobinson jemrobinson added this to the Release 5.0.0-rc3 milestone Aug 6, 2024
@JimMadge
Member

JSON description of a non-working backup instance.

This suggests the role assignment may be missing the necessary permissions.
(I recall that there were permissions specifically associated with Azure Backup).

{
    "properties": {
        "friendlyName": "BlobBackupSensitiveData",
        "dataSourceInfo": {
            "resourceID": "/subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/resourceGroups/shm-daimyo-sre-hojo-rg/providers/Microsoft.Storage/storageAccounts/shdaisrehojsensitivedata",
            "resourceUri": "/subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/resourceGroups/shm-daimyo-sre-hojo-rg/providers/Microsoft.Storage/storageAccounts/shdaisrehojsensitivedata",
            "datasourceType": "Microsoft.Storage/storageAccounts/blobServices",
            "resourceName": "shdaisrehojsensitivedata",
            "resourceType": "Microsoft.Storage/storageAccounts",
            "resourceLocation": "uksouth",
            "objectType": "Datasource"
        },
        "policyInfo": {
            "policyId": "/subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/resourceGroups/shm-daimyo-sre-hojo-rg/providers/Microsoft.DataProtection/backupVaults/shm-daimyo-sre-hojo-bv-backup/backupPolicies/backup-policy-blobs"
        },
        "protectionStatus": {
            "status": "ProtectionError",
            "errorDetails": {
                "message": "Appropriate permissions to perform the operation is missing.",
                "recommendedAction": [
                    "Grant appropriate permissions to perform this operation as mentioned at https://aka.ms/UserErrorMissingRequiredPermissions and retry the operation."
                ],
                "code": "UserErrorMissingRequiredPermissions",
                "target": "",
                "isRetryable": false,
                "isUserError": false,
                "properties": {
                    "ActivityId": "dac6e9f0-196b-4a88-934b-7452a078d301"
                }
            }
        },
        "currentProtectionState": "ProtectionError",
        "protectionErrorDetails": {
            "message": "Appropriate permissions to perform the operation is missing.",
            "recommendedAction": [
                "Grant appropriate permissions to perform this operation as mentioned at https://aka.ms/UserErrorMissingRequiredPermissions and retry the operation."
            ],
            "code": "UserErrorMissingRequiredPermissions",
            "target": "",
            "isRetryable": false,
            "isUserError": false,
            "properties": {
                "ActivityId": "dac6e9f0-196b-4a88-934b-7452a078d301"
            }
        },
        "provisioningState": "Succeeded",
        "objectType": "BackupInstance"
    },
    "id": "/subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/resourceGroups/shm-daimyo-sre-hojo-rg/providers/Microsoft.DataProtection/backupVaults/shm-daimyo-sre-hojo-bv-backup/backupInstances/backup-instance-blobs",
    "name": "backup-instance-blobs",
    "type": "Microsoft.DataProtection/backupVaults/backupInstances"
}
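
If the failure is the role assignment, one candidate fix (a sketch only; $VAULT_PRINCIPAL_ID stands in for the backup vault's managed identity principal ID) would be to grant the vault the backup role on the storage account:

# hypothetical: grant the vault's managed identity the backup role on the storage account
az role assignment create \
    --assignee "$VAULT_PRINCIPAL_ID" \
    --role "Storage Account Backup Contributor" \
    --scope "/subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/resourceGroups/shm-daimyo-sre-hojo-rg/providers/Microsoft.Storage/storageAccounts/shdaisrehojsensitivedata"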

@JimMadge JimMadge modified the milestones: Release 5.0.0, Release 5.1.0 Aug 13, 2024
@jemrobinson
Member Author

jemrobinson commented Aug 15, 2024

OK, the following things are needed for backup to work (see here):

  • the backup vault needs Storage Account Backup Contributor permissions on the storage account
  • the storage account needs to be STORAGE_V2 (not BLOCK_BLOB_STORAGE)
  • we need to disable HNS and the NFSv3 flag (not sure whether this disables NFS or not)
  • we can't use PREMIUM_ZRS (but STANDARD_GRS seems to work).
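
As an illustration of those constraints taken together (account and resource group names are hypothetical), the backup target could look something like:

# hypothetical backup target meeting the constraints above
az storage account create \
    --name shexamplebackup \
    --resource-group shm-example-backup-rg \
    --location uksouth \
    --kind StorageV2 \
    --sku Standard_GRS \
    --enable-hierarchical-namespace false \
    --enable-nfs-v3 false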

Some questions @JimMadge:

  1. Are we happy to make these changes to the storage account that has /ingress and /egress in it, or would we rather do this somewhere else?
  2. What do we actually want to back up? Which of /home, /ingress, /egress, /shared should we be backing up?
  3. Are we happy with running e.g. rsync daily to copy whichever subset of the above directories we want to back up (see the sketch at the end of this comment)? Would losing file permissions/ownership be a problem?

Depending on what we think, I'll either write something minimal that could target v5.0.0 or make a more major change that targets v5.1.0.
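
As a sketch of the rsync approach in question 3 (mount points hypothetical), a daily job on the mounting VM might run:

# -a preserves permissions/ownership where the target filesystem supports them;
# on a blob-backed mount they would be lost, which is the concern above
rsync -a --delete /shared/ /backup/shared/
rsync -a --delete /egress/ /backup/egress/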

@JimMadge
Member

  • the storage account needs to be STORAGE_V2 (not BLOCK_BLOB_STORAGE)
  • we need to disable HNS and the NFSv3 flag (not sure whether this disables NFS or not)

I think this means we cannot back up those: HNS is required for NFSv3, and I think Storage V2 doesn't support NFSv3.

I think we shouldn't back up /ingress. It is read-only inside SREs, and it would be better to delete all copies than to forget to delete a copy and risk it leaking.

My guess would be we want to back up:

  • /shared
  • /egress
  • probably/possibly /home

If we are going to use a command line tool instead of Azure resources, I think we should go with something like borg, which handles encryption, de-duplication and compression.
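
A minimal sketch of the borg workflow (repository path and source directories hypothetical):

borg init --encryption=repokey /backup/borg-repo    # one-off: create an encrypted repository
borg create --compression zstd \
    /backup/borg-repo::shared-{now} /shared /egress # daily: de-duplicated, compressed archive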

@jemrobinson
Member Author

I'm suggesting using a command line tool to copy the files from a storage account that we can't back up (e.g. things we're mounting over NFS) into a storage account that we can back up.

I think we probably want the backup account to maintain the file structure of the things we're backing up, so we can easily restore single files or folders from backup. I could be convinced that it's better to store binary dumps from an archiving tool if there's a sensible restore-from-backup workflow that doesn't involve admins trying to run commands through the serial console!

@JimMadge
Member

Oh I see.

I think that would still require some manual intervention, though. If we had a /backup directory managed by an Azure Backup vault, we could restore that directory, but we would still need to propagate any rollback to /output, /shared, etc.

It feels more robust to have a one-step process like borgmatic restore than to click some things in the portal and then run a script.

I'm sure we could have a CLI entrypoint which runs the restore commands.
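
For files (as opposed to databases, which is what borgmatic's restore action targets), the one-step command would presumably be borgmatic extract; a hypothetical entrypoint might wrap something like:

# hypothetical: restore /shared from the most recent archive
borgmatic extract --archive latest --path shared --destination /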

@jemrobinson
Member Author

Here are some relevant DSPT requirements:

How does your organisation make sure that there are working backups of all important data and information?

Are backups routinely tested to make sure that data and information can be restored?

Are your backups kept separate from your network ('offline'), or in a cloud service designed for this purpose?

I think Azure Backup meets the last one, but if we use borg we would need to work out how to store these "separate from our network".

@jemrobinson
Member Author

duplicity might be an option. Here's a guide to backing up to Azure storage.
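
A hedged sketch of that workflow (container name hypothetical; duplicity's Azure backend reads credentials such as AZURE_CONNECTION_STRING from the environment):

export AZURE_CONNECTION_STRING="<storage-account-connection-string>"
duplicity /shared azure://backup-container          # incremental, encrypted backup
duplicity restore azure://backup-container /shared  # one-step restore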

@JimMadge
Member

Are your backups kept separate from your network ('offline'), or in a cloud service designed for this purpose?

We should be careful with that: I think there would often be a legal obligation not to transfer the data outside of our network.

This is one of the places where I feel that DSPT wasn't designed for TREs. I think it is talking about off-site backup as in "If your building burned down, how would you make sure you don't lose everyone's medical records". However, we don't expect to archive or curate data; we expect to permanently delete everything soon.

In our case, I think the equivalent of off-site is "If you tear down the workspaces and storage accounts, will you also lose the backups" and "If the datacentre burns down, would you lose the backups". We could achieve that by using different resources and redundant storage.

@jemrobinson
Member Author

I was assuming this means that we'd need to either explicitly store backups at another datacentre location or use a very high-redundancy storage account SKU.
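
For the SKU option (names hypothetical), a geo-zone-redundant tier such as Standard_GZRS (or Standard_RAGZRS, which adds read access to the secondary region) replicates to a paired datacentre region:

# hypothetical: move the backup account to a geo-zone-redundant SKU
az storage account update \
    --name shexamplebackup \
    --resource-group shm-example-backup-rg \
    --sku Standard_GZRS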

@JimMadge
Member

Yes, I think that is sensible and best practice.

@JimMadge
Member

Closing as superseded by #2270
