This repository has been archived by the owner on Nov 10, 2023. It is now read-only.

Commit

Apply suggestions from code review
Co-authored-by: M-Kusumgar
richfitz and M-Kusumgar authored Oct 19, 2023
1 parent 43e3648 commit 4e14809
Showing 2 changed files with 16 additions and 9 deletions.
21 changes: 14 additions & 7 deletions README.md
@@ -7,15 +7,15 @@

## The idea

We need a way of syncronising some docker volumes from a machine to some backup server, incrementally, using `rsync`. We previously used [`offen/docker-volume-backup`](https://github.com/offen/docker-volume-backup) to backup volumes in their entirity to another machine as a tar file but the space and time requirements made this hard to use in practice.
We need a way of synchronising some docker volumes from a machine to some backup server, incrementally, using `rsync`. We previously used [`offen/docker-volume-backup`](https://github.com/offen/docker-volume-backup) to back up volumes in their entirety to another machine as a tar file, but the space and time requirements made this hard to use in practice.

### The setup

We assume some number of **server** machines -- these will recieve data, and some number of **client** machines -- these will send data to the server(s). A client can back any number of volumes to any number of servers, and a server can recieve and serve any unmber of volumes to any number of clients.
We assume some number of **server** machines -- these will receive data, and some number of **client** machines -- these will send data to the server(s). A client can back up any number of volumes to any number of servers, and a server can receive and serve any number of volumes to any number of clients.

A typical topolgy for us would be that we would have a "production" machine which is backing up to one or more servers, and then some additional set of "staging" machines that recieve data from the servers, but which in practice never send any data.
A typical framework for us would be that we would have a "production" machine which is backing up to one or more servers, and then some additional set of "staging" machines that receive data from the servers, which in practice never send any data.

Because we are going to use ssh for transport, we assume existance of [HashiCorp Vault](https://www.vaultproject.io/) to store secrets.
Because we are going to use ssh for transport, we assume existence of [HashiCorp Vault](https://www.vaultproject.io/) to store secrets.

### Configuration

@@ -60,13 +60,20 @@ Restoration is always manual
privateer2 restore <volume> [--server=NAME] [--source=NAME]
```

where `--server` controls the server you are pulling from (if you have more than one configured) and `--source` controls the original machine that backed the data up (if more than one machine is pushing backups).
where `--server` controls the server you are pulling from (useful if you have more than one configured) and `--source` controls the original machine that backed the data up (if more than one machine is pushing backups).

For example, if you are on a "staging" machine, connecting to the "backup" server, and want to pull the "user_data" volume that was backed up from the machine called "production", you would type:

```
privateer2 restore user_data --server=backup --source=production
```
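
When restores are scripted, the same command line can be assembled programmatically. The following is a minimal sketch only: `restore_command` is a hypothetical helper, not part of privateer2, and it assumes nothing beyond the `restore <volume> [--server=NAME] [--source=NAME]` usage shown above.

```python
import shlex


def restore_command(volume, server=None, source=None):
    """Build the argument list for `privateer2 restore` (hypothetical helper)."""
    if not volume:
        raise ValueError("a volume name is required")
    args = ["privateer2", "restore", volume]
    # Both options are optional; only pass them when a name was given.
    if server is not None:
        args.append(f"--server={server}")
    if source is not None:
        args.append(f"--source={source}")
    return args


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(shlex.join(restore_command("user_data", server="backup", source="production")))
```

Returning a list rather than a string means the result can be passed to `subprocess.run` without going through a shell.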


## What's the problem anyway?

[Docker volumes](https://docs.docker.com/storage/volumes/) are useful for abstracting away some persistant storage for an application. They're much nicer to use than bind mounts because they don't pollute the host sytem with immovable files (docker containers often running as root or with a uid different to the user running docker). The docker [docs describe some approaches to backup and restore](https://docs.docker.com/storage/volumes/#back-up-restore-or-migrate-data-volumes) but in practice this ignores many practical issues, especially when the volumes are large or off-site backup is important.
[Docker volumes](https://docs.docker.com/storage/volumes/) are useful for abstracting away some persistent storage for an application. They're much nicer to use than bind mounts because they don't pollute the host system with immovable files (docker containers often running as root or with a uid different to the user running docker). The docker [docs](https://docs.docker.com/storage/volumes/#back-up-restore-or-migrate-data-volumes) describe some approaches to backup and restore but in practice this ignores many practical issues, especially when the volumes are large or off-site backup is important.

We want to be able to syncronise a volume to another volume on a different machine; our setup looks like this:
We want to be able to synchronise a volume to another volume on a different machine; our setup looks like this:

```
bob alice
4 changes: 2 additions & 2 deletions development.md
@@ -98,7 +98,7 @@ We can now restore it:
privateer2 --path tmp --as=bob restore data
```

or see the commands to do this outselves:
or see the commands to do this ourselves:

```
privateer2 --path tmp --as=bob restore data --dry-run
@@ -112,4 +112,4 @@ privateer2 --path tmp --as=alice server stop

## Writing tests

We use a lot of global resources, so it's easy to leave behind volumes and containers (often exited) after running tests. At best this is lazy and messy, but at worst it creates hard-to-diagnose dependencies between tests. Try and create names for auto-cleaned volumes and containers using the `managed_docker` fixture (see [`tests/conftest.py`](tests/conftest.py) for details).
We use a lot of global resources, so it's easy to leave behind volumes and containers (often exited) after running tests. At best this is lazy and messy, but at worst it creates hard-to-diagnose dependencies between tests. Try and create names for auto-cleaned volumes and containers using the `managed_docker` fixture (see [`tests/conftest.py`](tests/conftest.py) for details).
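
The general pattern behind an auto-cleaning fixture can be sketched as below. This is not the real `managed_docker` fixture (see `tests/conftest.py` for that); `managed_names`, its `remove` callback and the name prefix are all hypothetical, chosen so the sketch runs without Docker.

```python
import contextlib
import secrets


@contextlib.contextmanager
def managed_names(remove):
    """Yield a name factory; call remove(kind, name) for every name on exit.

    Hypothetical illustration of the cleanup pattern, not the project's fixture.
    """
    created = []

    def make(kind):
        # A random suffix keeps names from colliding between tests.
        name = f"privateer_test_{kind}_{secrets.token_hex(4)}"
        created.append((kind, name))
        return name

    try:
        yield make
    finally:
        # Clean up in reverse creation order, even if the test body raised.
        for kind, name in reversed(created):
            remove(kind, name)


if __name__ == "__main__":
    removed = []
    with managed_names(lambda kind, name: removed.append((kind, name))) as make:
        volume = make("volume")
        container = make("container")
    print(removed)
```

In a real test suite the `remove` callback would delete the Docker volume or container; here it only records the call.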
