Merge branch 'master' of git.w2k.jumptrading.com:minguyen/clickhouse-…
…backup into fix_retry
awiik-jump committed Nov 10, 2023
2 parents 85a9c25 + c2920de commit 97aa700
Showing 7 changed files with 340 additions and 310 deletions.
4 changes: 4 additions & 0 deletions ChangeLog.md
@@ -1,3 +1,7 @@
# v2.4.4
BUG FIXES
- remove aws.LogResponseWithBody for `S3_DEBUG` to avoid too many logs (affected 2.4.0+ versions)

# v2.4.3
IMPROVEMENTS
- add `list` command to API /backup/actions, fix [772](https://github.com/Altinity/clickhouse-backup/issues/772)
106 changes: 56 additions & 50 deletions Examples.md
@@ -1,13 +1,13 @@
# Use cases of clickhouse-backup

## How to convert MergeTree to ReplicatedMergeTree
don't work for tables which created in `MergeTree(date_column, (primary keys columns), 8192)` format
This doesn't work for tables created in `MergeTree(date_column, (primary keys columns), 8192)` format
1. Create backup
```
clickhouse-backup create --table='my_db.my_table' my_backup
```
2. Edit `/var/lib/clickhouse/backup/my_backup/metadata/my_db/my_table.json`, change `query` field,
replace MergeTree() to ReplicatedMergeTree() with parameters according to https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables
replace MergeTree() with ReplicatedMergeTree() with parameters according to https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables
3. Drop table in Clickhouse
```
clickhouse-client -q "DROP TABLE my_db.my_table NO DELAY"
@@ -18,60 +18,57 @@ don't work for tables which created in `MergeTree(date_column, (primary keys col
```

## How to store backups on NFS, backup drive or another server via SFTP
Use 'rsync'
'rsync' supports hard links with means that backup on remote server or mounted fs will be stored as efficiently as in the '/var/lib/clickhouse/backup'.
You can create daily backup by clickhouse-backup and sync backup folder to mounted fs with this command:
`rsync -a -H --delete --progress --numeric-ids --update /var/lib/clickhouse/backup/ /mnt/data/clickhouse-backup/` or similar for sync over ssh. In this case rsync will copy only difference between backups.
Use `rsync`.
`rsync` supports hard links, which means that a backup on a remote server or mounted fs will be stored as efficiently as in `/var/lib/clickhouse/backup`.
You can create a daily backup with clickhouse-backup and sync the backup folder to a mounted fs with this command:
`rsync -a -H --delete --progress --numeric-ids --update /var/lib/clickhouse/backup/ /mnt/data/clickhouse-backup/` or similar for sync over ssh. In this case `rsync` will copy only the difference between backups.
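
For the "sync over ssh" variant, a minimal sketch might look like the following. The destination `backup@backup-host:/mnt/data/clickhouse-backup/`, the function name `sync_backups`, and the `DRY_RUN` switch are assumptions made for illustration; the rsync flags are the ones from the command above, with `--rsh=ssh` selecting the transport.

```shell
#!/bin/sh
# Sketch: sync the local backup folder to a remote host over ssh.
# The destination passed as $1 is hypothetical; adjust user, host and path.
SRC="/var/lib/clickhouse/backup/"

# sync_backups DESTINATION
# With DRY_RUN=1 the rsync command is only printed, not executed.
sync_backups() {
  dest="$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo rsync -a -H --delete --progress --numeric-ids --update --rsh=ssh "$SRC" "$dest"
  else
    rsync -a -H --delete --progress --numeric-ids --update --rsh=ssh "$SRC" "$dest"
  fi
}
```

Calling `DRY_RUN=1; sync_backups backup@backup-host:/mnt/data/clickhouse-backup/` prints the command so it can be reviewed before it runs from cron.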

## How to move data to another clickhouse server
destination server
## How to move data to another ClickHouse server
On the destination server:
```bash
mkdir -p /var/lib/clickhouse/backups/backup_name
```
source server
On the source server:
```bash
clickhouse-backup create backup_name
rsync --rsh=ssh /var/lib/clickhouse/backups/backup_name/ user@dst_server:/var/lib/clickhouse/backups/backup_name
```

destination server
On the destination server:
```bash
clickhouse-backup restore --rm backup_name
```

## How to reduce number of partitions
...

## How to monitor that backups created and uploaded correctly
## How to monitor that backups were created and uploaded correctly
Use services like https://healthchecks.io or https://deadmanssnitch.com.
Or use `clickhouse-backup server` and prometheus endpoint :7171/metrics, look alerts examples on https://github.com/Altinity/clickhouse-operator/blob/master/deploy/prometheus/prometheus-alert-rules-backup.yaml
Or use `clickhouse-backup server` and its Prometheus endpoint `:7171/metrics`. For an example of setting up Prometheus alerts, see https://github.com/Altinity/clickhouse-operator/blob/master/deploy/prometheus/prometheus-alert-rules-backup.yaml.

## How to make backup / restore sharded cluster
## How to back up / restore a sharded cluster
### BACKUP
run only on the first replica for each shard
Run only on the first replica for each shard:
```bash
shard_number=$(clickhouse-client -q "SELECT getMacro('shard')")
clickhouse-backup create_remote shard${shard_number}-backup
clickhouse-backup delete local shard${shard_number}-backup
```

### RESTORE
run on all replicas
Run on all replicas:
```bash
shard_number=$(clickhouse-client -q "SELECT getMacro('shard')")
clickhouse-backup restore_remote --rm --schema shard${shard_number}-backup
clickhouse-backup delete local shard${shard_number}-backup
```
after it, run only on the first replica for each shard
After that, run only on the first replica for each shard:
```bash
shard_number=$(clickhouse-client -q "SELECT getMacro('shard')")
clickhouse-backup restore_remote --rm shard${shard_number}-backup
clickhouse-backup delete local shard${shard_number}-backup
```

## How to make backup sharded cluster with Ansible
On the first day of month full backup will be uploaded and increments on the others days.
`hosts: clickhouse-cluster` shall be only first replica on each shard
## How to back up a sharded cluster with Ansible
On the first day of the month, a full backup is uploaded; on all other days, increments are uploaded.
`hosts: clickhouse-cluster` should contain only the first replica of each shard.

```yaml
- hosts: clickhouse-cluster
@@ -110,12 +107,12 @@ On the first day of month full backup will be uploaded and increments on the oth
- uri: url="https://hc-ping.com/{{ healthchecksio_clickhouse_upload_id }}/fail"
```
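
The full-versus-increment rule stated at the start of this section can be sketched in shell as follows. This only illustrates the date check and is not part of the playbook; the function name `backup_kind` is hypothetical.

```shell
#!/bin/sh
# Sketch: choose the backup kind from the day of the month.
# backup_kind DAY  ->  prints "full" for day 1 (or 01), "increment" otherwise.
backup_kind() {
  day="${1#0}"            # strip one leading zero, so "01" becomes "1"
  if [ "$day" = "1" ]; then
    echo full
  else
    echo increment
  fi
}

# Typical use with today's date:
backup_kind "$(date +%d)"
```

The comparison is done on strings rather than numbers, so values such as `08` and `09` from `date +%d` are not misread as invalid octal literals.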
## How to make backup database with several terabytes of data
## How to back up a database with several terabytes of data
You can use clickhouse-backup to create periodic backups and keep them local. This protects you from destructive operations.
In addition, you may create instance of ClickHouse on another DC and have it fresh by clickhouse-copier it protects you from hardware or DC failures.
In addition, you may create an instance of ClickHouse in another DC and keep it fresh with clickhouse-copier to protect against hardware or DC failures.
## How to use clickhouse-backup in Kubernetes
Install [clickhouse kubernetes operator](https://github.com/Altinity/clickhouse-operator/) and use following manifest
Install the [clickhouse kubernetes operator](https://github.com/Altinity/clickhouse-operator/) and use the following manifest:
```yaml
apiVersion: "clickhouse.altinity.com/v1"
@@ -232,7 +229,8 @@ spec:
containerPort: 7171
```
You need to prepare remote storage, for test only
You need to prepare remote storage; the following manifest is for testing only:
```yaml
---
apiVersion: "apps/v1"
@@ -280,7 +278,8 @@ spec:
targetPort: minio
```
Also, you can apply CronJob to run `clickhouse-backup` actions by schedule
You can also use CronJob to run `clickhouse-backup` actions on a schedule:

```yaml
apiVersion: batch/v1
kind: CronJob
@@ -315,7 +314,7 @@ spec:
value: backup
- name: BACKUP_PASSWORD
value: "backup_password"
# change to 1, if you want make full backup only in $FULL_BACKUP_WEEKDAY (1 - Mon, 7 - Sun)
# change to 1 if you want to make a full backup only on $FULL_BACKUP_WEEKDAY (1 - Mon, 7 - Sun)
- name: MAKE_INCREMENT_BACKUP
value: "1"
- name: FULL_BACKUP_WEEKDAY
@@ -381,7 +380,8 @@ spec:
echo "BACKUP CREATED"
```
For one time restore data you could use `Job`
For a one-time data restore, you can use a `Job`:

```yaml
# example to restore latest backup
apiVersion: batch/v1
@@ -472,20 +472,25 @@ spec:

## How to use AWS IRSA and IAM to allow S3 backup without Explicit credentials

Create Role <ROLE NAME> and IAM Policy, look details in https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-enable-IAM.html
Create Role <ROLE NAME> and IAM Policy. The role ARN typically looks like this:
`arn:aws:iam::1393332413596:role/rolename-clickhouse-backup`,
where `1393332413596` is the AWS account ID and
`rolename-clickhouse-backup` is the name of the role.
See [the AWS documentation](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-enable-IAM.html)
for all the details.

Create service account with annotations
Create a service account with annotations:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: <SERVICE ACOUNT NAME>
name: <SERVICE ACCOUNT NAME>
namespace: <NAMESPACE>
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_NAME>:role/<ROLE_NAME>
```

Link service account to podTemplate it will create `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables.
Link the service account to a podTemplate to create `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables:
```yaml
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
@@ -507,9 +512,10 @@ spec:

### How to use clickhouse-backup + clickhouse-operator in FIPS compatible mode in Kubernetes for S3

use `altinity/clickhouse-backup:X.X.X-fips` as image (where X.X.X version number)
run following commands to generate self-signed TLS keys for secure clickhouse-backup API endpoint,
you need periodically renew this certs, use https://github.com/cert-manager/cert-manager for it in kubernetes
Use the image `altinity/clickhouse-backup:X.X.X-fips` (where X.X.X is the version number).
Run the following commands to generate self-signed TLS keys for a secure clickhouse-backup API endpoint.
You need to renew these certs periodically; in Kubernetes, use https://github.com/cert-manager/cert-manager for that.

```bash
openssl genrsa -out ca-key.pem 4096
openssl req -subj "/O=altinity" -x509 -new -nodes -key ca-key.pem -sha256 -days 365000 -out ca-cert.pem
@@ -518,7 +524,8 @@ you need periodically renew this certs, use https://github.com/cert-manager/cert
openssl x509 -req -days 365 -extensions SAN -extfile <(printf "\n[SAN]\nsubjectAltName=DNS:localhost,DNS:*.cluster.local") -in server-req.csr -out server-cert.pem -CA ca-cert.pem -CAkey ca-key.pem -CAcreateserial
```

create following `ConfigMap` + `ClickHouseInstallation` kubernetes manifest
Create the following `ConfigMap` + `ClickHouseInstallation` kubernetes manifest:

```yaml
---
apiVersion: v1
@@ -624,17 +631,16 @@ spec:
containerPort: 7171
```

## How do incremental backups work to remote storage
- Incremental backup calculate increment only during execute `upload` or `create_remote` command or similar REST API request.
- When `use_embedded_backup_restore: false`, then incremental backup calculate increment only on table parts level, else increment backups also calculates based on `checksums.txt` for 23.3+ clickhouse version, look to ClickHouse documentation to fill the difference between [data parts](https://clickhouse.tech/docs/en/operations/system-tables/parts/) and [table partitions](https://clickhouse.tech/docs/en/operations/system-tables/partitions/).
- To calculate increment, backup which listed on `--diff-from` parameter is required to be present as local backup, look to `clickhouse-backup list` command results for ensure.
- Currently, during execute `clickhouse-backup upload --diff-from=base_backup` don't check `base_backup` exits on remote storage, be careful.
- During upload operation `base_backup` added to current backup metadata as required. All data parts which exists in `base_backup` also mark in backup metadata table level with `required` flag and skip data uploading.
- During download, if backup contains link to `required` backup it will try to fully download first. This action apply recursively. If you have a chain of incremental backups, all incremental backups in the chain and first "full" will download to local storage.
- Size of increment depends not only on the intensity your data ingestion and also depends on the intensity background merges for data parts in your tables. Please increase how much rows you will ingest during one INSERT query and don't apply often [table data mutations](https://clickhouse.tech/docs/en/operations/system-tables/mutations/).
- Look to [ClickHouse documentation](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/mergetree/) and try to understand how exactly `*MergeTree` table engine works.
## How incremental backups work with remote storage
- Incremental backup calculates the increment only while executing `upload` or `create_remote` commands or similar REST API requests.
- When `use_embedded_backup_restore: false`, incremental backup calculates the increment only on the table parts level; otherwise the increment is also calculated based on `checksums.txt` for ClickHouse version 23.3+. See the ClickHouse documentation for the difference between [data parts](https://clickhouse.tech/docs/en/operations/system-tables/parts/) and [table partitions](https://clickhouse.tech/docs/en/operations/system-tables/partitions/). Currently `clickhouse-backup` does not support incremental backups when `use_embedded_backup_restore: true`.
- To calculate the increment, the backup named in the `--diff-from` parameter must be present as a local backup. Check the `clickhouse-backup list` command output to make sure it is there.
- During upload, `base_backup` is added to the current backup metadata as required. All data parts that exist in `base_backup` are also marked with the `required` flag at the table level of the backup metadata, and their data is not uploaded.
- During download, if a backup contains a link to a `required` backup, each table that contains parts marked as `required` downloads those parts to local storage after the non-`required` parts have finished downloading. If you have a chain of incremental backups and required parts exist in this chain, then this action applies recursively.
- The size of the increment depends not only on the intensity of your data ingestion but also on the intensity of background merges for data parts in your tables. Insert more rows per INSERT query and avoid frequent [table data mutations](https://clickhouse.tech/docs/en/operations/system-tables/mutations/).
- See the [ClickHouse documentation](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/mergetree/) for information on how the `*MergeTree` table engine works.
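
The requirement that the `--diff-from` backup be present locally can be checked before uploading. The following is a minimal sketch, not the tool's own behaviour: the helper names `backup_listed` and `upload_increment` are hypothetical, and it assumes that the first whitespace-separated column of the `clickhouse-backup list local` output is the backup name.

```shell
#!/bin/sh
# Sketch: refuse to upload an increment unless the base backup exists locally.

# backup_listed NAME
# Reads a backup listing on stdin; succeeds if NAME appears as the first column.
# Assumption: the first column of each listing line is the backup name.
backup_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# upload_increment BASE_BACKUP NEW_BACKUP
upload_increment() {
  base="$1"
  new="$2"
  if clickhouse-backup list local | backup_listed "$base"; then
    clickhouse-backup upload --diff-from="$base" "$new"
  else
    echo "base backup '$base' is not present locally" >&2
    return 1
  fi
}
```

Keeping the listing check in its own function, reading from stdin, makes it possible to try it against saved output without touching a real ClickHouse host.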

## How to work `watch` command
Current implementation simple and will improve in next releases
- When `watch` command start, it call create_remote+delete command sequence to make `full` backup
- Then it wait `watch-interval` time period and call create_remote+delete command sequence again, type of backup will `full` if `full-interval` expired after last full backup created and `incremental`, if not.
## How the `watch` command works
The current implementation is simple and will be improved in future releases.
- When the `watch` command starts, it calls the `create_remote+delete` command sequence to make a `full` backup.
- Then it waits for the `watch-interval` period and calls the `create_remote+delete` command sequence again. The backup type is `full` if `full-interval` has elapsed since the last full backup was created, and `incremental` otherwise.
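
The choice between `full` and `incremental` described in the list above can be sketched as a small shell function. This illustrates the rule only and is not clickhouse-backup's internal code: the function name `backup_type` is hypothetical, times are plain seconds, and treating "exactly `full-interval` elapsed" as expired is an assumption.

```shell
#!/bin/sh
# Sketch of the full-vs-incremental decision made on each watch iteration.

# backup_type NOW LAST_FULL FULL_INTERVAL   (all in seconds)
# Prints "full" when FULL_INTERVAL has elapsed since LAST_FULL, else "incremental".
backup_type() {
  now="$1"
  last_full="$2"
  full_interval="$3"
  if [ $((now - last_full)) -ge "$full_interval" ]; then
    echo full
  else
    echo incremental
  fi
}

# Simulate a 1h watch interval with a 24h full interval, last full backup at t=0.
backup_type 3600 0 86400     # prints "incremental"
backup_type 86400 0 86400    # prints "full"
```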