Merge pull request #944 from nithin-vunet/add_restore_table_mapping
Implement `--restore-table-mapping` for `restore` and `restore_remote`
Slach authored Jul 4, 2024
2 parents d53e960 + 053b126 commit 23f8cab
Showing 12 changed files with 268 additions and 103 deletions.
6 changes: 4 additions & 2 deletions Manual.md
@@ -147,13 +147,14 @@ NAME:
clickhouse-backup restore - Create schema and restore data from backup
USAGE:
- clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>
+ clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>
OPTIONS:
--config value, -c value Config 'FILE' name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--environment-override value, --env value override any environment variable via CLI parameter
--table value, --tables value, -t value Restore only database and objects which matched with table name patterns, separated by comma, allow ? and * as wildcard
--restore-database-mapping value, -m value Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.
+ --restore-table-mapping value, --tm value Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.
--partitions partition_id Restore backup only for selected partition names, separated by comma
If PARTITION BY clause returns numeric not hashed values for partition_id field in system.parts table, then use --partitions=partition_id1,partition_id2 format
If PARTITION BY clause returns hashed string values, then use --partitions=('non_numeric_field_value_for_part1'),('non_numeric_field_value_for_part2') format
@@ -177,13 +178,14 @@ NAME:
clickhouse-backup restore_remote - Download and restore
USAGE:
- clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>
+ clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>
OPTIONS:
--config value, -c value Config 'FILE' name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--environment-override value, --env value override any environment variable via CLI parameter
--table value, --tables value, -t value Download and restore objects which matched with table name patterns, separated by comma, allow ? and * as wildcard
--restore-database-mapping value, -m value Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.
+ --restore-table-mapping value, --tm value Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.
--partitions partition_id Download and restore backup only for selected partition names, separated by comma
If PARTITION BY clause returns numeric not hashed values for partition_id field in system.parts table, then use --partitions=partition_id1,partition_id2 format
If PARTITION BY clause returns hashed string values, then use --partitions=('non_numeric_field_value_for_part1'),('non_numeric_field_value_for_part2') format
11 changes: 9 additions & 2 deletions ReadMe.md
@@ -119,6 +119,10 @@ general:
# RESTORE_DATABASE_MAPPING, restore rules from backup databases to target databases, which is useful when changing destination database, all atomic tables will be created with new UUIDs.
# The format for this env variable is "src_db1:target_db1,src_db2:target_db2". For YAML please continue using map syntax
restore_database_mapping: {}
+
+ # RESTORE_TABLE_MAPPING, restore rules from backup tables to target tables, which is useful when changing destination tables.
+ # The format for this env variable is "src_table1:target_table1,src_table2:target_table2". For YAML please continue using map syntax
+ restore_table_mapping: {}
retries_on_failure: 3 # RETRIES_ON_FAILURE, how many times to retry after a failure during upload or download
retries_pause: 30s # RETRIES_PAUSE, duration time to pause after each download or upload failure
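
Reviewer note: the `RESTORE_TABLE_MAPPING` format described in this hunk (`src_table1:target_table1,src_table2:target_table2`) can be illustrated with a small standalone Go sketch. `parseTableMapping` is a hypothetical helper written for this note; it is not the function clickhouse-backup uses, and the validation rules (exactly one `:` per pair, no empty names) are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTableMapping is a hypothetical helper, not part of clickhouse-backup.
// It converts "src1:dst1,src2:dst2" into a map of source table to target table.
func parseTableMapping(raw string) (map[string]string, error) {
	mapping := make(map[string]string)
	if strings.TrimSpace(raw) == "" {
		// An empty value means "no mapping", mirroring the `{}` default in YAML.
		return mapping, nil
	}
	for _, pair := range strings.Split(raw, ",") {
		parts := strings.Split(strings.TrimSpace(pair), ":")
		// Assumed rule: each pair must have exactly one source and one target.
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			return nil, fmt.Errorf("invalid mapping %q, expected <source>:<target>", pair)
		}
		mapping[parts[0]] = parts[1]
	}
	return mapping, nil
}

func main() {
	mapping, err := parseTableMapping("src_table1:target_table1,src_table2:target_table2")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(mapping["src_table1"], mapping["src_table2"])
}
```

In YAML the same rule would be written with map syntax under `restore_table_mapping`, as the comment in the hunk says.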

@@ -476,6 +480,7 @@ Create schema and restore data from backup: `curl -s localhost:7171/backup/resto
  - Optional query argument `rbac` works the same as the `--rbac` CLI argument (restore RBAC).
  - Optional query argument `configs` works the same as the `--configs` CLI argument (restore configs).
  - Optional query argument `restore_database_mapping` works the same as the `--restore-database-mapping` CLI argument.
+ - Optional query argument `restore_table_mapping` works the same as the `--restore-table-mapping` CLI argument.
  - Optional query argument `callback` allow pass callback URL which will call with POST with `application/json` with payload `{"status":"error|success","error":"not empty when error happens"}`.
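
Reviewer note: a minimal Go sketch of how a client might attach the new `restore_table_mapping` query argument to a restore request URL. `withTableMapping` is a hypothetical helper, the endpoint URL in `main` is only an example value supplied by the caller, and joining pairs with commas is an assumption based on the CLI format `<originTable>:<targetTable>[,<...>]`.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// withTableMapping is a hypothetical helper, not part of clickhouse-backup.
// It adds the restore_table_mapping query argument to an endpoint URL.
func withTableMapping(endpoint string, mapping map[string]string) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	// Sort source table names so the generated URL is deterministic.
	sources := make([]string, 0, len(mapping))
	for src := range mapping {
		sources = append(sources, src)
	}
	sort.Strings(sources)
	pairs := make([]string, 0, len(sources))
	for _, src := range sources {
		pairs = append(pairs, src+":"+mapping[src])
	}
	q := u.Query()
	// Assumption: the API accepts the same comma-separated form as the CLI flag.
	q.Set("restore_table_mapping", strings.Join(pairs, ","))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// The endpoint below is an example value, not taken from the diff.
	target, err := withTableMapping(
		"http://localhost:7171/backup/restore/my_backup",
		map[string]string{"t1": "t1_copy"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(target)
}
```

`url.Values.Encode` percent-encodes `:` and `,`, so the server receives the original string after normal query decoding.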

### POST /backup/delete
@@ -705,13 +710,14 @@ NAME:
clickhouse-backup restore - Create schema and restore data from backup
USAGE:
- clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>
+ clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>
OPTIONS:
--config value, -c value Config 'FILE' name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--environment-override value, --env value override any environment variable via CLI parameter
--table value, --tables value, -t value Restore only database and objects which matched with table name patterns, separated by comma, allow ? and * as wildcard
--restore-database-mapping value, -m value Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.
+ --restore-table-mapping value, --tm value Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.
--partitions partition_id Restore backup only for selected partition names, separated by comma
If PARTITION BY clause returns numeric not hashed values for partition_id field in system.parts table, then use --partitions=partition_id1,partition_id2 format
If PARTITION BY clause returns hashed string values, then use --partitions=('non_numeric_field_value_for_part1'),('non_numeric_field_value_for_part2') format
@@ -735,13 +741,14 @@ NAME:
clickhouse-backup restore_remote - Download and restore
USAGE:
- clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>
+ clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>
OPTIONS:
--config value, -c value Config 'FILE' name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--environment-override value, --env value override any environment variable via CLI parameter
--table value, --tables value, -t value Download and restore objects which matched with table name patterns, separated by comma, allow ? and * as wildcard
--restore-database-mapping value, -m value Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.
+ --restore-table-mapping value, --tm value Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.
--partitions partition_id Download and restore backup only for selected partition names, separated by comma
If PARTITION BY clause returns numeric not hashed values for partition_id field in system.parts table, then use --partitions=partition_id1,partition_id2 format
If PARTITION BY clause returns hashed string values, then use --partitions=('non_numeric_field_value_for_part1'),('non_numeric_field_value_for_part2') format
29 changes: 19 additions & 10 deletions cmd/clickhouse-backup/main.go
@@ -6,15 +6,14 @@ import (
"os"
"strings"

- "github.com/Altinity/clickhouse-backup/v2/pkg/config"
- "github.com/Altinity/clickhouse-backup/v2/pkg/logcli"
- "github.com/Altinity/clickhouse-backup/v2/pkg/status"
+ "github.com/apex/log"
+ "github.com/urfave/cli"

"github.com/Altinity/clickhouse-backup/v2/pkg/backup"
+ "github.com/Altinity/clickhouse-backup/v2/pkg/config"
+ "github.com/Altinity/clickhouse-backup/v2/pkg/logcli"
"github.com/Altinity/clickhouse-backup/v2/pkg/server"
-
- "github.com/apex/log"
- "github.com/urfave/cli"
+ "github.com/Altinity/clickhouse-backup/v2/pkg/status"
)

var (
@@ -337,10 +336,10 @@ func main() {
{
Name: "restore",
Usage: "Create schema and restore data from backup",
- UsageText: "clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>",
+ UsageText: "clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>",
Action: func(c *cli.Context) error {
b := backup.NewBackuper(config.GetConfigFromCli(c))
- return b.Restore(c.Args().First(), c.String("t"), c.StringSlice("restore-database-mapping"), c.StringSlice("partitions"), c.Bool("schema"), c.Bool("data"), c.Bool("drop"), c.Bool("ignore-dependencies"), c.Bool("rbac"), c.Bool("rbac-only"), c.Bool("configs"), c.Bool("configs-only"), version, c.Int("command-id"))
+ return b.Restore(c.Args().First(), c.String("t"), c.StringSlice("restore-database-mapping"), c.StringSlice("restore-table-mapping"), c.StringSlice("partitions"), c.Bool("schema"), c.Bool("data"), c.Bool("drop"), c.Bool("ignore-dependencies"), c.Bool("rbac"), c.Bool("rbac-only"), c.Bool("configs"), c.Bool("configs-only"), version, c.Int("command-id"))
},
Flags: append(cliapp.Flags,
cli.StringFlag{
@@ -353,6 +352,11 @@ func main() {
Usage: "Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.",
Hidden: false,
},
+ cli.StringSliceFlag{
+ Name: "restore-table-mapping, tm",
+ Usage: "Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.",
+ Hidden: false,
+ },
cli.StringSliceFlag{
Name: "partitions",
Hidden: false,
@@ -409,10 +413,10 @@ func main() {
{
Name: "restore_remote",
Usage: "Download and restore",
- UsageText: "clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>",
+ UsageText: "clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>",
Action: func(c *cli.Context) error {
b := backup.NewBackuper(config.GetConfigFromCli(c))
- return b.RestoreFromRemote(c.Args().First(), c.String("t"), c.StringSlice("restore-database-mapping"), c.StringSlice("partitions"), c.Bool("s"), c.Bool("d"), c.Bool("rm"), c.Bool("i"), c.Bool("rbac"), c.Bool("rbac-only"), c.Bool("configs"), c.Bool("configs-only"), c.Bool("resume"), version, c.Int("command-id"))
+ return b.RestoreFromRemote(c.Args().First(), c.String("t"), c.StringSlice("restore-database-mapping"), c.StringSlice("restore-table-mapping"), c.StringSlice("partitions"), c.Bool("s"), c.Bool("d"), c.Bool("rm"), c.Bool("i"), c.Bool("rbac"), c.Bool("rbac-only"), c.Bool("configs"), c.Bool("configs-only"), c.Bool("resume"), version, c.Int("command-id"))
},
Flags: append(cliapp.Flags,
cli.StringFlag{
@@ -425,6 +429,11 @@ func main() {
Usage: "Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.",
Hidden: false,
},
+ cli.StringSliceFlag{
+ Name: "restore-table-mapping, tm",
+ Usage: "Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.",
+ Hidden: false,
+ },
cli.StringSliceFlag{
Name: "partitions",
Hidden: false,
16 changes: 8 additions & 8 deletions pkg/backup/create.go
@@ -5,9 +5,6 @@ import (
"encoding/json"
"errors"
"fmt"
- "github.com/Altinity/clickhouse-backup/v2/pkg/config"
- "github.com/Altinity/clickhouse-backup/v2/pkg/storage"
- "golang.org/x/sync/errgroup"
"os"
"path"
"path/filepath"
@@ -17,19 +14,22 @@ import (
"sync/atomic"
"time"

+ apexLog "github.com/apex/log"
+ "github.com/google/uuid"
+ recursiveCopy "github.com/otiai10/copy"
+ "golang.org/x/sync/errgroup"
+
"github.com/Altinity/clickhouse-backup/v2/pkg/clickhouse"
"github.com/Altinity/clickhouse-backup/v2/pkg/common"
+ "github.com/Altinity/clickhouse-backup/v2/pkg/config"
"github.com/Altinity/clickhouse-backup/v2/pkg/filesystemhelper"
"github.com/Altinity/clickhouse-backup/v2/pkg/keeper"
"github.com/Altinity/clickhouse-backup/v2/pkg/metadata"
"github.com/Altinity/clickhouse-backup/v2/pkg/partition"
"github.com/Altinity/clickhouse-backup/v2/pkg/status"
+ "github.com/Altinity/clickhouse-backup/v2/pkg/storage"
"github.com/Altinity/clickhouse-backup/v2/pkg/storage/object_disk"
"github.com/Altinity/clickhouse-backup/v2/pkg/utils"
-
- apexLog "github.com/apex/log"
- "github.com/google/uuid"
- recursiveCopy "github.com/otiai10/copy"
)

const (
@@ -255,7 +255,7 @@ func (b *Backuper) createBackupLocal(ctx context.Context, backupName, diffFromRe
var backupDataSize, backupObjectDiskSize, backupMetadataSize uint64
var metaMutex sync.Mutex
createBackupWorkingGroup, createCtx := errgroup.WithContext(ctx)
- createBackupWorkingGroup.SetLimit(max(b.cfg.ClickHouse.MaxConnections,1))
+ createBackupWorkingGroup.SetLimit(max(b.cfg.ClickHouse.MaxConnections, 1))

var tableMetas []metadata.TableTitle
for tableIdx, tableItem := range tables {