Implement --restore-table-mapping for restore and restore_remote #944

Merged
merged 9 commits into from
Jul 4, 2024
6 changes: 4 additions & 2 deletions Manual.md
@@ -147,13 +147,14 @@ NAME:
clickhouse-backup restore - Create schema and restore data from backup

USAGE:
clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>
clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [-tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>

OPTIONS:
--config value, -c value Config 'FILE' name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--environment-override value, --env value override any environment variable via CLI parameter
--table value, --tables value, -t value Restore only database and objects which matched with table name patterns, separated by comma, allow ? and * as wildcard
--restore-database-mapping value, -m value Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.
--restore-table-mapping value, -tm value Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.
--partitions partition_id Restore backup only for selected partition names, separated by comma
If PARTITION BY clause returns numeric not hashed values for partition_id field in system.parts table, then use --partitions=partition_id1,partition_id2 format
If PARTITION BY clause returns hashed string values, then use --partitions=('non_numeric_field_value_for_part1'),('non_numeric_field_value_for_part2') format
@@ -177,13 +178,14 @@ NAME:
clickhouse-backup restore_remote - Download and restore

USAGE:
clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>
clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [-tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>

OPTIONS:
--config value, -c value Config 'FILE' name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--environment-override value, --env value override any environment variable via CLI parameter
--table value, --tables value, -t value Download and restore objects which matched with table name patterns, separated by comma, allow ? and * as wildcard
--restore-database-mapping value, -m value Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.
--restore-table-mapping value, -tm value Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.
--partitions partition_id Download and restore backup only for selected partition names, separated by comma
If PARTITION BY clause returns numeric not hashed values for partition_id field in system.parts table, then use --partitions=partition_id1,partition_id2 format
If PARTITION BY clause returns hashed string values, then use --partitions=('non_numeric_field_value_for_part1'),('non_numeric_field_value_for_part2') format
11 changes: 9 additions & 2 deletions ReadMe.md
@@ -119,6 +119,10 @@ general:
# RESTORE_DATABASE_MAPPING, restore rules from backup databases to target databases, which is useful when changing destination database, all atomic tables will be created with new UUIDs.
# The format for this env variable is "src_db1:target_db1,src_db2:target_db2". For YAML please continue using map syntax
restore_database_mapping: {}

# RESTORE_TABLE_MAPPING, restore rules from backup tables to target tables, which is useful when changing destination tables.
# The format for this env variable is "src_table1:target_table1,src_table2:target_table2". For YAML please continue using map syntax
restore_table_mapping: {}
retries_on_failure: 3 # RETRIES_ON_FAILURE, how many times to retry after a failure during upload or download
retries_pause: 30s # RETRIES_PAUSE, duration time to pause after each download or upload failure
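The environment-variable format described above ("src_table1:target_table1,src_table2:target_table2") can be turned into the map form that the YAML setting uses. The following is a minimal sketch under stated assumptions: `parseTableMapping` is a hypothetical helper written for illustration and is not taken from the clickhouse-backup source, and rejecting malformed pairs with an error is a design choice of this sketch, not confirmed project behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTableMapping is a hypothetical helper (not part of clickhouse-backup).
// It converts "src1:dst1,src2:dst2" into map[src]dst.
// An empty input yields an empty map; a malformed pair yields an error.
func parseTableMapping(s string) (map[string]string, error) {
	mapping := make(map[string]string)
	if strings.TrimSpace(s) == "" {
		return mapping, nil
	}
	for _, pair := range strings.Split(s, ",") {
		parts := strings.Split(strings.TrimSpace(pair), ":")
		// Exactly one source and one target name are expected per pair.
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			return nil, fmt.Errorf("invalid mapping %q, expected src_table:target_table", pair)
		}
		mapping[parts[0]] = parts[1]
	}
	return mapping, nil
}

func main() {
	m, err := parseTableMapping("src_table1:target_table1,src_table2:target_table2")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m["src_table1"], m["src_table2"])
}
```

Failing on a malformed pair, rather than silently skipping it, keeps a typo in the mapping from restoring data under the original table name unnoticed.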

@@ -476,6 +480,7 @@ Create schema and restore data from backup: `curl -s localhost:7171/backup/resto
- Optional query argument `rbac` works the same as the `--rbac` CLI argument (restore RBAC).
- Optional query argument `configs` works the same as the `--configs` CLI argument (restore configs).
- Optional query argument `restore_database_mapping` works the same as the `--restore-database-mapping` CLI argument.
- Optional query argument `restore_table_mapping` works the same as the `--restore-table-mapping` CLI argument.
- Optional query argument `callback` accepts a callback URL, which is called via POST with an `application/json` payload `{"status":"error|success","error":"not empty when error happens"}`.
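When a mapping is passed as the `restore_table_mapping` query argument, the `:` and `,` characters need URL encoding. The sketch below only builds the encoded query string. `buildRestoreQuery` is a hypothetical helper, and passing several comma-separated pairs in a single parameter is an assumption based on the CLI format, not something this change confirms for the API.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildRestoreQuery is a hypothetical helper (not part of clickhouse-backup).
// It URL-encodes a table mapping as the restore_table_mapping query argument.
func buildRestoreQuery(tableMapping string) string {
	q := url.Values{}
	q.Set("restore_table_mapping", tableMapping)
	return q.Encode()
}

func main() {
	// The resulting string is appended after "?" to the restore endpoint URL.
	fmt.Println(buildRestoreQuery("src_table1:target_table1"))
}
```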

### POST /backup/delete
@@ -705,13 +710,14 @@ NAME:
clickhouse-backup restore - Create schema and restore data from backup

USAGE:
clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>
clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [-tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>

OPTIONS:
--config value, -c value Config 'FILE' name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--environment-override value, --env value override any environment variable via CLI parameter
--table value, --tables value, -t value Restore only database and objects which matched with table name patterns, separated by comma, allow ? and * as wildcard
--restore-database-mapping value, -m value Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.
--restore-table-mapping value, -tm value Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.
--partitions partition_id Restore backup only for selected partition names, separated by comma
If PARTITION BY clause returns numeric not hashed values for partition_id field in system.parts table, then use --partitions=partition_id1,partition_id2 format
If PARTITION BY clause returns hashed string values, then use --partitions=('non_numeric_field_value_for_part1'),('non_numeric_field_value_for_part2') format
@@ -735,13 +741,14 @@ NAME:
clickhouse-backup restore_remote - Download and restore

USAGE:
clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>
clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [-tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>

OPTIONS:
--config value, -c value Config 'FILE' name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--environment-override value, --env value override any environment variable via CLI parameter
--table value, --tables value, -t value Download and restore objects which matched with table name patterns, separated by comma, allow ? and * as wildcard
--restore-database-mapping value, -m value Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.
--restore-table-mapping value, -tm value Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.
--partitions partition_id Download and restore backup only for selected partition names, separated by comma
If PARTITION BY clause returns numeric not hashed values for partition_id field in system.parts table, then use --partitions=partition_id1,partition_id2 format
If PARTITION BY clause returns hashed string values, then use --partitions=('non_numeric_field_value_for_part1'),('non_numeric_field_value_for_part2') format
27 changes: 18 additions & 9 deletions cmd/clickhouse-backup/main.go
@@ -6,15 +6,14 @@ import (
"os"
"strings"

"github.com/Altinity/clickhouse-backup/v2/pkg/config"
"github.com/Altinity/clickhouse-backup/v2/pkg/logcli"
"github.com/Altinity/clickhouse-backup/v2/pkg/status"
"github.com/apex/log"
"github.com/urfave/cli"

"github.com/Altinity/clickhouse-backup/v2/pkg/backup"
"github.com/Altinity/clickhouse-backup/v2/pkg/config"
"github.com/Altinity/clickhouse-backup/v2/pkg/logcli"
"github.com/Altinity/clickhouse-backup/v2/pkg/server"

"github.com/apex/log"
"github.com/urfave/cli"
"github.com/Altinity/clickhouse-backup/v2/pkg/status"
)

var (
@@ -340,7 +339,7 @@ func main() {
UsageText: "clickhouse-backup restore [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [-tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [-s, --schema] [-d, --data] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] <backup_name>",
Action: func(c *cli.Context) error {
b := backup.NewBackuper(config.GetConfigFromCli(c))
return b.Restore(c.Args().First(), c.String("t"), c.StringSlice("restore-database-mapping"), c.StringSlice("partitions"), c.Bool("schema"), c.Bool("data"), c.Bool("drop"), c.Bool("ignore-dependencies"), c.Bool("rbac"), c.Bool("rbac-only"), c.Bool("configs"), c.Bool("configs-only"), version, c.Int("command-id"))
return b.Restore(c.Args().First(), c.String("t"), c.StringSlice("restore-database-mapping"), c.StringSlice("restore-table-mapping"), c.StringSlice("partitions"), c.Bool("schema"), c.Bool("data"), c.Bool("drop"), c.Bool("ignore-dependencies"), c.Bool("rbac"), c.Bool("rbac-only"), c.Bool("configs"), c.Bool("configs-only"), version, c.Int("command-id"))
},
Flags: append(cliapp.Flags,
cli.StringFlag{
@@ -353,6 +352,11 @@ func main() {
Usage: "Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.",
Hidden: false,
},
cli.StringSliceFlag{
Name: "restore-table-mapping, tm",
Usage: "Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.",
Hidden: false,
},
cli.StringSliceFlag{
Name: "partitions",
Hidden: false,
@@ -409,10 +413,10 @@ func main() {
{
Name: "restore_remote",
Usage: "Download and restore",
UsageText: "clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>",
UsageText: "clickhouse-backup restore_remote [--schema] [--data] [-t, --tables=<db>.<table>] [-m, --restore-database-mapping=<originDB>:<targetDB>[,<...>]] [-tm, --restore-table-mapping=<originTable>:<targetTable>[,<...>]] [--partitions=<partitions_names>] [--rm, --drop] [-i, --ignore-dependencies] [--rbac] [--configs] [--skip-rbac] [--skip-configs] [--resumable] <backup_name>",
Action: func(c *cli.Context) error {
b := backup.NewBackuper(config.GetConfigFromCli(c))
return b.RestoreFromRemote(c.Args().First(), c.String("t"), c.StringSlice("restore-database-mapping"), c.StringSlice("partitions"), c.Bool("s"), c.Bool("d"), c.Bool("rm"), c.Bool("i"), c.Bool("rbac"), c.Bool("rbac-only"), c.Bool("configs"), c.Bool("configs-only"), c.Bool("resume"), version, c.Int("command-id"))
return b.RestoreFromRemote(c.Args().First(), c.String("t"), c.StringSlice("restore-database-mapping"), c.StringSlice("restore-table-mapping"), c.StringSlice("partitions"), c.Bool("s"), c.Bool("d"), c.Bool("rm"), c.Bool("i"), c.Bool("rbac"), c.Bool("rbac-only"), c.Bool("configs"), c.Bool("configs-only"), c.Bool("resume"), version, c.Int("command-id"))
},
Flags: append(cliapp.Flags,
cli.StringFlag{
@@ -425,6 +429,11 @@ func main() {
Usage: "Define the rule to restore data. For the database not defined in this struct, the program will not deal with it.",
Hidden: false,
},
cli.StringSliceFlag{
Name: "restore-table-mapping, tm",
Usage: "Define the rule to restore data. For the table not defined in this struct, the program will not deal with it.",
Hidden: false,
},
cli.StringSliceFlag{
Name: "partitions",
Hidden: false,
16 changes: 8 additions & 8 deletions pkg/backup/create.go
@@ -5,9 +5,6 @@ import (
"encoding/json"
"errors"
"fmt"
"github.com/Altinity/clickhouse-backup/v2/pkg/config"
"github.com/Altinity/clickhouse-backup/v2/pkg/storage"
"golang.org/x/sync/errgroup"
"os"
"path"
"path/filepath"
@@ -17,19 +14,22 @@ import (
"sync/atomic"
"time"

apexLog "github.com/apex/log"
"github.com/google/uuid"
recursiveCopy "github.com/otiai10/copy"
"golang.org/x/sync/errgroup"

"github.com/Altinity/clickhouse-backup/v2/pkg/clickhouse"
"github.com/Altinity/clickhouse-backup/v2/pkg/common"
"github.com/Altinity/clickhouse-backup/v2/pkg/config"
"github.com/Altinity/clickhouse-backup/v2/pkg/filesystemhelper"
"github.com/Altinity/clickhouse-backup/v2/pkg/keeper"
"github.com/Altinity/clickhouse-backup/v2/pkg/metadata"
"github.com/Altinity/clickhouse-backup/v2/pkg/partition"
"github.com/Altinity/clickhouse-backup/v2/pkg/status"
"github.com/Altinity/clickhouse-backup/v2/pkg/storage"
"github.com/Altinity/clickhouse-backup/v2/pkg/storage/object_disk"
"github.com/Altinity/clickhouse-backup/v2/pkg/utils"

apexLog "github.com/apex/log"
"github.com/google/uuid"
recursiveCopy "github.com/otiai10/copy"
)

const (
@@ -255,7 +255,7 @@ func (b *Backuper) createBackupLocal(ctx context.Context, backupName, diffFromRe
var backupDataSize, backupObjectDiskSize, backupMetadataSize uint64
var metaMutex sync.Mutex
createBackupWorkingGroup, createCtx := errgroup.WithContext(ctx)
createBackupWorkingGroup.SetLimit(max(b.cfg.ClickHouse.MaxConnections,1))
createBackupWorkingGroup.SetLimit(max(b.cfg.ClickHouse.MaxConnections, 1))

var tableMetas []metadata.TableTitle
for tableIdx, tableItem := range tables {