restore: update the definition of the parameter --load-stats and the usage of pitr id map #21078
@@ -89,7 +89,11 @@ During the initial restore, `br` first enters the snapshot restore phase. BR rec

 When entering the log restore phase during the initial restore, `br` creates a `__TiDB_BR_Temporary_Log_Restore_Checkpoint` database in the target cluster. This database records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`). If restore fails during this phase, you need to specify the same `start-ts` and `restored-ts` as recorded in the checkpoint database when retrying. Otherwise, `br` will report an error and prompt that the current specified restore time range or upstream cluster ID is different from the checkpoint record. If the restore cluster has been cleaned, you can manually delete the `__TiDB_BR_Temporary_Log_Restore_Checkpoint` database and retry with a different backup.

-Before entering the log restore phase during the initial restore, `br` constructs a mapping of upstream and downstream cluster database and table IDs at the `restored-ts` time point. This mapping is persisted in the system table `mysql.tidb_pitr_id_map` to prevent duplicate allocation of database and table IDs. Deleting data from `mysql.tidb_pitr_id_map` might lead to inconsistent PITR restore data.
+Before entering the log restore phase during the initial restore, `br` constructs a mapping of upstream and downstream cluster database and table IDs at the `restored-ts` time point. This mapping is persisted in the system table `mysql.tidb_pitr_id_map` to prevent duplicate allocation of database and table IDs. **Deleting data from `mysql.tidb_pitr_id_map` at will might lead to inconsistent PITR restore data.**
+
+> **Note:**
+>
+> For compatibility with clusters of earlier versions, starting from v9.0.0, if the system table `mysql.tidb_pitr_id_map` does not exist in the restore cluster, the `pitr_id_map` data is written to the log backup directory with the file name `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}`.

 ## Implementation details: store checkpoint data in the external storage

@@ -151,4 +155,4 @@ During the initial restore, `br` first enters the snapshot restore phase. BR rec

 When entering the log restore phase during the initial restore, `br` creates a `restore-{downstream-cluster-ID}/log` path in the target cluster. This path records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`). If restore fails during this phase, you need to specify the same `start-ts` and `restored-ts` as recorded in the checkpoint database when retrying. Otherwise, `br` will report an error and prompt that the current specified restore time range or upstream cluster ID is different from the checkpoint record. If the restore cluster has been cleaned, you can manually clean up the checkpoint data in the external storage or specify another external storage path to store checkpoint data, and retry with a different backup.

-Before entering the log restore phase during the initial restore, `br` constructs a mapping of the database and table IDs in the upstream and downstream clusters at the `restored-ts` time point. This mapping is persisted in the system table `mysql.tidb_pitr_id_map` to prevent duplicate allocation of database and table IDs. Deleting data from `mysql.tidb_pitr_id_map` might lead to inconsistent PITR restore data.
+Before entering the log restore phase during the initial restore, `br` constructs a mapping of the database and table IDs in the upstream and downstream clusters at the `restored-ts` time point. This mapping is persisted in the checkpoint storage with the file name `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}` to prevent duplicate allocation of database and table IDs. **Deleting files from the directory `pitr_id_maps` at will might lead to inconsistent PITR restore data.**

Review comment: Similar to the previous comment, the phrase "at will" could be replaced with a more formal term like "arbitrarily" or "without understanding the implications" for better clarity.

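To make the file-name pattern quoted in the changed text concrete, the following is a minimal shell sketch that fills in the two placeholders. It only formats a string. `pitr_id_map_path` is a hypothetical helper name, not part of `br`, and the cluster ID and timestamp in the usage note are made-up values.

```shell
# Sketch only: compose the pitr_id_map file name from its two placeholders.
# pitr_id_map_path is a hypothetical helper, not a br command.
pitr_id_map_path() {
  local cluster_id="$1"   # downstream (restore target) cluster ID
  local restored_ts="$2"  # the restored-ts used for the restore
  printf 'pitr_id_maps/pitr_id_map.cluster_id:%s.restored_ts:%s\n' \
    "$cluster_id" "$restored_ts"
}
```

For example, `pitr_id_map_path 7431 449300000000000000` prints the relative path of the map file for that cluster and timestamp, which can help when checking whether a map file already exists under the storage prefix.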
@@ -127,8 +127,21 @@ tiup br restore full \
    --storage local:///br_data/ --pd "${PD_IP}:2379" --log-file restore.log
```

> **Note:**
>
> Starting from v9.0.0, when the parameter `--load-stats` is set to `false`, `br` does not update the information about the restored tables in the `mysql.stats_meta` table. After the restore is complete, you can manually execute `ANALYZE TABLE` to update the statistics.

When the backup and restore feature backs up data, it stores statistics in JSON format within the `backupmeta` file. When restoring data, it loads statistics in JSON format into the cluster. For more information, see [LOAD STATS](/sql-statements/sql-statement-load-stats.md).

Starting from v9.0.0, BR introduces the parameter `--fast-load-sys-tables`, which is enabled by default. When `br` restores data to a new cluster and the IDs of tables and partitions can be reused, `br` with `--fast-load-sys-tables` uses the `RENAME TABLE` DDL statement to atomically swap the system tables in the database `__TiDB_BR_Temporary_mysql` with the system tables in the database `mysql`. If the IDs cannot be reused, `br` automatically falls back to loading statistics logically.

Review comment: This paragraph introduces the …

The following is an example:

```shell
tiup br restore full \
    --storage local:///br_data/ --pd "${PD_IP}:2379" --log-file restore.log --load-stats --fast-load-sys-tables
```

## Encrypt the backup data

BR supports encrypting backup data at the backup side and [at the storage side when backing up to Amazon S3](/br/backup-and-restore-storages.md#amazon-s3-server-side-encryption). You can choose either encryption method as required.

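The note on `--load-stats` in the hunk above says that statistics can be refreshed manually with `ANALYZE TABLE` after the restore. A minimal sketch of that follow-up step is shown below. It only generates the SQL text. `analyze_statements` is a hypothetical helper name, and the table names passed to it are placeholders for the tables you actually restored.

```shell
# Sketch only: print one ANALYZE TABLE statement per "db.table" argument.
# analyze_statements is a hypothetical helper, not part of br or TiDB.
analyze_statements() {
  local name
  for name in "$@"; do
    # Split "db.table" at the first dot and quote both parts with backticks.
    printf 'ANALYZE TABLE `%s`.`%s`;\n' "${name%%.*}" "${name#*.}"
  done
}
```

One possible use, assuming the restore was run with `--load-stats=false` (the exact flag spelling is an assumption based on the note) and that a MySQL-compatible client can reach the cluster, is to pipe the output into the client, for example `analyze_statements test.t1 test.t2 | mysql -h "${TIDB_IP}" -P 4000 -u root`. The host variable, port, and user here are placeholders.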
@@ -181,6 +194,22 @@ Download&Ingest SST <-----------------------------------------------------------
Restore Pipeline <-------------------------/...............................................> 17.12%
```

Starting from TiDB v9.0.0, BR lets you specify `--fast-load-sys-tables` to restore statistics physically to a new cluster:

```shell
tiup br restore full \
    --pd "${PD_IP}:2379" \
    --with-sys-table \
    --fast-load-sys-tables \
    --storage "s3://${backup_collection_addr}/snapshot-${date}?access-key=${access-key}&secret-access-key=${secret-access-key}" \
    --ratelimit 128 \
    --log-file restorefull.log
```

> **Note:**
>
> Unlike restoring system tables logically, which uses `REPLACE INTO` statements, restoring system tables physically completely overwrites the existing data in the system tables.

Review comment on lines +197 to +211: This section introduces the …

## Restore a database or a table

You can use `br` to restore partial data of a specified database or table from backup data. This feature allows you to filter out data that you do not need during the restore.

Review comment: The phrase "at will" is a bit informal. Could we use a more formal or precise term here, such as "arbitrarily" or "without understanding the implications"? This would align with maintaining clarity and simplicity in technical documentation [1].

Style Guide References

[1] Maintain clarity and simplicity.