dumpling: update the gc info (#18433) #18435

Merged
2 changes: 1 addition & 1 deletion develop/dev-guide-timeouts-in-tidb.md
@@ -30,7 +30,7 @@ Note that the system variable configuration takes effect globally and immediatel

> **Tip:**
>
-> Specifically, when Dumpling is exporting data from TiDB (less than 1 TB), if the TiDB version is later than or equal to v4.0.0 and Dumpling can access the PD address of the TiDB cluster, Dumpling automatically extends the GC time without affecting the original cluster.
+> Specifically, when Dumpling is exporting data from TiDB (less than 1 TB), if the TiDB version is v4.0.0 or later and Dumpling can access the PD address and the [`INFORMATION_SCHEMA.CLUSTER_INFO`](/information-schema/information-schema-cluster-info.md) table of the TiDB cluster, Dumpling automatically adjusts the GC safe point to block GC without affecting the original cluster.
>
> However, in either of the following scenarios, Dumpling cannot automatically adjust the GC time:
>
4 changes: 2 additions & 2 deletions dumpling-overview.md
@@ -54,7 +54,7 @@ Compared to Mydumper, Dumpling has the following improvements:
- Support exporting data to Amazon S3 cloud storage.
- More optimizations are made for TiDB:
- Support configuring the memory limit of a single TiDB SQL statement.
-- If Dumpling can connect directly to PD, Dumpling supports automatic adjustment of TiDB GC time for TiDB v4.0.0 and later versions.
+- If Dumpling can access the PD address and the [`INFORMATION_SCHEMA.CLUSTER_INFO`](/information-schema/information-schema-cluster-info.md) table of the TiDB cluster, Dumpling supports automatically adjusting the [GC](/garbage-collection-overview.md) safe point time to block GC for TiDB v4.0.0 and later versions.
- Use TiDB's hidden column `_tidb_rowid` to optimize the performance of concurrent data export from a single table.
- For TiDB, you can set the value of [`tidb_snapshot`](/read-historical-data.md#how-tidb-reads-data-from-history-versions) to specify the time point of the data backup. This ensures the consistency of the backup, instead of using `FLUSH TABLES WITH READ LOCK` to ensure the consistency.
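
For illustration, the `tidb_snapshot` mechanism mentioned in the last item can be tried by hand in a SQL session. This is a minimal sketch, not what Dumpling runs internally: the timestamp and the table name `test.t` are hypothetical, and the chosen time point must still fall within the cluster's GC retention window.

```sql
-- Read data as of a past time point (hypothetical timestamp).
SET @@tidb_snapshot = '2024-01-01 00:00:00';

-- Queries in this session now see the data as it was at that time.
SELECT * FROM test.t;  -- hypothetical table

-- Clear the variable to read the latest data again.
SET @@tidb_snapshot = '';
```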

@@ -355,7 +355,7 @@ When Dumpling is exporting a large single table from TiDB, Out of Memory (OOM) m

### Manually set the TiDB GC time

-When exporting data from TiDB (less than 1 TB), if the TiDB version is later than or equal to v4.0.0 and Dumpling can access the PD address of the TiDB cluster, Dumpling automatically extends the GC time without affecting the original cluster.
+When exporting data from TiDB (less than 1 TB), if the TiDB version is v4.0.0 or later and Dumpling can access the PD address and the [`INFORMATION_SCHEMA.CLUSTER_INFO`](/information-schema/information-schema-cluster-info.md) table of the TiDB cluster, Dumpling automatically adjusts the GC safe point to block GC without affecting the original cluster.

However, in either of the following scenarios, Dumpling cannot automatically adjust the GC time:
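
The scenarios themselves are collapsed in this diff view. Where automatic adjustment does not apply, the section heading points to extending the GC time by hand. A minimal sketch, assuming TiDB v5.0 or later (where `tidb_gc_life_time` is exposed as a system variable) and an illustrative retention of `720h` that should be sized to the expected export duration:

```sql
-- Check the current GC retention window.
SELECT @@global.tidb_gc_life_time;

-- Extend it before starting the export (720h is an illustrative value).
SET GLOBAL tidb_gc_life_time = '720h';

-- After the export finishes, restore the previous value (10m is the default).
SET GLOBAL tidb_gc_life_time = '10m';
```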

2 changes: 1 addition & 1 deletion migrate-from-tidb-to-mysql.md
@@ -56,7 +56,7 @@ After setting up the environment, you can use [Dumpling](/dumpling-overview.md)

1. Disable Garbage Collection (GC).

-To ensure that newly written data is not deleted during incremental migration, you should disable GC for the upstream cluster before exporting full data. In this way, history data is not deleted.
+To ensure that newly written data is not deleted during incremental migration, you should disable GC for the upstream cluster before exporting full data. In this way, history data is not deleted. For TiDB v4.0.0 and later versions, Dumpling might [automatically adjust the GC safe point to block GC](/dumpling-overview.md#manually-set-the-tidb-gc-time). Nevertheless, manually disabling GC is still necessary because the GC process might begin after Dumpling exits, leading to the failure of incremental changes migration.

Run the following command to disable GC:
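
The command is collapsed in this hunk. As a sketch only, assuming TiDB v5.0 or later (where the `tidb_gc_enable` system variable is available), disabling GC and confirming the change might look like this:

```sql
-- Disable GC on the upstream cluster before the full export.
SET GLOBAL tidb_gc_enable = FALSE;

-- Confirm the setting; 0 means GC is disabled.
SELECT @@global.tidb_gc_enable;
```

GC should be turned back on with `SET GLOBAL tidb_gc_enable = TRUE;` once the incremental migration no longer depends on the retained history.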
