title | summary |
---|---|
Continuous Replication from Databases that Use gh-ost or pt-osc | Learn how to use DM to replicate incremental data from databases that use online DDL tools gh-ost or pt-osc |
In production scenarios, table locking during DDL execution can block reads from or writes to the database to some extent. Online DDL tools are therefore often used to execute DDL statements with minimal impact on reads and writes. Two common online DDL tools are gh-ost and pt-osc.
When using DM to migrate data from MySQL to TiDB, you can enable `online-ddl` so that DM works together with gh-ost or pt-osc.
For detailed replication instructions, refer to the document that matches your scenario:
- Migrate Small Datasets from MySQL to TiDB
- Migrate Large Datasets from MySQL to TiDB
- Migrate and Merge MySQL Shards of Small Datasets to TiDB
- Migrate and Merge MySQL Shards of Large Datasets to TiDB
In the DM task configuration file, set the global parameter `online-ddl` to `true`, as shown below:
```yaml
# ----------- Global configuration -----------
## ********* Basic configuration *********
name: test                      # The name of the task. Should be globally unique.
task-mode: all                  # The task mode. Can be set to `full`, `incremental`, or `all`.
shard-mode: "pessimistic"       # The shard merge mode. Optional modes are `pessimistic` and `optimistic`. The `pessimistic` mode is used by default. After understanding the principles and restrictions of the "optimistic" mode, you can set it to the "optimistic" mode.
meta-schema: "dm_meta"          # The downstream database that stores the `meta` information.
online-ddl: true                # Enable online-ddl support on DM to support automatic processing of "gh-ost" and "pt-osc" for the upstream database.
```
After `online-ddl` is enabled, DM changes how it replicates the DDL statements generated by gh-ost or pt-osc.
The workflow of gh-ost or pt-osc:
- Create a ghost table with the same schema as the real table that the DDL targets.
- Apply the DDLs to the ghost table.
- Replicate the data from the real table to the ghost table.
- After the data in the two tables is consistent, use a rename statement to replace the real table with the ghost table.
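The four steps above can be sketched as a simplified statement sequence. This is only an illustration, not what the tools literally run: the helper `online_ddl_statements` and its parameters are hypothetical, the `_<table>_gho` and `_<table>_del` names follow gh-ost's naming convention (pt-osc uses different names), and the real tools copy rows in chunks while also applying ongoing changes, rather than issuing a single `INSERT ... SELECT`.

```python
def online_ddl_statements(table, alter_clause, shared_columns):
    """Return a simplified statement sequence for a gh-ost-style schema change.

    Hypothetical helper for illustration only.
    """
    ghost = f"_{table}_gho"  # ghost table (gh-ost naming convention)
    trash = f"_{table}_del"  # name the old real table gets after the swap
    cols = ", ".join(f"`{c}`" for c in shared_columns)
    return [
        # 1. Create the ghost table from the real table's schema.
        f"CREATE TABLE `{ghost}` LIKE `{table}`",
        # 2. Apply the DDL to the ghost table, not to the real table.
        f"ALTER TABLE `{ghost}` {alter_clause}",
        # 3. Copy the data (simplified to one statement over shared columns).
        f"INSERT IGNORE INTO `{ghost}` ({cols}) SELECT {cols} FROM `{table}`",
        # 4. Swap the tables with a rename.
        f"RENAME TABLE `{table}` TO `{trash}`, `{ghost}` TO `{table}`",
    ]


for stmt in online_ddl_statements("t1", "ADD COLUMN c INT", ["id", "name"]):
    print(stmt + ";")
```

Only the final rename touches the real table's name, which is why reads and writes on the real table are affected so briefly.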
The workflow of DM:
- Skip creating the ghost table downstream.
- Record the DDLs applied to the ghost table.
- Replicate data only from the real table.
- Apply the recorded DDLs downstream.
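The DM side of the workflow can be modeled with a small toy tracker. This is a sketch under stated assumptions, not DM's implementation: the class `OnlineDDLTracker` and its methods are hypothetical, the table-name patterns assume gh-ost's `_<table>_gho`, `_<table>_ghc`, and `_<table>_del` naming, and the model assumes that row changes on ghost tables are dropped (which is what allows the downstream to avoid storing the ghost table) and that the recorded DDLs are replayed against the real table when the rename arrives.

```python
import re

# Assumption: gh-ost naming. Ghost table: _<table>_gho.
# Changelog and old ("trash") tables: _<table>_ghc and _<table>_del.
GHOST = re.compile(r"^_(.+)_gho$")
TRASH = re.compile(r"^_(.+)_(ghc|del)$")


class OnlineDDLTracker:
    """Toy model of the workflow described above; not DM's actual code."""

    def __init__(self):
        self.recorded = {}    # real table name -> DDLs seen on its ghost table
        self.downstream = []  # statements that would be executed downstream

    def on_ddl(self, table, ddl):
        if TRASH.match(table):
            return  # trash tables never reach the downstream
        ghost = GHOST.match(table)
        if ghost is None:
            self.downstream.append(ddl)  # ordinary DDL on a real table
        elif not ddl.upper().startswith("CREATE TABLE"):
            # Skip creating the ghost table; record every other DDL on it.
            self.recorded.setdefault(ghost.group(1), []).append(ddl)

    def on_dml(self, table, dml):
        if GHOST.match(table) or TRASH.match(table):
            return  # row changes on ghost and trash tables are dropped
        self.downstream.append(dml)

    def on_rename(self, ghost_table, real_table):
        # The swap: replay the recorded DDLs against the real table.
        for ddl in self.recorded.pop(real_table, []):
            self.downstream.append(ddl.replace(ghost_table, real_table))


tracker = OnlineDDLTracker()
tracker.on_ddl("_t1_gho", "CREATE TABLE `_t1_gho` LIKE `t1`")
tracker.on_ddl("_t1_gho", "ALTER TABLE `_t1_gho` ADD COLUMN c INT")
tracker.on_dml("_t1_gho", "INSERT INTO `_t1_gho` (`id`) VALUES (1)")
tracker.on_dml("t1", "INSERT INTO `t1` (`id`) VALUES (1)")
tracker.on_rename("_t1_gho", "t1")
print(tracker.downstream)
```

In this model the downstream sees only the insert into `t1` followed by `ALTER TABLE` on `t1`. The ghost table is never created and none of its rows are transferred.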
The change in the workflow brings the following advantages:
- The downstream TiDB does not need to create the ghost table or replicate its data, which saves storage space and network transmission overhead.
- When you migrate and merge data from sharded tables, the RENAME operation is ignored for each sharded ghost table to ensure the correctness of the replication.