From a6eea9b049456ee88da442ba0d739426edc1cd1d Mon Sep 17 00:00:00 2001 From: github-actions Date: Thu, 29 Jun 2023 09:16:38 +0000 Subject: [PATCH] Update files from https://github.com/pingcap/docs/pull/13947/commits/2a3a7af8788fa91fa6745335c80cd9f31bef3bed --- markdown-pages/en/tidb/master/_docHome.md | 42 +- .../en/tidb/master/quick-start-with-tidb.md | 515 ++++++++++++++++++ .../sql-statement-flush-privileges.md | 48 ++ .../en/tidb/master/ticdc/ticdc-overview.md | 98 ++++ 4 files changed, 682 insertions(+), 21 deletions(-) create mode 100644 markdown-pages/en/tidb/master/quick-start-with-tidb.md create mode 100644 markdown-pages/en/tidb/master/sql-statements/sql-statement-flush-privileges.md create mode 100644 markdown-pages/en/tidb/master/ticdc/ticdc-overview.md diff --git a/markdown-pages/en/tidb/master/_docHome.md b/markdown-pages/en/tidb/master/_docHome.md index d376c34d3e..6276229b9c 100644 --- a/markdown-pages/en/tidb/master/_docHome.md +++ b/markdown-pages/en/tidb/master/_docHome.md @@ -9,33 +9,33 @@ hide_leftNav: true -TiDB Cloud is a fully-managed Database-as-a-Service (DBaaS) that brings everything great about TiDB to your cloud, and lets you focus on your applications, not the complexities of your database. +TiDB Cloud is a fully-managed Database-as-a-Service (DBaaS) that brings everything great about TiDB to your cloud, allowing you to focus on your applications instead of the complexities of your database. -See the documentation of TiDB Cloud +View the documentation for TiDB Cloud. -Guides you through an easy way to get started with TiDB Cloud +Guide for an easy way to get started with TiDB Cloud. -Helps you quickly complete a Proof of Concept (PoC) of TiDB Cloud +Helps you quickly complete a Proof of Concept (PoC) with TiDB Cloud. -Get the power of a cloud-native, distributed SQL database built for real-time analytics in a fully-managed service. 
+Experience the power of a cloud-native, distributed SQL database built for real-time analytics in a fully-managed service. -Try Free +Try for Free @@ -55,25 +55,25 @@ TiDB is an open-source distributed SQL database that supports Hybrid Transaction -See the documentation of TiDB +View the documentation for TiDB. -Walks you through the quickest way to get started with TiDB +Walks you through the quickest way to get started with TiDB. -Learn how to deploy TiDB locally in production +Learn how to deploy TiDB locally in a production environment. -The open-source TiDB platform is released under the Apache 2.0 license, and supported by the community. +The open-source TiDB platform is released under the Apache 2.0 license and is supported by the community. Download @@ -85,13 +85,13 @@ The open-source TiDB platform is released under the Apache 2.0 license, and supp -Documentation for TiDB application developers +Documentation for TiDB application developers. -Documentation for TiDB Cloud application developers +Documentation for TiDB Cloud application developers. @@ -105,55 +105,55 @@ Documentation for TiDB Cloud application developers -One-stop and interactive experience of TiDB's capabilities WITHOUT registration +Experience the capabilities of TiDB without registration. -Learn TiDB and TiDB Cloud through well-designed online courses and instructor-led training +Learn TiDB and TiDB Cloud through well-designed online courses and instructor-led training. -Join us on Slack or become a contributor +Join us on Slack or become a contributor. -Learn great articles about TiDB and TiDB Cloud +Read great articles about TiDB and TiDB Cloud. -See a compilation of short videos describing TiDB and a variety of use cases +Watch a compilation of short videos describing TiDB and various use cases. -Learn events about PingCAP and the community +Learn about events hosted by PingCAP and the community. -Download eBooks and papers +Download eBooks and papers. 
-A powerful insight tool that analyzes in depth any GitHub repository, powered by TiDB Cloud +A powerful insight tool that analyzes any GitHub repository in depth, powered by TiDB Cloud. -Let’s work together to make the documentation better! +Let's work together to improve the documentation! diff --git a/markdown-pages/en/tidb/master/quick-start-with-tidb.md b/markdown-pages/en/tidb/master/quick-start-with-tidb.md new file mode 100644 index 0000000000..7337d27681 --- /dev/null +++ b/markdown-pages/en/tidb/master/quick-start-with-tidb.md @@ -0,0 +1,515 @@ +--- +title: Quick Start Guide for the TiDB Database Platform +summary: Learn how to quickly get started with the TiDB platform and see if TiDB is the right choice for you. +aliases: ['/docs/dev/quick-start-with-tidb/','/docs/dev/test-deployment-using-docker/'] +--- + +# Quick Start Guide for the TiDB Database Platform + +This guide provides the quickest way to get started with TiDB. For non-production environments, you can deploy your TiDB database using either of the following methods: + +- [Deploy a local test cluster](#deploy-a-local-test-cluster) (for macOS and Linux) +- [Simulate production deployment on a single machine](#simulate-production-deployment-on-a-single-machine) (for Linux only) + +In addition, you can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?utm_source=docs&utm_medium=tidb_quick_start). + +> **Note:** +> +> The deployment method provided in this guide is **ONLY FOR** quick start, **NOT FOR** production. +> +> - To deploy a self-hosted production cluster, see the [production installation guide](/production-deployment-using-tiup.md). +> - To deploy TiDB on Kubernetes, see [Get Started with TiDB on Kubernetes](https://docs.pingcap.com/tidb-in-kubernetes/stable/get-started). +> - To manage TiDB in the cloud, see [TiDB Cloud Quick Start](https://docs.pingcap.com/tidbcloud/tidb-cloud-quickstart). 
+ +## Deploy a local test cluster + +- Scenario: Quickly deploy a local TiDB cluster for testing using a single macOS or Linux server. By deploying such a cluster, you can learn the basic architecture of TiDB and the operation of its components, such as TiDB, TiKV, PD, and the monitoring components. + + +
+ +As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB instances, 3 TiKV instances, 3 PD instances, and optional TiFlash instances. With TiUP Playground, you can quickly build the test cluster by following these steps: + +1. Download and install TiUP: + + {{< copyable "shell-regular" >}} + + ```shell + curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh + ``` + + If the following message is displayed, you have successfully installed TiUP: + + ```log + Successfully set mirror to https://tiup-mirrors.pingcap.com + Detected shell: zsh + Shell profile: /Users/user/.zshrc + /Users/user/.zshrc has been modified to add tiup to PATH + open a new terminal or source /Users/user/.zshrc to use it + Installed path: /Users/user/.tiup/bin/tiup + =============================================== + Have a try: tiup playground + =============================================== + ``` + + Note the Shell profile path in the output above. You need to use the path in the next step. + +2. Declare the global environment variable: + + > **Note:** + > + > After the installation, TiUP displays the absolute path of the corresponding Shell profile file. You need to modify `${your_shell_profile}` in the following `source` command according to the path. In this case, `${your_shell_profile}` is `/Users/user/.zshrc` from the output of Step 1. + + {{< copyable "shell-regular" >}} + + ```shell + source ${your_shell_profile} + ``` + +3. 
Start the cluster in the current session: + + - To start a TiDB cluster of the latest version with 1 TiDB instance, 1 TiKV instance, 1 PD instance, and 1 TiFlash instance, run the following command: + + {{< copyable "shell-regular" >}} + + ```shell + tiup playground + ``` + + - To specify the TiDB version and the number of instances of each component, run a command like this: + + {{< copyable "shell-regular" >}} + + ```shell + tiup playground v7.1.0 --db 2 --pd 3 --kv 3 + ``` + + The command downloads a cluster of the specified version, such as v7.1.0, to the local machine and starts it. To view the latest version, run `tiup list tidb`. + + This command returns the access methods of the cluster: + + ```log + CLUSTER START SUCCESSFULLY, Enjoy it ^-^ + To connect TiDB: mysql --comments --host 127.0.0.1 --port 4001 -u root -p (no password) + To connect TiDB: mysql --comments --host 127.0.0.1 --port 4000 -u root -p (no password) + To view the dashboard: http://127.0.0.1:2379/dashboard + PD client endpoints: [127.0.0.1:2379 127.0.0.1:2382 127.0.0.1:2384] + To view Prometheus: http://127.0.0.1:9090 + To view Grafana: http://127.0.0.1:3000 + ``` + + > **Note:** + > + > + Since v5.2.0, TiDB supports running `tiup playground` on machines that use the Apple M1 chip. + > + For a playground started in this way, TiUP cleans up the original cluster data after the test deployment is finished. You will get a new cluster after re-running the command. + > + If you want the data to be persisted on storage, run `tiup --tag playground ...`. For details, refer to the [TiUP Reference Guide](/tiup/tiup-reference.md#-t---tag). + +4. Start a new session to access TiDB: + + + Use the TiUP client to connect to TiDB. + + {{< copyable "shell-regular" >}} + + ```shell + tiup client + ``` + + + Alternatively, you can use the MySQL client to connect to TiDB. + + {{< copyable "shell-regular" >}} + + ```shell + mysql --host 127.0.0.1 --port 4000 -u root + ``` + +5. 
Access the Prometheus dashboard of TiDB at <http://127.0.0.1:9090>. + +6. Access the [TiDB Dashboard](/dashboard/dashboard-intro.md) at <http://127.0.0.1:2379/dashboard>. The default username is `root`, and the password is empty. + +7. Access the Grafana dashboard of TiDB at <http://127.0.0.1:3000>. The default username and password are both `admin`. + +8. (Optional) [Load data to TiFlash](/tiflash/tiflash-overview.md#use-tiflash) for analysis. + +9. Clean up the cluster after the test deployment: + + 1. Stop the above TiDB service by pressing Control+C. + + 2. Run the following command after the service is stopped: + + {{< copyable "shell-regular" >}} + + ```shell + tiup clean --all + ``` + +> **Note:** +> +> TiUP Playground listens on `127.0.0.1` by default, and the service is only locally accessible. If you want the service to be externally accessible, specify the listening address using the `--host` parameter to bind the network interface card (NIC) to an externally accessible IP address. + +
+
+ +As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB instances, 3 TiKV instances, 3 PD instances, and optional TiFlash instances. With TiUP Playground, you can quickly build the test cluster by following these steps: + +1. Download and install TiUP: + + {{< copyable "shell-regular" >}} + + ```shell + curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh + ``` + + If the following message is displayed, you have successfully installed TiUP: + + ```log + Successfully set mirror to https://tiup-mirrors.pingcap.com + Detected shell: zsh + Shell profile: /Users/user/.zshrc + /Users/user/.zshrc has been modified to add tiup to PATH + open a new terminal or source /Users/user/.zshrc to use it + Installed path: /Users/user/.tiup/bin/tiup + =============================================== + Have a try: tiup playground + =============================================== + ``` + + Note the Shell profile path in the output above. You need to use the path in the next step. + +2. Declare the global environment variable: + + > **Note:** + > + > After the installation, TiUP displays the absolute path of the corresponding Shell profile file. You need to modify `${your_shell_profile}` in the following `source` command according to the path. + + {{< copyable "shell-regular" >}} + + ```shell + source ${your_shell_profile} + ``` + +3. Start the cluster in the current session: + + - To start a TiDB cluster of the latest version with 1 TiDB instance, 1 TiKV instance, 1 PD instance, and 1 TiFlash instance, run the following command: + + {{< copyable "shell-regular" >}} + + ```shell + tiup playground + ``` + + - To specify the TiDB version and the number of instances of each component, run a command like this: + + {{< copyable "shell-regular" >}} + + ```shell + tiup playground v7.1.0 --db 2 --pd 3 --kv 3 + ``` + + The command downloads a cluster of the specified version, such as v7.1.0, to the local machine and starts it. 
To view the latest version, run `tiup list tidb`. + + This command returns the access methods of the cluster: + + ```log + CLUSTER START SUCCESSFULLY, Enjoy it ^-^ + To connect TiDB: mysql --host 127.0.0.1 --port 4000 -u root -p (no password) --comments + To view the dashboard: http://127.0.0.1:2379/dashboard + PD client endpoints: [127.0.0.1:2379] + To view the Prometheus: http://127.0.0.1:9090 + To view the Grafana: http://127.0.0.1:3000 + ``` + + > **Note:** + > + > For a playground started in this way, TiUP cleans up the original cluster data after the test deployment is finished. You will get a new cluster after re-running the command. + > If you want the data to be persisted on storage, run `tiup --tag playground ...`. For details, refer to the [TiUP Reference Guide](/tiup/tiup-reference.md#-t---tag). + +4. Start a new session to access TiDB: + + + Use the TiUP client to connect to TiDB. + + {{< copyable "shell-regular" >}} + + ```shell + tiup client + ``` + + + Alternatively, you can use the MySQL client to connect to TiDB. + + {{< copyable "shell-regular" >}} + + ```shell + mysql --host 127.0.0.1 --port 4000 -u root + ``` + +5. Access the Prometheus dashboard of TiDB at <http://127.0.0.1:9090>. + +6. Access the [TiDB Dashboard](/dashboard/dashboard-intro.md) at <http://127.0.0.1:2379/dashboard>. The default username is `root`, and the password is empty. + +7. Access the Grafana dashboard of TiDB at <http://127.0.0.1:3000>. The default username and password are both `admin`. + +8. (Optional) [Load data to TiFlash](/tiflash/tiflash-overview.md#use-tiflash) for analysis. + +9. Clean up the cluster after the test deployment: + + 1. Stop the process by pressing Control+C. + + 2. Run the following command after the service is stopped: + + {{< copyable "shell-regular" >}} + + ```shell + tiup clean --all + ``` + +> **Note:** +> +> TiUP Playground listens on `127.0.0.1` by default, and the service is only locally accessible. 
If you want the service to be externally accessible, specify the listening address using the `--host` parameter to bind the network interface card (NIC) to an externally accessible IP address. + +
+
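After the playground reports a successful start, one quick way to confirm that TiDB accepts connections is to query its version from a new session. The following is a minimal sketch, not part of TiUP: `check_tidb` is a hypothetical helper name, and it assumes the default playground endpoint `127.0.0.1:4000` shown in the startup log, an empty `root` password, and an installed MySQL client.

```shell
# check_tidb is a hypothetical helper, not a TiUP command.
# Assumptions: the playground listens on 127.0.0.1:4000 (the defaults shown
# in the startup log) and the root user has an empty password.
check_tidb() {
  # $1: host (defaults to 127.0.0.1), $2: port (defaults to 4000)
  mysql --comments --host "${1:-127.0.0.1}" --port "${2:-4000}" -u root \
    -e "SELECT tidb_version();"
}
```

Run `check_tidb` for the default endpoint, or pass a host and port, for example `check_tidb 127.0.0.1 4001`, to reach the second TiDB instance listed in the multi-instance startup log.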
+ +## Simulate production deployment on a single machine + +- Scenario: Experience the smallest TiDB cluster with the complete topology and simulate the production deployment steps on a single Linux server. + +This section describes how to deploy a TiDB cluster using a YAML file of the smallest topology in TiUP. + +### Preparation + +Before deploying the TiDB cluster, ensure that the target machine meets the following requirements: + +- CentOS 7.3 or a later version is installed. +- The Linux OS has access to the internet, which is required to download TiDB and related software installation packages. + +The smallest TiDB cluster topology consists of the following instances: + +> **Note:** +> +> The IP addresses of the instances are given as examples only. In your actual deployment, replace the IP addresses with your actual IP addresses. + +| Instance | Count | IP | Configuration | +|:-- | :-- | :-- | :-- | +| TiKV | 3 | 10.0.1.1
10.0.1.1
10.0.1.1 | Avoid conflict between the port and the directory | +| TiDB | 1 | 10.0.1.1 | The default port
Global directory configuration | +| PD | 1 | 10.0.1.1 | The default port
Global directory configuration | +| TiFlash | 1 | 10.0.1.1 | The default port
Global directory configuration | +| Monitor | 1 | 10.0.1.1 | The default port
Global directory configuration | + +Other requirements for the target machine include: + +- The `root` user and its password are required +- [Stop the firewall service of the target machine](/check-before-deployment.md#check-and-stop-the-firewall-service-of-target-machines), or open the port needed by the TiDB cluster nodes +- Currently, the TiUP cluster supports deploying TiDB on the x86_64 (AMD64) and ARM architectures: + + - It is recommended to use CentOS 7.3 or later versions on AMD64. + - It is recommended to use CentOS 7.6 1810 on ARM. + +### Deployment + +> **Note:** +> +> You can log in to the target machine as a regular user or the `root` user. The following steps use the `root` user as an example. + +1. Download and install TiUP: + + {{< copyable "shell-regular" >}} + + ```shell + curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh + ``` + +2. Declare the global environment variable. + + > **Note:** + > + > After the installation, TiUP displays the absolute path of the corresponding Shell profile file. You need to modify `${your_shell_profile}` in the following `source` command according to the path. + + {{< copyable "shell-regular" >}} + + ```shell + source ${your_shell_profile} + ``` + +3. Install the cluster component of TiUP: + + {{< copyable "shell-regular" >}} + + ```shell + tiup cluster + ``` + +4. If the TiUP cluster is already installed on the machine, update the software version: + + {{< copyable "shell-regular" >}} + + ```shell + tiup update --self && tiup update cluster + ``` + +5. Increase the connection limit of the `sshd` service using the root user privilege. This is because TiUP needs to simulate deployment on multiple machines. + + 1. Modify `/etc/ssh/sshd_config`, and set `MaxSessions` to `20`. + 2. Restart the `sshd` service: + + {{< copyable "shell-root" >}} + + ```shell + service sshd restart + ``` + +6. 
Create and start the cluster: + + Edit the configuration file according to the following template, and name it as `topo.yaml`: + + {{< copyable "" >}} + + ```yaml + # # Global variables are applied to all deployments and used as the default value of + # # the deployments if a specific deployment value is missing. + global: + user: "tidb" + ssh_port: 22 + deploy_dir: "/tidb-deploy" + data_dir: "/tidb-data" + + # # Monitored variables are applied to all the machines. + monitored: + node_exporter_port: 9100 + blackbox_exporter_port: 9115 + + server_configs: + tidb: + log.slow-threshold: 300 + tikv: + readpool.storage.use-unified-pool: false + readpool.coprocessor.use-unified-pool: true + pd: + replication.enable-placement-rules: true + replication.location-labels: ["host"] + tiflash: + logger.level: "info" + + pd_servers: + - host: 10.0.1.1 + + tidb_servers: + - host: 10.0.1.1 + + tikv_servers: + - host: 10.0.1.1 + port: 20160 + status_port: 20180 + config: + server.labels: { host: "logic-host-1" } + + - host: 10.0.1.1 + port: 20161 + status_port: 20181 + config: + server.labels: { host: "logic-host-2" } + + - host: 10.0.1.1 + port: 20162 + status_port: 20182 + config: + server.labels: { host: "logic-host-3" } + + tiflash_servers: + - host: 10.0.1.1 + + monitoring_servers: + - host: 10.0.1.1 + + grafana_servers: + - host: 10.0.1.1 + ``` + + - `user: "tidb"`: Use the `tidb` system user (automatically created during deployment) to perform the internal management of the cluster. By default, use port 22 to log in to the target machine via SSH. + - `replication.enable-placement-rules`: This PD parameter is set to ensure that TiFlash runs normally. + - `host`: The IP of the target machine. + +7. Execute the cluster deployment command: + + {{< copyable "shell-regular" >}} + + ```shell + tiup cluster deploy ./topo.yaml --user root -p + ``` + + - ``: Set the cluster name + - ``: Set the TiDB cluster version, such as `v7.1.0`. 
You can see all the supported TiDB versions by running the `tiup list tidb` command + - `-p`: Specify the password used to connect to the target machine. + + > **Note:** + > + > If you use secret keys, you can specify the path of the keys through `-i`. Do not use `-i` and `-p` at the same time. + + Enter "y" and the `root` user's password to complete the deployment: + + ```log + Do you want to continue? [y/N]: y + Input SSH password: + ``` + +8. Start the cluster: + + {{< copyable "shell-regular" >}} + + ```shell + tiup cluster start + ``` + +9. Access the cluster: + + - Install the MySQL client. If it is already installed, skip this step. + + {{< copyable "shell-regular" >}} + + ```shell + yum -y install mysql + ``` + + - Access TiDB. The password is empty: + + {{< copyable "shell-regular" >}} + + ```shell + mysql -h 10.0.1.1 -P 4000 -u root + ``` + + - Access the Grafana monitoring dashboard at . The default username and password are both `admin`. + + - Access the [TiDB Dashboard](/dashboard/dashboard-intro.md) at . The default username is `root`, and the password is empty. + + - To view the currently deployed cluster list: + + {{< copyable "shell-regular" >}} + + ```shell + tiup cluster list + ``` + + - To view the cluster topology and status: + + {{< copyable "shell-regular" >}} + + ```shell + tiup cluster display + ``` + +## What's next + +If you have just deployed a TiDB cluster for the local test environment, here are the next steps: + +- Learn about basic SQL operations in TiDB by referring to the [Basic SQL operations in TiDB](/basic-sql-operations.md) documentation. +- You can also migrate data to TiDB by referring to the [Migrate data to TiDB](/migration-overview.md) documentation. 
+ +If you are ready to deploy a TiDB cluster for the production environment, here are the next steps: + +- [Deploy TiDB using TiUP](/production-deployment-using-tiup.md) +- Alternatively, you can deploy TiDB on Cloud using TiDB Operator by referring to the [TiDB Operator](https://docs.pingcap.com/tidb-in-kubernetes/stable) documentation. + +If you're looking for an analytics solution with TiFlash, here are the next steps: + +- [Use TiFlash](/tiflash/tiflash-overview.md#use-tiflash) +- [TiFlash Overview](/tiflash/tiflash-overview.md) diff --git a/markdown-pages/en/tidb/master/sql-statements/sql-statement-flush-privileges.md b/markdown-pages/en/tidb/master/sql-statements/sql-statement-flush-privileges.md new file mode 100644 index 0000000000..b120364dd1 --- /dev/null +++ b/markdown-pages/en/tidb/master/sql-statements/sql-statement-flush-privileges.md @@ -0,0 +1,48 @@ +--- +title: FLUSH PRIVILEGES | TiDB SQL Statement Reference +summary: An overview of the usage of FLUSH PRIVILEGES for the TiDB database. +aliases: ['/docs/dev/sql-statements/sql-statement-flush-privileges/','/docs/dev/reference/sql/statements/flush-privileges/'] +--- + +# FLUSH PRIVILEGES + +The statement `FLUSH PRIVILEGES` instructs TiDB to reload the in-memory copy of privileges from the [privilege tables](/privilege-management.md#privilege-table). You must execute this statement after manually editing tables such as `mysql.user`. However, executing this statement is not necessary after using privilege statements like `GRANT` or `REVOKE`. To execute this statement, the `RELOAD` privilege is required. + +## Synopsis + +```ebnf+diagram +FlushStmt ::= + 'FLUSH' NoWriteToBinLogAliasOpt FlushOption + +NoWriteToBinLogAliasOpt ::= + ( 'NO_WRITE_TO_BINLOG' | 'LOCAL' )? 
+ +FlushOption ::= + 'PRIVILEGES' +| 'STATUS' +| 'TIDB' 'PLUGINS' PluginNameList +| 'HOSTS' +| LogTypeOpt 'LOGS' +| TableOrTables TableNameListOpt WithReadLockOpt +``` + +## Examples + +```sql +mysql> FLUSH PRIVILEGES; +Query OK, 0 rows affected (0.01 sec) +``` + +## MySQL compatibility + +This statement is fully compatible with MySQL. If there are any compatibility differences, report them via [an issue on GitHub](https://github.com/pingcap/tidb/issues/new/choose). + +## See also + +* [SHOW GRANTS](/sql-statements/sql-statement-show-grants.md) + + + +* [Privilege Management](/privilege-management.md) + + diff --git a/markdown-pages/en/tidb/master/ticdc/ticdc-overview.md b/markdown-pages/en/tidb/master/ticdc/ticdc-overview.md new file mode 100644 index 0000000000..893db69c10 --- /dev/null +++ b/markdown-pages/en/tidb/master/ticdc/ticdc-overview.md @@ -0,0 +1,98 @@ +--- +title: TiCDC Overview +summary: Learn what TiCDC is, what features TiCDC provides, and how to install and deploy TiCDC. +aliases: ['/docs/dev/ticdc/ticdc-overview/','/docs/dev/reference/tools/ticdc/overview/'] +--- + +# TiCDC Overview + +[TiCDC](https://github.com/pingcap/tiflow/tree/master/cdc) is a tool used to replicate incremental data from TiDB. Specifically, TiCDC pulls TiKV change logs, sorts captured data, and exports row-based incremental data to downstream databases. + +## Usage scenarios + +TiCDC has multiple usage scenarios, including: + +- Providing high availability and disaster recovery solutions for multiple TiDB clusters. TiCDC ensures eventual data consistency between primary and secondary clusters in case of a disaster. +- Replicating real-time data changes to homogeneous systems. This provides data sources for various scenarios, such as monitoring, caching, global indexing, data analysis, and primary-secondary replication between heterogeneous databases. 
+ +## Major features + +### Key capabilities + +TiCDC has the following key capabilities: + +- Replicating incremental data between TiDB clusters with second-level RPO and minute-level RTO. +- Bidirectional replication between TiDB clusters, allowing the creation of a multi-active TiDB solution using TiCDC. +- Replicating incremental data from a TiDB cluster to a MySQL database or other MySQL-compatible databases with low latency. +- Replicating incremental data from a TiDB cluster to a Kafka cluster. The recommended data formats are [Canal-JSON](/ticdc/ticdc-canal-json.md) and [Avro](/ticdc/ticdc-avro-protocol.md). +- Replicating tables with the ability to filter databases, tables, DMLs, and DDLs. +- High availability with no single point of failure, supporting dynamically adding and deleting TiCDC nodes. +- Cluster management through Open API, including querying task status, dynamically modifying task configuration, and creating or deleting tasks. + +### Replication order + +- TiCDC outputs all DDL or DML statements **at least once**. +- When the TiKV or TiCDC cluster encounters a failure, TiCDC might send the same DDL/DML statement repeatedly. For duplicated DDL/DML statements: + + - The MySQL sink can execute DDL statements repeatedly. For DDL statements that can be executed repeatedly in the downstream, such as `TRUNCATE TABLE`, the statement is executed successfully. For those that cannot be executed repeatedly, such as `CREATE TABLE`, the execution fails, and TiCDC ignores the error and continues with the replication process. + - The Kafka sink provides different strategies for data distribution. You can distribute data to different Kafka partitions based on the table, primary key, or timestamp. 
This ensures that the updated data of a row is sent to the same partition in order. + - All these distribution strategies send Resolved TS messages to all topics and partitions periodically. This indicates that all messages earlier than the Resolved TS have already been sent to the topics and partitions. The Kafka consumer can use the Resolved TS to sort the messages received. + - Kafka sink sometimes sends duplicated messages, but these duplicated messages do not affect the constraints of `Resolved Ts`. For example, if a changefeed is paused and then resumed, Kafka sink might send `msg1`, `msg2`, `msg3`, `msg2`, and `msg3` in order. You can filter out the duplicated messages from Kafka consumers. + +### Replication consistency + +- MySQL sink + + - TiCDC enables the redo log to ensure eventual consistency of data replication. + - TiCDC ensures that the order of single-row updates is consistent with the upstream. + - TiCDC does not ensure that the downstream transactions are executed in the same order as the upstream transactions. + + > **Note:** + > + > Since v6.2, you can use the sink URI parameter [`transaction-atomicity`](/ticdc/ticdc-sink-to-mysql.md#configure-sink-uri-for-mysql-or-tidb) to control whether to split single-table transactions. Splitting single-table transactions can greatly reduce the latency and memory consumption of replicating large transactions. + +## TiCDC architecture + +TiCDC is an incremental data replication tool for TiDB, which is highly available through PD's etcd. The replication process consists of the following steps: + +1. Multiple TiCDC processes pull data changes from TiKV nodes. +2. TiCDC sorts and merges the data changes. +3. TiCDC replicates the data changes to multiple downstream systems through multiple replication tasks (changefeeds). 
+ +The architecture of TiCDC is illustrated in the following figure: + +![TiCDC architecture](/media/ticdc/cdc-architecture.png) + +The components in the architecture diagram are described as follows: + +- TiKV Server: TiKV nodes in a TiDB cluster. When data changes occur, TiKV nodes send the changes as change logs (KV change logs) to TiCDC nodes. If TiCDC nodes detect that the change logs are not continuous, they will actively request the TiKV nodes to provide change logs. +- TiCDC: TiCDC nodes where TiCDC processes run. Each node runs a TiCDC process. Each process pulls data changes from one or more tables in TiKV nodes and replicates the changes to the downstream system through the sink component. +- PD: The scheduling module in a TiDB cluster. This module is responsible for scheduling cluster data and usually consists of three PD nodes. PD provides high availability through the etcd cluster. In the etcd cluster, TiCDC stores its metadata, such as node status information and changefeed configurations. + +As shown in the architecture diagram, TiCDC supports replicating data to TiDB, MySQL, and Kafka. + +## Best practices + +- If the network latency between two TiDB clusters is higher than 100 ms, it is recommended to deploy TiCDC in the region (IDC) where the downstream TiDB cluster is located when replicating data between the two clusters. +- TiCDC only replicates tables that have at least one valid index. A valid index is defined as follows: + + - A primary key (`PRIMARY KEY`) is a valid index. + - A unique index (`UNIQUE INDEX`) is valid if every column of the index is explicitly defined as non-nullable (`NOT NULL`) and the index does not have a virtual generated column (`VIRTUAL GENERATED COLUMNS`). + +- To use TiCDC in disaster recovery scenarios, you need to configure the [redo log](/ticdc/ticdc-sink-to-mysql.md#eventually-consistent-replication-in-disaster-scenarios). 
+- When replicating a wide table with a large single row (greater than 1K), it is recommended to configure the [`per-table-memory-quota`](/ticdc/ticdc-server-config.md) so that `per-table-memory-quota` = `ticdcTotalMemory`/(`tableCount` * 2). `ticdcTotalMemory` is the memory of a TiCDC node, and `tableCount` is the number of target tables that a TiCDC node replicates. + +> **Note:** +> +> Since v4.0.8, TiCDC supports replicating tables without a valid index by modifying the task configuration. However, this compromises the guarantee of data consistency to some extent. For more details, see [Replicate tables without a valid index](/ticdc/ticdc-manage-changefeed.md#replicate-tables-without-a-valid-index). + +## Unsupported scenarios + +Currently, the following scenarios are not supported: + +- A TiKV cluster that uses RawKV alone. +- The [`CREATE SEQUENCE` DDL operation](/sql-statements/sql-statement-create-sequence.md) and the [`SEQUENCE` function](/sql-statements/sql-statement-create-sequence.md#sequence-function) in TiDB. When the upstream TiDB uses `SEQUENCE`, TiCDC ignores `SEQUENCE` DDL operations/functions performed upstream. However, DML operations using `SEQUENCE` functions can be correctly replicated. + +TiCDC only partially supports scenarios involving large transactions in the upstream. For details on whether TiCDC supports replicating large transactions and the associated risks, refer to the [TiCDC FAQ](/ticdc/ticdc-faq.md#does-ticdc-support-replicating-large-transactions-is-there-any-risk).
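As a worked example of the `per-table-memory-quota` formula given under Best practices, the arithmetic can be sketched in shell. This is only an illustration: the function name `per_table_memory_quota`, the 16 GiB node size, and the 64-table count are assumptions, and the result is expressed in bytes on the assumption that the quota is configured in bytes.

```shell
# Hypothetical helper: per-table-memory-quota = ticdcTotalMemory / (tableCount * 2)
# $1: memory of the TiCDC node in bytes (assumed unit)
# $2: number of target tables that the TiCDC node replicates
per_table_memory_quota() {
  echo $(( $1 / ($2 * 2) ))
}

# Illustrative values: a 16 GiB node replicating 64 tables.
per_table_memory_quota $((16 * 1024 * 1024 * 1024)) 64   # prints 134217728 (128 MiB)
```

Shell arithmetic uses integer division, so the result is rounded down when the node memory is not an exact multiple of `tableCount * 2`.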