ticdc: Support Multiple Downstream Addresses for MySQL #11475

Open
wlwilliamx opened this issue Aug 7, 2024 · 3 comments · May be fixed by #11527
Labels
type/feature Issues about a new feature

Comments

@wlwilliamx
Contributor

wlwilliamx commented Aug 7, 2024

Is your feature request related to a problem?

In the current version, a changefeed in TiCDC can connect to only one server of a MySQL-compatible database cluster. If that connection is lost, the changefeed has to be recreated, because TiCDC does not automatically connect to another available server in the cluster.

Describe the feature you'd like

As stated in the title, we can allow multiple MySQL-compatible downstream addresses in the --sink-uri option when a user creates or updates a changefeed. When one of the downstream servers fails, TiCDC can automatically switch to another available server from the provided list and continue working, thereby providing high availability.

NOTE: This feature is unrelated to load balance; it's solely for fault tolerance.

Therefore, the value of the --sink-uri option would accept the following format, in which the additional comma-separated host[:port] entries are optional:
[scheme]://[user[:password]@][host[:port]][,host[:port]][,host[:port]][/path][?param1=value1&paramN=valueN]
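To make the proposed format concrete, here is a minimal sketch of how a multi-host sink URI could be expanded into one single-host URI per address. `splitSinkURI` is a hypothetical helper, not an existing TiCDC function, and it assumes that any `/`, `?`, or `,` in the user name or password is percent-encoded. The authority is split by hand because Go's `net/url` does not define how a comma-separated host list should be parsed.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitSinkURI expands a sink URI whose authority holds a comma-separated
// host list into one single-host URI per address, in the order given.
// Hypothetical helper for illustration only; not part of TiCDC.
func splitSinkURI(sinkURI string) ([]string, error) {
	schemeEnd := strings.Index(sinkURI, "://")
	if schemeEnd < 0 {
		return nil, errors.New("sink URI has no scheme")
	}
	prefix := sinkURI[:schemeEnd+3]
	rest := sinkURI[schemeEnd+3:]

	// The authority ends at the first '/' or '?'. Everything after it
	// (path and query parameters) is shared by all hosts.
	authority, suffix := rest, ""
	if i := strings.IndexAny(rest, "/?"); i >= 0 {
		authority, suffix = rest[:i], rest[i:]
	}

	// Userinfo, if present, is everything up to the last '@'.
	if i := strings.LastIndex(authority, "@"); i >= 0 {
		prefix += authority[:i+1]
		authority = authority[i+1:]
	}

	var uris []string
	for _, host := range strings.Split(authority, ",") {
		host = strings.TrimSpace(host)
		if host == "" {
			return nil, errors.New("sink URI contains an empty host entry")
		}
		uris = append(uris, prefix+host+suffix)
	}
	return uris, nil
}

func main() {
	// Example addresses and parameters are placeholders.
	uris, err := splitSinkURI("mysql://root:pw@10.0.0.1:3306,10.0.0.2:3306,10.0.0.3/?param1=value1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range uris {
		fmt.Println(u)
	}
}
```

Each resulting URI keeps the same scheme, credentials, path, and parameters, so the existing single-host handling could in principle be reused for every candidate address.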

Related code

In the TiCDC code, the following places are involved in establishing a connection with MySQL-compatible downstream:

  • newMySQLSyncPointStore(): cdc/syncpointstore/mysql_syncpoint_store.go, needs modification.
  • CreateMySQLDBConn(): pkg/sink/mysql/db_helper.go
    Places where CreateMySQLDBConn() is used:
    • NewDDLSink(): cdc/sink/ddlsink/mysql/mysql_ddl_sink.go, establishes a connection to the downstream database for the DDL sink, needs modification.
    • NewMySQLSink(): cdc/sink/dmlsink/txn/txn_dml_sink.go, establishes a connection to the downstream database for the DML sink, needs modification.
    • NewObserver(): pkg/sink/observer/observer.go, establishes a connection to the downstream database so that the Owner can create Observers and periodically query certain performance metrics of the downstream TiDB via SQL, needs modification.
    • TestNewMySQLTimeout(): cdc/sink/dmlsink/txn/mysql/mysql_test.go, a unit test for the connection timeout, no change needed.
    • checkBDRMode(): cdc/sink/validator/validator.go, temporarily establishes a connection to the downstream database to check whether BDR mode is supported, no change needed.
    • doVerify(): pkg/upstream/upstream.go, temporarily establishes a connection to the upstream database to authenticate the upstream user, no change needed.
  • openDB(): cmd/kafka-consumer/writer.go, used by kafka-consumer to open the upstream database (for checking diff), no change needed.
  • A number of tests, no change needed.

Describe alternatives you've considered

No response

Teachability, Documentation, Adoption, Migration Strategy

No response

@wlwilliamx added the type/feature label Aug 7, 2024
@lance6716
Contributor

IMO it's better to use another layer, such as a load balancer, to solve this problem. The CDC sink should only care about the output protocol; high availability of the downstream is not a responsibility of CDC.

@wlwilliamx
Contributor Author

> IMO it's better to use another layer, such as a load balancer, to solve this problem. The CDC sink should only care about the output protocol; high availability of the downstream is not a responsibility of CDC.

Thank you for your feedback. While using a load balancer is indeed a common solution for high availability, it can also become a single point of failure, especially if it is not properly configured or if the load balancer itself encounters issues. By allowing TiCDC to support multiple downstream addresses natively, we can add an extra layer of redundancy. This would enable TiCDC to automatically switch to another available server in the cluster if the primary server fails, providing more robust fault tolerance. This approach could be more reliable in scenarios where a load balancer is not feasible or would add complexity.

Moreover, by integrating this functionality directly into TiCDC, we simplify the deployment and management of the system, as users wouldn’t need to rely on external solutions for high availability. This makes TiCDC more resilient and easier to use in a variety of environments.

@flowbehappy
Collaborator

@benmeadowcroft Please take a look

@wlwilliamx changed the title from "ticdc: support multiple MySQL downstream addresses in sink URI" to "ticdc: Support Multiple Downstream Addresses for MySQL" Aug 27, 2024