
migrating tables with cdc enabled ends with Failed to execute #84

Open
carlo4002 opened this issue Jul 20, 2022 · 5 comments

Comments

@carlo4002

Hello guys

I am testing a migration of a table with CDC enabled on both the source (Cassandra) and the target (ScyllaDB). The job finishes with an error with the following message:

22/07/13 14:09:49 ERROR QueryExecutor: Failed to execute: com.datastax.spark.connector.writer.RichBoundStatementWrapper@6512ab5f
com.datastax.oss.driver.api.core.servererrors.InvalidQueryException: cdc: attempted to get a stream from an earlier generation than the currently used one. With CDC you cannot send writes with timestamps too far into the past, because that would break consistency properties (write timestamp: 2018/11/21 23:56:33, current generation started at: 2022/07/11 10:15:34)

I cannot change the current generation because it is tied to the date the cluster was created. Disabling CDC in the target fixes this problem, but we need CDC enabled during our dual writes (the migration is without downtime).

Is there a way to force these writes?
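For reference, disabling CDC on the target is a per-table property change in Scylla CQL; the sketch below shows the workaround mentioned above (ks.my_table is a placeholder for your keyspace and table):

```sql
-- Disable CDC on the target table for the duration of the bulk migration,
-- so the migrator can write with old (preserved) timestamps.
ALTER TABLE ks.my_table WITH cdc = {'enabled': false};

-- Re-enable CDC once the bulk load has finished.
ALTER TABLE ks.my_table WITH cdc = {'enabled': true};
```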

@tarzanek
Contributor

tarzanek commented Jul 27, 2022

This looked like a CDC bug; there is a way to fix the streams.

see
scylladb/scylladb#7127

@tarzanek tarzanek reopened this Jul 27, 2022
@tarzanek
Contributor

tarzanek commented Jul 27, 2022

The API for that was added in scylladb/scylladb#6498.

But all of this assumes the error comes from Scylla; I'm not sure about Cassandra, @carlo4002.

Looking closer, though, this is really about the migrator preserving timestamps and writing with old timestamps.

@tarzanek
Contributor

tarzanek commented Jul 27, 2022

Also, I am confused by your implementation of dual writes: you only need CDC on the source of the dual writes, then consume the CDC log and write it to the target. There is a Kafka CDC consumer to help with this. The target won't need CDC at all; I don't see why you would need it there.

Also note that the other option is to do dual writes from the application, in which case you won't need CDC anywhere (though a small code change would be needed in the client, of course).

@tarzanek
Contributor

tarzanek commented Jul 27, 2022

So I would just migrate with CDC disabled in the target.

The alternative, of course, is disabling preserveTimestamps in the migrator, but that way you risk overwriting dual-written data!
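A sketch of the relevant migrator setting, assuming the standard scylla-migrator config.yaml layout (the exact placement of the key may vary by migrator version; check the config.yaml.example shipped with your release):

```yaml
# scylla-migrator config.yaml fragment (sketch, not a full config).
# With preserveTimestamps disabled the migrator writes rows using the
# current time instead of the original write timestamps, which avoids
# the CDC "earlier generation" error -- but, as noted above, it risks
# overwriting data already written by the dual-write path.
preserveTimestamps: false
```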

@carlo4002
Author

Hello @tarzanek, sorry it took so long to get back to you with feedback. I am still working on this, and yes, my workaround for the moment was to disable CDC in Scylla.

The CDC in Scylla isn't for the migration (dual writes) but for some applications that use this database, so not all of the tables have CDC enabled.

So when I say I need migration without downtime, I mean that CDC must be enabled in the target for the tables that need it. However, we are going to switch those services over after the first load, so there is no need to have CDC on during the migration.
