Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CASSANDRA-19664 first approach - for copying Basic Table definition #3718

Open
wants to merge 4 commits into
base: trunk
Choose a base branch
from

Conversation

Maxwell-Guo
Copy link
Contributor

This is for CASSANDRA-19664:
1、we can copy table under the same keyspace of under different keysapces.
2、only support basic table schema : primary key(partition key and clustering key), regular columns , static columns, column masking, table params(compaction, compression and so on).

@@ -236,6 +236,7 @@ cqlStatement returns [CQLStatement.Raw stmt]
| st42=addIdentityStatement { $stmt = st42; }
| st43=dropIdentityStatement { $stmt = st43; }
| st44=listSuperUsersStatement { $stmt = st44; }
| st45=copyTableStatement { $stmt = st45; }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason i used a new statement here is that :
1、The original createtablestatment code is too bloated. If the semantics of like are added, the code needs to be changed a lot.
2、In addition, I think this new copystatement can do things other than table cloning, and can also support semantic capabilities such as “CREATE TABLE AS”. Because they are similar to table copying functions, they are called copystatement here.

Besides, I have also prepared a patch, which is mainly to modify the pares of create table, that is, to implement the function of create table like in create table statement. If you think it is necessary, I can create a new PR to replace the current one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another pr here that implement CASSANDRA-19964, but use create table's parase path .
I am not sure which one is better , so I attach both of them.

@Maxwell-Guo Maxwell-Guo force-pushed the CASSANDRA-19964 branch 3 times, most recently from 52f4d30 to 8b09b40 Compare December 3, 2024 12:07
@smiklosovic smiklosovic changed the title CREATE TABLE LIKE for copying Basic Table definition for CASSANDRA-19664 CASSANDRA-19664 first approach - for copying Basic Table definition Dec 3, 2024
throw ire("Cannot use CTREATE TABLE LIKE across different keyspace when source table have UDTs.");
}

String sourceCQLString = sourceTableMeta.toCqlString(false, false, true, false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we ignore the dropped columns, if the seconds input argument is true that means the dropped column will be consider .
Then the cql will be like : create table tb (pk int primary key, cn int, dn int) with xxxx ; alter table tb drop dn;

}

@Override
public void validate(ClientState state)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this? Will not be validate of super called automatically when you do not override?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes , we can remove it.

Copy link
Contributor

@smiklosovic smiklosovic Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Maxwell-Guo please cover my last comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi , I replied here, I think we may not need to do this , because the CQLs that are parsed are all create table like syntax, there will be no coexistence with create table syntax. For cql like create table if not exist ta like tb or some other cql that contains these three keywords : CREATE / TABLE / LIKE , only create table a like b can be successfully paresd into copytablestatemnt, others will got a SyntaxException. I have a test for this in CreateLikeCqlParseTest.

Copy link
Contributor

@blerer blerer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my first round of review. :-) Thanks a lot for pushing that functionality forward.

src/java/org/apache/cassandra/cql3/QueryProcessor.java Outdated Show resolved Hide resolved
test/unit/org/apache/cassandra/cql3/CQLTester.java Outdated Show resolved Hide resolved
test/unit/org/apache/cassandra/cql3/CQLTester.java Outdated Show resolved Hide resolved
Comment on lines 831 to 842
String souceTable = createTable(KEYSPACE_PER_TEST,
"CREATE TABLE %s (" +
" pk1 text, " +
" pk2 int MASKED WITH DEFAULT, " +
" ck1 int, " +
" ck2 double," +
" s1 float static, " +
" v1 int, " +
" v2 int, " +
"PRIMARY KEY ((pk1, pk2), ck1, ck2 ))");
String targetTable = createTableLike("create table %s like %s", souceTable, KEYSPACE_PER_TEST, KEYSPACE_PER_TEST);
Pair<TableMetadata, TableMetadata> pair = assertTableMetaEquals(KEYSPACE_PER_TEST, KEYSPACE_PER_TEST, souceTable, targetTable);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be replaced by:

        String souceTable = createTable(KEYSPACE_PER_TEST,
                                  "CREATE TABLE %s (" +
                                        "  pk1 text, " +
                                        "  pk2 int MASKED WITH DEFAULT, " +
                                        "  ck1 int, " +
                                        "  ck2 double," +
                                        "  s1 float static, " +
                                        "  v1 int, " +
                                        "  v2 int, " +
                                        "PRIMARY KEY ((pk1, pk2), ck1, ck2 ))");
        TableMetadata sourceTableMetadata = currentTableMetadata();
        String targetTable = createTableLike("create table %s like %s", souceTable, KEYSPACE_PER_TEST, KEYSPACE_PER_TEST);
        TableMetadata targetTableMetadata = currentTableMetadata();
        assertTableMetaEquals(sourceTableMetadata, targetTableMetadata); 

and assertTableMetaEquals can be moved to DescribeStatementTest and replaced by:

    protected void assertTableMetaEquals(TableMetadata source, TableMetadata target)  {
        assertNotNull(source);
        assertNotNull(target);
        assertTrue(source.equalsWithoutTableNameAndDropCns(target));
        assertNotEquals(source.id, target.id);
        assertNotEquals(source.name, target.name);
    } 

It will then allow to remove the getTableMetadata methods in CQLTester.

Copy link
Contributor Author

@Maxwell-Guo Maxwell-Guo Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertTableMetaEquals will aslo be used for other test classes like
CreateLikeTest -> testTableShemaCopy ()

  1. I remove the getTableMetadata in CQLTester
  2. I change the test code in DescribeStatementTest and remove the use of assertTableMetaEquals, and verify them one by one :
       assertNotNull(source);
       String targetTable = createTableLike("create table %s like %s", souceTable, KEYSPACE_PER_TEST, KEYSPACE_PER_TEST);
       TableMetadata target = getTableMetadata(KEYSPACE_PER_TEST, currentTable());
       assertNotNull(target);
       assertTrue(equalsWithoutTableNameAndDropCns(source, target));
       assertNotEquals(source.id, target.id);
       assertNotEquals(source.name, target.name); 

That is because
(1) currentTableMetadata(); only return the table meta under KEYSPACE;
(2) the assertTableMetaEquals is only used once, but it is used many time in CreateLikeTest so I move assertTableMetaEquals to CreateLikeTest,

WDYT ? @blerer

Comment on lines 1581 to 1590
protected TableMetadata getTableMetadata(String table)
{
return Schema.instance.getTableMetadata(KEYSPACE, table);
}

protected TableMetadata getTableMetadata(String keyspace, String table)
{
return Schema.instance.getTableMetadata(keyspace, table);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed after the changes suggested in DescribeStatementTest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see the comment on assertTableMetaEquals

Copy link
Contributor Author

@Maxwell-Guo Maxwell-Guo Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blerer
I removed

    {
        return Schema.instance.getTableMetadata(KEYSPACE, table);
    }

But kept

protected TableMetadata getTableMetadata(String keyspace, String table)
    {
        return Schema.instance.getTableMetadata(keyspace, table);
    }

as getTableMetadata(String keyspace, String table) is used by some test cases, is it ok ?

@@ -479,12 +479,21 @@ public String defaultCompactValueName()
}
}

public static TableMetadata.Builder parse(String cql, String keyspace, String table, Types types, UserFunctions userFunctions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that I am missing something. Why did you change that code?

Copy link
Contributor

@smiklosovic smiklosovic Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blerer the original version of that method (which calls this one) is still there. We just create this one in order to be able to specify custom table name with types and functions. If we parse CQL for old table, we just want to be able to put there table name etc for new table directly upon parsing.

@Maxwell-Guo Maxwell-Guo force-pushed the CASSANDRA-19964 branch 2 times, most recently from 0a7b1f5 to ed700a8 Compare December 9, 2024 12:13
Comment on lines 1581 to 1590
protected TableMetadata getTableMetadata(String table)
{
return Schema.instance.getTableMetadata(KEYSPACE, table);
}

protected TableMetadata getTableMetadata(String keyspace, String table)
{
return Schema.instance.getTableMetadata(keyspace, table);
}

Copy link
Contributor Author

@Maxwell-Guo Maxwell-Guo Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blerer
I removed

    {
        return Schema.instance.getTableMetadata(KEYSPACE, table);
    }

But kept

protected TableMetadata getTableMetadata(String keyspace, String table)
    {
        return Schema.instance.getTableMetadata(keyspace, table);
    }

as getTableMetadata(String keyspace, String table) is used by some test cases, is it ok ?

@Test
public void testIfNotExists() throws Throwable
{
String sourceTb = createTable(sourceKs, "CREATE TABLE %s (a int, b text, c duration, d float, PRIMARY KEY(a, b));");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blerer I add one test case for the IF NOT EXISTS as you suggested

assertTableMetaEquals(sourceKs, targetKs, tableOtherOptions, tbLikeCompactionOthers);

// table copy with params setting
String tableCopyAndSetCompression = createTableLike("CREATE TABLE %s LIKE %s WITH compression = { 'class' : 'SnappyCompressor', 'chunk_length_in_kb' : 64 };",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smiklosovic I update the pr and implemented setting table params when doing CREARE TABLE LIKE

import org.apache.cassandra.transport.Event.SchemaChange;

/**
* {@code CREATE TABLE [IF NOT EXISTS] <newtable> LIKE <oldtable> WITH <property> = <value>}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blerer this is a brief introduction of cql to CopyTableStatement, and I have already create CASSANDRA-20120 to create CEP-43's doc

.build();
table.validate();

if (targetKeyspaceMeta.replicationStrategy.hasTransientReplicas()
Copy link
Contributor

@smiklosovic smiklosovic Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Maxwell-Guo This is interesting corner-case. I think that if there was not a possibility to specify different source keyspace from the target one, then we would not need this ... Hm. I think it needs to be there then.

Copy link
Contributor Author

@Maxwell-Guo Maxwell-Guo Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea at the time was that copying a table is also an act of creating a new table, the only difference is that target table has same schema with source table.

We have no way to control the target keyspace because the keyspaces of the two tables may be the same or completely different.

So I kept it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

throw ire("read_repair must be set to 'NONE' for transiently replicated keyspaces");
}

if (!table.params.compression.isEnabled())
Copy link
Contributor

@smiklosovic smiklosovic Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Maxwell-Guo

So this says that: "if the compression is not enabled on the source table, then be sure that guardrail enables uncompressed sstables".

While it does make sense when creating the original table for the first time, does it still make sense to do when we are copying it?

If we disabled the compression on source table, then I think that we already do have a guardrail for that, correct? But if we change the guardrail in runtime after the original table was created (e.g. via JMX), then copying such table will not be possible anymore?

So, whether we can copy a table or not when the source table has disabled compression only depends on what the guardrail is set to in the runtime.

I prefer to not check this. I would rather create a copy of the source table no matter what and we can propagate back to the client (the message will appear in cqlsh when copying table) that if guardrail is different, then a user might notified about that, but this is optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we disabled the compression on source table, then I think that we already do have a guardrail for that, correct? But if we change the guardrail in runtime after the original table was created (e.g. via JMX), then copying such table will not be possible anymore?

Yes, that is why I kept it.

The interval between creating two tables may be long, and it is uncertain whether the user has any new ideas about table compression during this period. Therefore, for the behavior of copying the table, except for copying the table schema(column, datatype, mask, table params), creating udt, index, trigger (if needed), and do some validations on udt、indexes and triggers, I have aligned other behaviors and validations with creating a new table.

Copy link
Contributor

@smiklosovic smiklosovic Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it again, I think how it is done is OK, because we can also do create table ks.tb_copy like ks.tb with compression = disabled and using the guardrail here is perfectly sensible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants