Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client.insert(...) fails if registered class doesn't match all existing columns #2118

Open
gabrielpablorc opened this issue Jan 27, 2025 · 4 comments

Comments

@gabrielpablorc
Copy link

Describe the bug

Attempting an insert fails if any column from the target table is not defined in the registered POJO.

Steps to reproduce

  1. Register a POJO associated with any given table. The table should contain at least one additional column that is not defined in the POJO.
  2. Attempt to call client.insert(...) for one of these POJO instances.
  3. Receive an IllegalArgumentException (No serializer found for column 'example'. Did you forget to register it?).

From my understanding, the problem comes from the fact that client.insert(...) assumes the resulting INSERT command should include values for all existing columns (which should not be the case), throwing an exception in line 1350 of com.clickhouse.client.api.Client when this is not the case.

Expected behaviour

client.insert(...) should ideally work even if the registered class is 'missing' some columns from the mapped table. While this can be partially solved by just adding said missing column into the registered class, this particular behavior can be problematic in certain cases

For example, when adding a new column to an existing table, this would theoretically break the mapping of the version running in production - which would not have the corresponding attribute in the POJO class.

Error log

java.lang.IllegalArgumentException: No serializer found for column 'example'. Did you forget to register it?
	at com.clickhouse.client.api.Client.insert(Client.java:1350)

Thanks in advance for reading this!

@chernser
Copy link
Contributor

Good day, @gabrielpablorc !
Is the column that misses matching field in the POJO has default value or is it nullable?

@gabrielpablorc
Copy link
Author

@chernser thanks for the quick turnaround!

There's actually an important detail I overlooked: The 'missing' column in my registered class is actually a materialized one. For reference, here's what that bit of the table creation script looks like:

...
date DateTime64(3),
date_only Date MATERIALIZED toDate(date),
...

date_only is the one causing issues here - which, being a materialized column, shouldn't even be required for the actual insert.

I'd like to suggest a fix, but can't figure one out easily. From debugging, this particular materialized column doesn't seem to be distinguishable from the other (non-materialized) ones - at least not from checking the attributes I'm seeing in either ClickHouseColumn or TableSchema.

I now see this is somewhat related to another reported issue, so feel free to close if you consider this duplicated: #2025

@gabrielpablorc
Copy link
Author

I took the liberty of debugging a bit further, and thought it might be useful to actually track (through an additional attribute in ClickHouseColumn) whether a column is materialized or not. This could be done for example when initially parsing, inside TableSchema.addColumn(String name, String type, String defaultType):

public void addColumn(String name, String type, String defaultType) {
        ClickHouseColumn column = ClickHouseColumn.of(name, type);
        if (defaultType.toUpperCase().contains("DEFAULT")) {
            hasDefaults = true;
            column.setHasDefault(true);
        }
        **// Maybe handle another if here flagging the column as materialized?**

        columns.add(column);

        Map<String, Object> columnMetadata = metadata.computeIfAbsent(name, k -> new HashMap<>());
            columnMetadata.put("type", type);
        colIndex.put(name, columns.size() - 1);
    }

Thought this might help!

@mshustov
Copy link
Member

I took the liberty of debugging a bit further, and thought it might be useful to actually track (through an additional attribute in ClickHouseColumn) whether a column is materialized or not.

thank you @gabrielpablorc for the thorough analysis.
@chernser I believe the client needs a similar check for materialized, alias and ephemeral columns https://github.com/ClickHouse/clickhouse-kafka-connect/blob/aa63c7217f2940bf06e840a199f7597ba4262fc5/src/main/java/com/clickhouse/kafka/connect/sink/db/helper/ClickHouseHelperClient.java#L367-L369

@chernser chernser added this to the Priority Backlog milestone Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants