
feat: Add Kafka Connect Cloud Bigtable sink connector #2466

Open · wants to merge 57 commits into base: main
Conversation

prawilny

This PR adds a Kafka Connect Cloud Bigtable sink connector.

The code is to land in a different repository, but that repository hasn't been created yet, so we bring the code here for early review.
The fact that it targets another repo is the reason for the following:

  • modification of the GitHub Actions CI so that it executes the sink's tests, instead of adding the sink as a new Maven submodule and plugging it into the existing CI
  • the code is not a submodule in the root pom.xml, nor does it use any information from outside of its directory
  • the Maven plugins' config is duplicated
  • there is another copy of the license in the directory

Things yet to be done (in future PRs):

  • Logical types support
  • More comprehensive integration tests
    • they might include more detailed compatibility checks against the Confluent sink
  • Use of kokoro in CI

@prawilny prawilny requested review from a team as code owners January 12, 2025 14:07

google-cla bot commented Jan 12, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. api: bigtable Issues related to the googleapis/java-bigtable API. labels Jan 12, 2025

Warning: This pull request is touching the following templated files:

  • .github/workflows/ci.yaml

@prawilny prawilny force-pushed the kafka_connect_bigtable_sink branch from e2f0361 to 40ae074 Compare January 13, 2025 11:20

@brandtnewton brandtnewton left a comment


This is shaping up nicely! All my comments are pretty minor. I did not review the Integration tests yet. I'll get back to you soon on how to handle logical types.

@prawilny
Author

@brandtnewton
I didn't resolve any of the conversations since I'm used to the reviewer doing that.

I also have a question: how do you want to review further commits (mainly logical types support, some more integration tests, and some minor tweaks throughout the codebase)? In this PR? In a new one? Or maybe do you want to create a new repository and have the PR(s) sent there? Please let me know.

@brandtnewton

@prawilny you can just add those commits to this PR. GitHub makes it easy to see only the changes since my last review


@brandtnewton brandtnewton left a comment


Thanks for all the changes! Feel free to add more changes here or in another PR, whatever is easier for you. See comments:

@prawilny
Author

prawilny commented Feb 7, 2025

@brandtnewton, please take a look.
Some tasks (such as documenting generation of Confluent sink compatibility tests mentioned above) are yet to be done, but hopefully the code is close to its final form.
Note that some integration tests fail on the emulator, but they all work when using a real Bigtable instance.
Now I'm switching to performance testing.


@brandtnewton brandtnewton left a comment


I think I'm all caught up now, including the tests. This is getting super close! I'd like to run the IT tests locally, can you add a quick doc on how to do that? Also I don't have permission to resolve comments so feel free to resolve them yourself.

// b) modify the record as if it didn't contain the deletion of columns that don't exist so that
// other operations within the row have a chance to execute successfully.
// I think the option a) is clearer to reason about, so we mark the column family as required.
requiredColumnFamilies.add(columnFamily);


Does this mean we will create and then delete a column family if it doesn't exist and this method is called?

Author


As per the design doc:
"""
The SinkRecord value has a top level field set to null
Example: {foo: null}. This will result in all columns in the "foo" column family being deleted for this row.
"""

We create a column family in the table's schema and then delete all the cells within the row.
Yes, the "global" (table-level) creation is required only so that the Bigtable mutation request succeeds; that schema modification is not reverted by the column family deletion within a row.
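The family-deletion semantics described here can be sketched with plain data structures. All names below are hypothetical; the real connector builds Bigtable mutations rather than strings, and the actual record model is Kafka Connect's:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: translate a record value with top-level null fields
// into per-row mutation descriptions, per the design doc excerpt above.
public class NullFieldToDeletion {
    // Returns human-readable mutation ops for one row; a family whose value
    // is null becomes a "deleteFamily" op (all cells in that family, this row).
    static List<String> toMutations(Map<String, Object> recordValue) {
        List<String> ops = new ArrayList<>();
        for (Map.Entry<String, Object> e : recordValue.entrySet()) {
            if (e.getValue() == null) {
                // Requires the family to exist in the table schema, so the
                // connector may need to auto-create it first (hence the
                // requiredColumnFamilies bookkeeping in the snippet above).
                ops.add("deleteFamily:" + e.getKey());
            } else {
                ops.add("setCell:" + e.getKey());
            }
        }
        return ops;
    }
}
```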

*/
@VisibleForTesting
// It is generic so that we can test it with naturally ordered values easily.
static <K, V> Map<K, V> orderMap(Map<K, V> map, Collection<K> order) {


Can prepareRecords() do this? It could create a LinkedHashMap instead of a HashMap

Author


It could, but then we'd need to also return a LinkedHashMap from autoCreateTablesAndHandleErrors() and autoCreateColumnFamiliesAndHandleErrors(). I thought that it'd couple these functions too tightly. What do you think?
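A minimal sketch of such a generic orderMap, assuming the semantics implied by the discussion (keys absent from `order` are simply dropped; the actual implementation may differ):

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: reorder the entries of `map` to follow `order`, preserving
// that order via LinkedHashMap's insertion-order iteration.
public class OrderMapSketch {
    static <K, V> Map<K, V> orderMap(Map<K, V> map, Collection<K> order) {
        Map<K, V> result = new LinkedHashMap<>();
        for (K key : order) {
            if (map.containsKey(key)) {
                result.put(key, map.get(key));
            }
        }
        return result;
    }
}
```

Keeping this as a standalone generic helper (rather than baking a LinkedHashMap into prepareRecords()) is what allows testing it with naturally ordered values, as the code comment notes.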

@@ -461,7 +468,6 @@ static Optional<StatusCode> maybeExtractBigtableStatusCode(Throwable t) {
static boolean isStatusCodeCausedByInputError(StatusCode.Code code) {
switch (code) {
case INVALID_ARGUMENT:
case FAILED_PRECONDITION:


Why remove this error code?


I see the setRetryableCodes code in BigtableSinkConfig.java - I assume that's why this got removed. Couldn't a user input error still cause a failed precondition error?

Author


The docs state that it is a "user error", but also that "Depending on the circumstance, you might want to retry. For example, you can encounter this error if you attempt to perform operations on a table while it is still being created.".

I encountered this status code when creating different column families concurrently. The error message directly stated that it was due to concurrent table schema modification. That is why it needs to be retried in our case.

I also think that the other example error given by the docs, "attempt to perform operations on a table while it is still being created", should also be retried if encountered (whether due to a bug in our resource auto-creation or due to the user manually creating the table too late when the sink is used with resource auto-creation disabled).

Apart from that, I don't really know if a user input error could cause a failed precondition error - I didn't manage to find a direct answer in the docs. I only know that the invalid operations I test (such as invalid table name) cause INVALID_ARGUMENT and at least some of the FAILED_PRECONDITION errors should be retried.

Do you think we should read error messages and decide whether the exception is caused by user error depending on that? If so, do you have any pointers where to find a more or less complete list of the error messages?

Author


Also, I think we should err on the side of leniency when branding something a user error, since in that case (if the user configured a DLQ) we send the offending record to the DLQ with InvalidBigtableSchemaModificationException, which we describe in the docs as:

/**
 * An {@link Exception} that signifies that the input {@link org.apache.kafka.connect.sink.SinkRecord
 * SinkRecord(s)} caused an attempt at an invalid Cloud Bigtable schema modification and thus are
 * invalid and should not be retried.
 */
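The classification policy argued for in this thread could be sketched as follows. The enum here is a stand-in for gax's StatusCode.Code, and the exact set of codes treated as input errors is an assumption based on this discussion (only INVALID_ARGUMENT is shown as a user error after FAILED_PRECONDITION's removal):

```java
// Sketch only, not the real connector code.
public class StatusClassifierSketch {
    // Stand-in for com.google.api.gax.rpc.StatusCode.Code.
    enum Code { INVALID_ARGUMENT, FAILED_PRECONDITION, ALREADY_EXISTS, NOT_FOUND }

    static boolean isStatusCodeCausedByInputError(Code code) {
        switch (code) {
            case INVALID_ARGUMENT:
                return true;
            default:
                // FAILED_PRECONDITION lands here on purpose: it is observed for
                // concurrent schema modifications, which should be retried, and
                // the leniency policy above prefers retrying over sending a
                // record to the DLQ for a possibly transient condition.
                return false;
        }
    }
}
```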

.modifyColumnFamiliesSettings()
.setRetrySettings(adminApiWriteRetrySettings)
// Retry createTable() for status codes other admin operations retry by default as
// seen in BigtableTableAdminStubSettings and for FAILED_PRECONDITION which is


Won't FAILED_PRECONDITION continue to fail if the column family has already been added by another request?

Author


No, the caller then receives an ALREADY_EXISTS error, which is not retriable. awaitResourceCreationAndHandleInvalidInputErrors() logs that information and swallows the exception, after which refreshTableColumnFamiliesCache() fetches the table's schema, and the returned schema is used to check whether the column family creation succeeded.
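The create-then-verify flow described here can be sketched with a simulated schema set. All names are hypothetical stand-ins for the real admin calls and exception types:

```java
import java.util.Set;

// Sketch: attempt the column-family creation, swallow ALREADY_EXISTS, then
// confirm success by re-reading the table's schema rather than trusting the
// outcome of the create call itself.
public class EnsureColumnFamilySketch {
    static boolean ensureColumnFamily(Set<String> tableSchema, String family) {
        try {
            if (tableSchema.contains(family)) {
                // Simulates the server rejecting a duplicate creation.
                throw new IllegalStateException("ALREADY_EXISTS");
            }
            tableSchema.add(family);
        } catch (IllegalStateException alreadyExists) {
            // Logged and swallowed in the real code: creation by a concurrent
            // request still counts as success for our purposes.
        }
        // Equivalent of refreshTableColumnFamiliesCache(): check the schema.
        return tableSchema.contains(family);
    }
}
```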

* @param fields Fields that need to be accessed before the target value is reached.
* @return Extracted nested field.
*/
private Object extractField(Object value, Iterator<String> fields) {
private SchemaAndValue extractField(


nit: would it be cleaner to just pass a SchemaAndValue instead of a value and schema?

Author


Done.
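For illustration, nested-field extraction over a path iterator might look like this sketch, with nested Maps standing in for Kafka Connect Structs and SchemaAndValue (names hypothetical):

```java
import java.util.Iterator;
import java.util.Map;

// Sketch: walk one path segment at a time until the target value is reached.
public class ExtractFieldSketch {
    @SuppressWarnings("unchecked")
    static Object extractField(Object value, Iterator<String> fields) {
        while (fields.hasNext()) {
            String field = fields.next();
            value = ((Map<String, Object>) value).get(field);
        }
        return value;
    }
}
```

Passing a single SchemaAndValue-like pair instead of separate value and schema arguments, as suggested in the review, keeps the two from drifting apart across recursive calls.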

props.put(BigtableSinkConfig.AUTO_CREATE_TABLES_CONFIG, "true");
props.put(BigtableSinkConfig.AUTO_CREATE_COLUMN_FAMILIES_CONFIG, "true");
props.put(BigtableSinkConfig.INSERT_MODE_CONFIG, InsertMode.INSERT.name());
props.put(BigtableSinkConfig.ERROR_MODE_CONFIG, BigtableErrorMode.IGNORE.name());


Can we test that an error occurs in some way?

Author


Done.

Map<ByteString, Row> rows = readAllRows(bigtableData, testId);
Row row1 = rows.get(KEY1_BYTES);
Row row2 = rows.get(KEY2_BYTES);
assertEquals(1, row1.getCells().size());


can we test that VALUE1 is set and not VALUE2?

Author


Done.

public static final ByteString NESTED_NULL_STRUCT_FIELD_NAME_BYTES =
ByteString.copyFrom(NESTED_NULL_STRUCT_FIELD_NAME.getBytes(StandardCharsets.UTF_8));

public static Struct getStructhWithNullOnNthNestingLevel(int n) {


nit: Structh typo

Author


Done.

public void testCreationOfInvalidTable() throws InterruptedException {
String dlqTopic = createDlq();
Map<String, String> props = baseConnectorProps();
String invalidTableName = "T".repeat(100);


nit: MAX_BIGTABLE_TABLE_NAME_LENGTH + 1 instead of 100 to make it more clear why this is invalid

Author


Done.

}

@Test
public void testDeletionFailsWhenAutoCreationDisabled() throws InterruptedException {


is this a desired behavior?

Author


It's a test documenting the consequences of design choices described in comments in MutationDataBuilder (in deleteFamily() and deleteCells()), so I think it is the desired behavior. The alternative is described in those comments and is, in my opinion, worse.

I added a comment to make that connection explicit.

@prawilny
Author

@brandtnewton, please take another look.
I responded to all your comments and resolved the conversations that were finished (but I left open the one you didn't mark as resolved, since it still seems relevant and you might want to take another look at it).
I also created a small README about running integration tests.

Beware: the integration tests are a bit flaky right now.
I'm going to fix that right after preparing the performance tests.
The problematic code path is: auto creation of a table => auto creation of a column family => access to that created column family. Sometimes that access receives a NOT_FOUND status code and the test fails.


@brandtnewton brandtnewton left a comment


Great doc. Thanks for adding it!

@prawilny
Author

@brandtnewton, please take another look.
The changes mostly concern the performance test setup and the compatibility tests, but a few lines (documentation strings) also changed in the library code.
