diff --git a/airbyte-cdk/java/airbyte-cdk/README.md b/airbyte-cdk/java/airbyte-cdk/README.md
index 00e3302a7b71..afda18bc88ae 100644
--- a/airbyte-cdk/java/airbyte-cdk/README.md
+++ b/airbyte-cdk/java/airbyte-cdk/README.md
@@ -172,284 +172,285 @@ corresponds to that version.
 ### Java CDK
-| Version | Date | Pull Request | Subject |
-|:------------|:-----------| :--------------------------------------------------------- |:---------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Version | Date | Pull Request | Subject |
+|:-----------|:-----------|:------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| 0.44.0 | 2024-08-01 | [\#42405](https://github.com/airbytehq/airbyte/pull/42405) | s3-destinations: Use async framework, adapt to support refreshes |
 | 0.43.6 | 2024-07-30 | [\#42540](https://github.com/airbytehq/airbyte/pull/42540) | Fix generationId handling for destinations |
 | 0.43.6 | 2024-07-30 | [\#42514](https://github.com/airbytehq/airbyte/pull/42514) | Add tests around generationId handling for destinations. |
-| 0.43.4 | 2024-07-28 | [\#42839](https://github.com/airbytehq/airbyte/pull/42839) | Fix error translation framework to not rethrow ConfigErrorException and TransientErrorException. |
-| 0.43.3 | 2024-07-22 | [\#42417](https://github.com/airbytehq/airbyte/pull/42417) | Handle null exception message in ConnectorExceptionHandler. 
| -| 0.43.2 | 2024-07-22 | [\#42431](https://github.com/airbytehq/airbyte/pull/42431) | Filter out debezium message change events | -| 0.43.1 | 2024-07-22 | [\#41622](https://github.com/airbytehq/airbyte/pull/41622) | Fix null safety bug in debezium event processing | -| 0.43.0 | 2024-07-17 | [\#41954](https://github.com/airbytehq/airbyte/pull/41954) | fix refreshes for connectors using the old SqlOperations | -| 0.43.0 | 2024-07-17 | [\#42017](https://github.com/airbytehq/airbyte/pull/42017) | bump postgres-jdbc version | -| 0.43.0 | 2024-07-17 | [\#42015](https://github.com/airbytehq/airbyte/pull/42015) | wait until migration before creating the Writeconfig objects | -| 0.43.0 | 2024-07-17 | [\#41953](https://github.com/airbytehq/airbyte/pull/41953) | add generationId and syncId to SqlOperations functions | -| 0.43.0 | 2024-07-17 | [\#41952](https://github.com/airbytehq/airbyte/pull/41952) | rename and add fields in WriteConfig | -| 0.43.0 | 2024-07-17 | [\#41951](https://github.com/airbytehq/airbyte/pull/41951) | remove nullables in JdbcBufferedConsumerFactory | -| 0.43.0 | 2024-07-17 | [\#41950](https://github.com/airbytehq/airbyte/pull/41950) | remove unused classes | -| 0.42.2 | 2024-07-21 | [\#42122](https://github.com/airbytehq/airbyte/pull/42122) | Support for Debezium resync and shutdown scenarios. | -| 0.42.2 | 2024-07-04 | [\#40208](https://github.com/airbytehq/airbyte/pull/40208) | Implement a new connector error handling and translation framework | -| 0.41.8 | 2024-07-18 | [\#42068](https://github.com/airbytehq/airbyte/pull/42068) | Add analytics message for WASS occurrence. | -| 0.41.7 | 2024-07-17 | [\#42055](https://github.com/airbytehq/airbyte/pull/42055) | Add debezium heartbeat timeout back to shutdown debezium. | -| 0.41.6 | 2024-07-17 | [\#41996](https://github.com/airbytehq/airbyte/pull/41996) | Fix java interop compilation issue in Config/TransientErrorException. 
| -| 0.41.5 | 2024-07-16 | [\#42011] (https://github.com/airbytehq/airbyte/pull/42011) | Async consumer accepts null default namespace | -| 0.41.4 | 2024-07-15 | [\#41959](https://github.com/airbytehq/airbyte/pull/41959) | Allow setting `internal_message` in Config/TransientErrorException. Destinations: shorten error message for INCOMPLETE stream status. | -| 0.41.3 | 2024-07-15 | [\#41680](https://github.com/airbytehq/airbyte/pull/41680) | Fix: CompletableFutures.allOf now handles empty list and `Throwable` | -| 0.41.2 | 2024-07-12 | [\#40567](https://github.com/airbytehq/airbyte/pull/40567) | Fix BaseSqlGenerator test case (generation_id support); update minimum platform version for refreshes support. | -| 0.41.1 | 2024-07-11 | [\#41212](https://github.com/airbytehq/airbyte/pull/41212) | Improve debezium logging. | -| 0.41.0 | 2024-07-11 | [\#38240](https://github.com/airbytehq/airbyte/pull/38240) | Sources : Changes in CDC interfaces to support WASS algorithm | -| 0.40.11 | 2024-07-08 | [\#41041](https://github.com/airbytehq/airbyte/pull/41041) | Destinations: Fix truncate refreshes incorrectly discarding data if successful attempt had 0 records | -| 0.40.10 | 2024-07-05 | [\#40719](https://github.com/airbytehq/airbyte/pull/40719) | Update test to refrlect isResumable field in catalog | -| 0.40.9 | 2024-07-01 | [\#39473](https://github.com/airbytehq/airbyte/pull/39473) | minor changes around error logging and testing | -| 0.40.8 | 2024-07-01 | [\#40499](https://github.com/airbytehq/airbyte/pull/40499) | Make JdbcDatabase SQL statement logging optional; add generation_id support to JdbcSqlGenerator | -| 0.40.7 | 2024-07-01 | [\#40516](https://github.com/airbytehq/airbyte/pull/40516) | Remove dbz hearbeat. 
| -| ~~0.40.6~~ | | | (this version does not exist) | -| 0.40.5 | 2024-06-26 | [\#40517](https://github.com/airbytehq/airbyte/pull/40517) | JdbcDatabase.executeWithinTransaction allows disabling SQL statement logging | -| 0.40.4 | 2024-06-18 | [\#40254](https://github.com/airbytehq/airbyte/pull/40254) | Destinations: Do not throw on unrecognized airbyte message type (ignore message instead) | -| 0.40.3 | 2024-06-18 | [\#39526](https://github.com/airbytehq/airbyte/pull/39526) | Destinations: INCOMPLETE stream status is a TRANSIENT error rather than SYSTEM | -| 0.40.2 | 2024-06-18 | [\#39552](https://github.com/airbytehq/airbyte/pull/39552) | Destinations: Throw error if the ConfiguredCatalog has no streams | -| 0.40.1 | 2024-06-14 | [\#39349](https://github.com/airbytehq/airbyte/pull/39349) | Source stats for full refresh streams | -| 0.40.0 | 2024-06-17 | [\#38622](https://github.com/airbytehq/airbyte/pull/38622) | Destinations: Implement refreshes logic in AbstractStreamOperation | -| 0.39.0 | 2024-06-17 | [\#38067](https://github.com/airbytehq/airbyte/pull/38067) | Destinations: Breaking changes for refreshes (fail on INCOMPLETE stream status; ignore OVERWRITE sync mode) | -| 0.38.3 | 2024-06-25 | [\#40499](https://github.com/airbytehq/airbyte/pull/40499) | (backport) Make JdbcDatabase SQL statement logging optional; add generation_id support to JdbcSqlGenerator | -| 0.38.2 | 2024-06-14 | [\#39460](https://github.com/airbytehq/airbyte/pull/39460) | Bump postgres JDBC driver version | -| 0.38.1 | 2024-06-13 | [\#39445](https://github.com/airbytehq/airbyte/pull/39445) | Sources: More CDK changes to handle big initial snapshots. 
| -| 0.38.0 | 2024-06-11 | [\#39405](https://github.com/airbytehq/airbyte/pull/39405) | Sources: Debezium properties manager interface changed to accept a list of streams to scope to | -| 0.37.1 | 2024-06-10 | [\#38075](https://github.com/airbytehq/airbyte/pull/38075) | Destinations: Track stream statuses in async framework | -| 0.37.0 | 2024-06-10 | [\#38121](https://github.com/airbytehq/airbyte/pull/38121) | Destinations: Set default namespace via CatalogParser | -| 0.36.8 | 2024-06-07 | [\#38763](https://github.com/airbytehq/airbyte/pull/38763) | Increase Jackson message length limit | -| 0.36.7 | 2024-06-06 | [\#39220](https://github.com/airbytehq/airbyte/pull/39220) | Handle null messages in ConnectorExceptionUtil | -| 0.36.6 | 2024-06-05 | [\#39106](https://github.com/airbytehq/airbyte/pull/39106) | Skip write to storage with 0 byte file | -| 0.36.5 | 2024-06-01 | [\#38792](https://github.com/airbytehq/airbyte/pull/38792) | Throw config exception if no selectable table exists in user provided schemas | -| 0.36.4 | 2024-05-31 | [\#38824](https://github.com/airbytehq/airbyte/pull/38824) | Param marked as non-null to nullable in JdbcDestinationHandler for NPE fix | -| 0.36.2 | 2024-05-29 | [\#38538](https://github.com/airbytehq/airbyte/pull/38357) | Exit connector when encountering a config error. 
| -| 0.36.0 | 2024-05-29 | [\#38358](https://github.com/airbytehq/airbyte/pull/38358) | Plumb generation_id / sync_id to destinations code | -| 0.35.16 | 2024-06-25 | [\#40517](https://github.com/airbytehq/airbyte/pull/40517) | (backport) JdbcDatabase.executeWithinTransaction allows disabling SQL statement logging | -| 0.35.15 | 2024-05-31 | [\#38824](https://github.com/airbytehq/airbyte/pull/38824) | Param marked as non-null to nullable in JdbcDestinationHandler for NPE fix | -| 0.35.14 | 2024-05-28 | [\#38738](https://github.com/airbytehq/airbyte/pull/38738) | make ThreadCreationInfo cast as nullable | -| 0.35.13 | 2024-05-28 | [\#38632](https://github.com/airbytehq/airbyte/pull/38632) | minor changes to allow conversion of snowflake tests to kotlin | -| 0.35.12 | 2024-05-23 | [\#38638](https://github.com/airbytehq/airbyte/pull/38638) | Minor change to support Snowflake conversion to Kotlin | -| 0.35.11 | 2024-05-23 | [\#38357](https://github.com/airbytehq/airbyte/pull/38357) | This release fixes an error on the previous release. | -| 0.35.10 | 2024-05-23 | [\#38357](https://github.com/airbytehq/airbyte/pull/38357) | Add shared code for db sources stream status trace messages and testing. 
| -| 0.35.9 | 2024-05-23 | [\#38586](https://github.com/airbytehq/airbyte/pull/38586) | code cleanup | -| 0.35.9 | 2024-05-23 | [\#37583](https://github.com/airbytehq/airbyte/pull/37583) | code cleanup | -| 0.35.9 | 2024-05-23 | [\#37555](https://github.com/airbytehq/airbyte/pull/37555) | code cleanup | -| 0.35.9 | 2024-05-23 | [\#37540](https://github.com/airbytehq/airbyte/pull/37540) | code cleanup | -| 0.35.9 | 2024-05-23 | [\#37539](https://github.com/airbytehq/airbyte/pull/37539) | code cleanup | -| 0.35.9 | 2024-05-23 | [\#37538](https://github.com/airbytehq/airbyte/pull/37538) | code cleanup | -| 0.35.9 | 2024-05-23 | [\#37537](https://github.com/airbytehq/airbyte/pull/37537) | code cleanup | -| 0.35.9 | 2024-05-23 | [\#37518](https://github.com/airbytehq/airbyte/pull/37518) | code cleanup | -| 0.35.8 | 2024-05-22 | [\#38572](https://github.com/airbytehq/airbyte/pull/38572) | Add a temporary static method to decouple SnowflakeDestination from AbstractJdbcDestination | -| 0.35.7 | 2024-05-20 | [\#38357](https://github.com/airbytehq/airbyte/pull/38357) | Decouple create namespace from per stream operation interface. 
| -| 0.35.6 | 2024-05-17 | [\#38107](https://github.com/airbytehq/airbyte/pull/38107) | New interfaces for Destination connectors to plug into AsyncStreamConsumer | -| 0.35.5 | 2024-05-17 | [\#38204](https://github.com/airbytehq/airbyte/pull/38204) | add assume-role authentication to s3 | -| 0.35.2 | 2024-05-13 | [\#38104](https://github.com/airbytehq/airbyte/pull/38104) | Handle transient error messages | -| 0.35.0 | 2024-05-13 | [\#38127](https://github.com/airbytehq/airbyte/pull/38127) | Destinations: Populate generation/sync ID on StreamConfig | -| 0.34.4 | 2024-05-10 | [\#37712](https://github.com/airbytehq/airbyte/pull/37712) | make sure the exceptionHandler always terminates | -| 0.34.3 | 2024-05-10 | [\#38095](https://github.com/airbytehq/airbyte/pull/38095) | Minor changes for databricks connector | -| 0.34.1 | 2024-05-07 | [\#38030](https://github.com/airbytehq/airbyte/pull/38030) | Add support for transient errors | -| 0.34.0 | 2024-05-01 | [\#37712](https://github.com/airbytehq/airbyte/pull/37712) | Destinations: Remove incremental T+D | -| 0.33.2 | 2024-05-03 | [\#37824](https://github.com/airbytehq/airbyte/pull/37824) | improve source acceptance tests | -| 0.33.1 | 2024-05-03 | [\#37824](https://github.com/airbytehq/airbyte/pull/37824) | Add a unit test for cursor based sync | -| 0.33.0 | 2024-05-03 | [\#36935](https://github.com/airbytehq/airbyte/pull/36935) | Destinations: Enable non-safe-casting DV2 tests | -| 0.32.0 | 2024-05-03 | [\#36929](https://github.com/airbytehq/airbyte/pull/36929) | Destinations: Assorted DV2 changes for mysql | -| 0.31.7 | 2024-05-02 | [\#36910](https://github.com/airbytehq/airbyte/pull/36910) | changes for destination-snowflake | -| 0.31.6 | 2024-05-02 | [\#37746](https://github.com/airbytehq/airbyte/pull/37746) | debuggability improvements. 
| -| 0.31.5 | 2024-04-30 | [\#37758](https://github.com/airbytehq/airbyte/pull/37758) | Set debezium max retries to zero | -| 0.31.4 | 2024-04-30 | [\#37754](https://github.com/airbytehq/airbyte/pull/37754) | Add DebeziumEngine notification log | -| 0.31.3 | 2024-04-30 | [\#37726](https://github.com/airbytehq/airbyte/pull/37726) | Remove debezium retries | -| 0.31.2 | 2024-04-30 | [\#37507](https://github.com/airbytehq/airbyte/pull/37507) | Better error messages when switching between global/per-stream modes. | -| 0.31.0 | 2024-04-26 | [\#37584](https://github.com/airbytehq/airbyte/pull/37584) | Update S3 destination deps to exclude zookeeper and hadoop-yarn-common | -| 0.30.11 | 2024-04-25 | [\#36899](https://github.com/airbytehq/airbyte/pull/36899) | changes for bigQuery destination. | -| 0.30.10 | 2024-04-24 | [\#37541](https://github.com/airbytehq/airbyte/pull/37541) | remove excessive logging | -| 0.30.9 | 2024-04-24 | [\#37477](https://github.com/airbytehq/airbyte/pull/37477) | remove unnecessary logs | -| 0.30.7 | 2024-04-23 | [\#37477](https://github.com/airbytehq/airbyte/pull/37477) | fix kotlin warnings in core CDK submodule | -| 0.30.7 | 2024-04-23 | [\#37484](https://github.com/airbytehq/airbyte/pull/37484) | fix kotlin warnings in dependencies CDK submodule | -| 0.30.7 | 2024-04-23 | [\#37479](https://github.com/airbytehq/airbyte/pull/37479) | fix kotlin warnings in azure-destination, datastore-{bigquery,mongo,postgres} CDK submodules | -| 0.30.7 | 2024-04-23 | [\#37481](https://github.com/airbytehq/airbyte/pull/37481) | fix kotlin warnings in destination CDK submodules | -| 0.30.7 | 2024-04-23 | [\#37482](https://github.com/airbytehq/airbyte/pull/37482) | fix kotlin warnings in db-sources CDK submodule | -| 0.30.6 | 2024-04-19 | [\#37442](https://github.com/airbytehq/airbyte/pull/37442) | Destinations: Rename File format related classes to be agnostic of S3 | -| 0.30.3 | 2024-04-12 | [\#37106](https://github.com/airbytehq/airbyte/pull/37106) | 
Destinations: Simplify constructors in `AsyncStreamConsumer` | -| 0.30.2 | 2024-04-12 | [\#36926](https://github.com/airbytehq/airbyte/pull/36926) | Destinations: Remove `JdbcSqlOperations#formatData`; misc changes for java interop | -| 0.30.1 | 2024-04-11 | [\#36919](https://github.com/airbytehq/airbyte/pull/36919) | Fix regression in sources conversion of null values | -| 0.30.0 | 2024-04-11 | [\#36974](https://github.com/airbytehq/airbyte/pull/36974) | Destinations: Pass config to jdbc sqlgenerator; allow cascade drop | -| 0.29.13 | 2024-04-10 | [\#36981](https://github.com/airbytehq/airbyte/pull/36981) | DB sources : Emit analytics for data type serialization errors. | -| 0.29.12 | 2024-04-10 | [\#36973](https://github.com/airbytehq/airbyte/pull/36973) | Destinations: Make flush batch size configurable for JdbcInsertFlush | -| 0.29.11 | 2024-04-10 | [\#36865](https://github.com/airbytehq/airbyte/pull/36865) | Sources : Remove noisy log line. | -| 0.29.10 | 2024-04-10 | [\#36805](https://github.com/airbytehq/airbyte/pull/36805) | Destinations: Enhance CatalogParser name collision handling; add DV2 tests for long identifiers | -| 0.29.9 | 2024-04-09 | [\#36047](https://github.com/airbytehq/airbyte/pull/36047) | Destinations: CDK updates for raw-only destinations | -| 0.29.8 | 2024-04-08 | [\#36868](https://github.com/airbytehq/airbyte/pull/36868) | Destinations: s3-destinations Compilation fixes for connector | -| 0.29.7 | 2024-04-08 | [\#36768](https://github.com/airbytehq/airbyte/pull/36768) | Destinations: Make destination state fetch/commit logic more resilient to errors | -| 0.29.6 | 2024-04-05 | [\#36577](https://github.com/airbytehq/airbyte/pull/36577) | Do not send system_error trace message for config exceptions. 
| -| 0.29.5 | 2024-04-05 | [\#36620](https://github.com/airbytehq/airbyte/pull/36620) | Missed changes - open for extension for destination-postgres | -| 0.29.3 | 2024-04-04 | [\#36759](https://github.com/airbytehq/airbyte/pull/36759) | Minor fixes. | -| 0.29.3 | 2024-04-04 | [\#36706](https://github.com/airbytehq/airbyte/pull/36706) | Enabling spotbugs for s3-destination. | -| 0.29.3 | 2024-04-03 | [\#36705](https://github.com/airbytehq/airbyte/pull/36705) | Enabling spotbugs for db-sources. | -| 0.29.3 | 2024-04-03 | [\#36704](https://github.com/airbytehq/airbyte/pull/36704) | Enabling spotbugs for datastore-postgres. | -| 0.29.3 | 2024-04-03 | [\#36703](https://github.com/airbytehq/airbyte/pull/36703) | Enabling spotbugs for gcs-destination. | -| 0.29.3 | 2024-04-03 | [\#36702](https://github.com/airbytehq/airbyte/pull/36702) | Enabling spotbugs for db-destinations. | -| 0.29.3 | 2024-04-03 | [\#36701](https://github.com/airbytehq/airbyte/pull/36701) | Enabling spotbugs for typing_and_deduping. | -| 0.29.3 | 2024-04-03 | [\#36612](https://github.com/airbytehq/airbyte/pull/36612) | Enabling spotbugs for dependencies. | -| 0.29.5 | 2024-04-05 | [\#36577](https://github.com/airbytehq/airbyte/pull/36577) | Do not send system_error trace message for config exceptions. | -| 0.29.3 | 2024-04-04 | [\#36759](https://github.com/airbytehq/airbyte/pull/36759) | Minor fixes. | -| 0.29.3 | 2024-04-04 | [\#36706](https://github.com/airbytehq/airbyte/pull/36706) | Enabling spotbugs for s3-destination. | -| 0.29.3 | 2024-04-03 | [\#36705](https://github.com/airbytehq/airbyte/pull/36705) | Enabling spotbugs for db-sources. | -| 0.29.3 | 2024-04-03 | [\#36704](https://github.com/airbytehq/airbyte/pull/36704) | Enabling spotbugs for datastore-postgres. | -| 0.29.3 | 2024-04-03 | [\#36703](https://github.com/airbytehq/airbyte/pull/36703) | Enabling spotbugs for gcs-destination. 
| -| 0.29.3 | 2024-04-03 | [\#36702](https://github.com/airbytehq/airbyte/pull/36702) | Enabling spotbugs for db-destinations. | -| 0.29.3 | 2024-04-03 | [\#36701](https://github.com/airbytehq/airbyte/pull/36701) | Enabling spotbugs for typing_and_deduping. | -| 0.29.3 | 2024-04-03 | [\#36612](https://github.com/airbytehq/airbyte/pull/36612) | Enabling spotbugs for dependencies. | -| 0.29.2 | 2024-04-04 | [\#36845](https://github.com/airbytehq/airbyte/pull/36772) | Changes to make source-mongo compileable | -| 0.29.1 | 2024-04-03 | [\#36772](https://github.com/airbytehq/airbyte/pull/36772) | Changes to make source-mssql compileable | -| 0.29.0 | 2024-04-02 | [\#36759](https://github.com/airbytehq/airbyte/pull/36759) | Build artifact publication changes and fixes. | -| 0.28.21 | 2024-04-02 | [\#36673](https://github.com/airbytehq/airbyte/pull/36673) | Change the destination message parsing to use standard java/kotlin classes. Adds logging to catch empty lines. | -| 0.28.20 | 2024-04-01 | [\#36584](https://github.com/airbytehq/airbyte/pull/36584) | Changes to make source-postgres compileable | -| 0.28.19 | 2024-03-29 | [\#36619](https://github.com/airbytehq/airbyte/pull/36619) | Changes to make destination-postgres compileable | -| 0.28.19 | 2024-03-29 | [\#36588](https://github.com/airbytehq/airbyte/pull/36588) | Changes to make destination-redshift compileable | -| 0.28.19 | 2024-03-29 | [\#36610](https://github.com/airbytehq/airbyte/pull/36610) | remove airbyte-api generation, pull depdendency jars instead | -| 0.28.19 | 2024-03-29 | [\#36611](https://github.com/airbytehq/airbyte/pull/36611) | disable spotbugs for CDK tes and testFixtures tasks | -| 0.28.18 | 2024-03-28 | [\#36606](https://github.com/airbytehq/airbyte/pull/36574) | disable spotbugs for CDK tes and testFixtures tasks | -| 0.28.18 | 2024-03-28 | [\#36574](https://github.com/airbytehq/airbyte/pull/36574) | Fix ContainerFactory | -| 0.28.18 | 2024-03-27 | 
[\#36570](https://github.com/airbytehq/airbyte/pull/36570) | Convert missing s3-destinations tests to Kotlin | -| 0.28.18 | 2024-03-27 | [\#36446](https://github.com/airbytehq/airbyte/pull/36446) | Convert dependencies submodule to Kotlin | -| 0.28.18 | 2024-03-27 | [\#36445](https://github.com/airbytehq/airbyte/pull/36445) | Convert functional out Checked interfaces to kotlin | -| 0.28.18 | 2024-03-27 | [\#36444](https://github.com/airbytehq/airbyte/pull/36444) | Use apache-commons classes in our Checked functional interfaces | -| 0.28.18 | 2024-03-27 | [\#36467](https://github.com/airbytehq/airbyte/pull/36467) | Convert #36465 to Kotlin | -| 0.28.18 | 2024-03-27 | [\#36473](https://github.com/airbytehq/airbyte/pull/36473) | Convert convert #36396 to Kotlin | -| 0.28.18 | 2024-03-27 | [\#36439](https://github.com/airbytehq/airbyte/pull/36439) | Convert db-destinations submodule to Kotlin | -| 0.28.18 | 2024-03-27 | [\#36438](https://github.com/airbytehq/airbyte/pull/36438) | Convert db-sources submodule to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36437](https://github.com/airbytehq/airbyte/pull/36437) | Convert gsc submodule to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36421](https://github.com/airbytehq/airbyte/pull/36421) | Convert typing-deduping submodule to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36420](https://github.com/airbytehq/airbyte/pull/36420) | Convert s3-destinations submodule to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36419](https://github.com/airbytehq/airbyte/pull/36419) | Convert azure submodule to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36413](https://github.com/airbytehq/airbyte/pull/36413) | Convert postgres submodule to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36412](https://github.com/airbytehq/airbyte/pull/36412) | Convert mongodb submodule to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36411](https://github.com/airbytehq/airbyte/pull/36411) | Convert datastore-bigquery submodule to Kotlin | -| 0.28.18 | 2024-03-26 | 
[\#36205](https://github.com/airbytehq/airbyte/pull/36205) | Convert core/main to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36204](https://github.com/airbytehq/airbyte/pull/36204) | Convert core/test to Kotlin | -| 0.28.18 | 2024-03-26 | [\#36190](https://github.com/airbytehq/airbyte/pull/36190) | Convert core/testFixtures to Kotlin | -| 0.28.0 | 2024-03-26 | [\#36514](https://github.com/airbytehq/airbyte/pull/36514) | Bump CDK version to 0.28.0 | -| 0.27.7 | 2024-03-26 | [\#36466](https://github.com/airbytehq/airbyte/pull/36466) | Destinations: fix support for case-sensitive fields in destination state. | -| 0.27.6 | 2024-03-26 | [\#36432](https://github.com/airbytehq/airbyte/pull/36432) | Sources support for AirbyteRecordMessageMeta during reading source data types. | -| 0.27.5 | 2024-03-25 | [\#36461](https://github.com/airbytehq/airbyte/pull/36461) | Destinations: Handle case-sensitive columns in destination state handling. | -| 0.27.4 | 2024-03-25 | [\#36333](https://github.com/airbytehq/airbyte/pull/36333) | Sunset DebeziumSourceDecoratingIterator. | -| 0.27.1 | 2024-03-22 | [\#36296](https://github.com/airbytehq/airbyte/pull/36296) | Destinations: (async framework) Do not log invalid message data. | -| 0.27.0 | 2024-03-21 | [\#36364](https://github.com/airbytehq/airbyte/pull/36364) | Sources: Increase debezium initial record wait time to 40 minute. | -| 0.26.1 | 2024-03-19 | [\#35599](https://github.com/airbytehq/airbyte/pull/35599) | Sunset SourceDecoratingIterator. | -| 0.26.0 | 2024-03-19 | [\#36263](https://github.com/airbytehq/airbyte/pull/36263) | Improve conversion of debezium Date type for some edge case in mssql. 
| -| 0.25.0 | 2024-03-18 | [\#36203](https://github.com/airbytehq/airbyte/pull/36203) | Wiring of Transformer to StagingConsumerFactory and JdbcBufferedConsumerFactory; import changes for Kotlin conversion; State message logs to debug | -| 0.24.1 | 2024-03-13 | [\#36022](https://github.com/airbytehq/airbyte/pull/36022) | Move log4j2-test.xml to test fixtures, away from runtime classpath. | -| 0.24.0 | 2024-03-13 | [\#35944](https://github.com/airbytehq/airbyte/pull/35944) | Add `_airbyte_meta` in raw table and test fixture updates | -| 0.23.20 | 2024-03-12 | [\#36011](https://github.com/airbytehq/airbyte/pull/36011) | Debezium configuration for conversion of null value on a column with default value. | -| 0.23.19 | 2024-03-11 | [\#35904](https://github.com/airbytehq/airbyte/pull/35904) | Add retries to the debezium engine. | -| 0.23.18 | 2024-03-07 | [\#35899](https://github.com/airbytehq/airbyte/pull/35899) | Null check when retrieving destination state | -| 0.23.16 | 2024-03-06 | [\#35842](https://github.com/airbytehq/airbyte/pull/35842) | Improve logging in debezium processing. | -| 0.23.15 | 2024-03-05 | [\#35827](https://github.com/airbytehq/airbyte/pull/35827) | improving the Junit interceptor. | -| 0.23.14 | 2024-03-05 | [\#35739](https://github.com/airbytehq/airbyte/pull/35739) | Add logging to the CDC queue size. Fix the ContainerFactory. | -| 0.23.13 | 2024-03-04 | [\#35774](https://github.com/airbytehq/airbyte/pull/35774) | minor changes to the CDK test fixtures. | -| 0.23.12 | 2024-03-01 | [\#35767](https://github.com/airbytehq/airbyte/pull/35767) | introducing a timeout for java tests. 
| -| 0.23.11 | 2024-03-01 | [\#35313](https://github.com/airbytehq/airbyte/pull/35313) | Preserve timezone offset in CSV writer for destinations | -| 0.23.10 | 2024-03-01 | [\#35303](https://github.com/airbytehq/airbyte/pull/35303) | Migration framework with DestinationState for softReset | -| 0.23.9 | 2024-02-29 | [\#35720](https://github.com/airbytehq/airbyte/pull/35720) | various improvements for tests TestDataHolder | -| 0.23.8 | 2024-02-28 | [\#35529](https://github.com/airbytehq/airbyte/pull/35529) | Refactor on state iterators | -| 0.23.7 | 2024-02-28 | [\#35376](https://github.com/airbytehq/airbyte/pull/35376) | Extract typereduper migrations to separte method | -| 0.23.6 | 2024-02-26 | [\#35647](https://github.com/airbytehq/airbyte/pull/35647) | Add a getNamespace into TestDataHolder | -| 0.23.5 | 2024-02-26 | [\#35512](https://github.com/airbytehq/airbyte/pull/35512) | Remove @DisplayName from all CDK tests. | -| 0.23.4 | 2024-02-26 | [\#35507](https://github.com/airbytehq/airbyte/pull/35507) | Add more logs into TestDatabase. | -| 0.23.3 | 2024-02-26 | [\#35495](https://github.com/airbytehq/airbyte/pull/35495) | Fix Junit Interceptor to print better stacktraces | -| 0.23.2 | 2024-02-22 | [\#35385](https://github.com/airbytehq/airbyte/pull/35342) | Bugfix: inverted logic of disableTypeDedupe flag | -| 0.23.1 | 2024-02-22 | [\#35527](https://github.com/airbytehq/airbyte/pull/35527) | reduce shutdow timeouts | -| 0.23.0 | 2024-02-22 | [\#35342](https://github.com/airbytehq/airbyte/pull/35342) | Consolidate and perform upfront gathering of DB metadata state | -| 0.21.4 | 2024-02-21 | [\#35511](https://github.com/airbytehq/airbyte/pull/35511) | Reduce CDC state compression limit to 1MB | -| 0.21.3 | 2024-02-20 | [\#35394](https://github.com/airbytehq/airbyte/pull/35394) | Add Junit progress information to the test logs | -| 0.21.2 | 2024-02-20 | [\#34978](https://github.com/airbytehq/airbyte/pull/34978) | Reduce log noise in NormalizationLogParser. 
| -| 0.21.1 | 2024-02-20 | [\#35199](https://github.com/airbytehq/airbyte/pull/35199) | Add thread names to the logs. | -| 0.21.0 | 2024-02-16 | [\#35314](https://github.com/airbytehq/airbyte/pull/35314) | Delete S3StreamCopier classes. These have been superseded by the async destinations framework. | -| 0.20.9 | 2024-02-15 | [\#35240](https://github.com/airbytehq/airbyte/pull/35240) | Make state emission to platform inside state manager itself. | -| 0.20.8 | 2024-02-15 | [\#35285](https://github.com/airbytehq/airbyte/pull/35285) | Improve blobstore module structure. | -| 0.20.7 | 2024-02-13 | [\#35236](https://github.com/airbytehq/airbyte/pull/35236) | output logs to files in addition to stdout when running tests | -| 0.20.6 | 2024-02-12 | [\#35036](https://github.com/airbytehq/airbyte/pull/35036) | Add trace utility to emit analytics messages. | -| 0.20.5 | 2024-02-13 | [\#34869](https://github.com/airbytehq/airbyte/pull/34869) | Don't emit final state in SourceStateIterator there is an underlying stream failure. | -| 0.20.4 | 2024-02-12 | [\#35042](https://github.com/airbytehq/airbyte/pull/35042) | Use delegate's isDestinationV2 invocation in SshWrappedDestination. | -| 0.20.3 | 2024-02-09 | [\#34580](https://github.com/airbytehq/airbyte/pull/34580) | Support special chars in mysql/mssql database name. | -| 0.20.2 | 2024-02-12 | [\#35111](https://github.com/airbytehq/airbyte/pull/35144) | Make state emission from async framework synchronized. | -| 0.20.1 | 2024-02-11 | [\#35111](https://github.com/airbytehq/airbyte/pull/35111) | Fix GlobalAsyncStateManager stats counting logic. | -| 0.20.0 | 2024-02-09 | [\#34562](https://github.com/airbytehq/airbyte/pull/34562) | Add new test cases to BaseTypingDedupingTest to exercise special characters. | -| 0.19.0 | 2024-02-01 | [\#34745](https://github.com/airbytehq/airbyte/pull/34745) | Reorganize CDK module structure. 
| -| 0.18.0 | 2024-02-08 | [\#33606](https://github.com/airbytehq/airbyte/pull/33606) | Add updated Initial and Incremental Stream State definitions for DB Sources. | -| 0.17.1 | 2024-02-08 | [\#35027](https://github.com/airbytehq/airbyte/pull/35027) | Make state handling thread safe in async destination framework. | -| 0.17.0 | 2024-02-08 | [\#34502](https://github.com/airbytehq/airbyte/pull/34502) | Enable configuring async destination batch size. | -| 0.16.6 | 2024-02-07 | [\#34892](https://github.com/airbytehq/airbyte/pull/34892) | Improved testcontainers logging and support for unshared containers. | -| 0.16.5 | 2024-02-07 | [\#34948](https://github.com/airbytehq/airbyte/pull/34948) | Fix source state stats counting logic | -| 0.16.4 | 2024-02-01 | [\#34727](https://github.com/airbytehq/airbyte/pull/34727) | Add future based stdout consumer in BaseTypingDedupingTest | -| 0.16.3 | 2024-01-30 | [\#34669](https://github.com/airbytehq/airbyte/pull/34669) | Fix org.apache.logging.log4j:log4j-slf4j-impl version conflicts. | -| 0.16.2 | 2024-01-29 | [\#34630](https://github.com/airbytehq/airbyte/pull/34630) | expose NamingTransformer to sub-classes in destinations JdbcSqlGenerator. | -| 0.16.1 | 2024-01-29 | [\#34533](https://github.com/airbytehq/airbyte/pull/34533) | Add a safe method to execute DatabaseMetadata's Resultset returning queries. | -| 0.16.0 | 2024-01-26 | [\#34573](https://github.com/airbytehq/airbyte/pull/34573) | Untangle Debezium harness dependencies. | -| 0.15.2 | 2024-01-25 | [\#34441](https://github.com/airbytehq/airbyte/pull/34441) | Improve airbyte-api build performance. | -| 0.15.1 | 2024-01-25 | [\#34451](https://github.com/airbytehq/airbyte/pull/34451) | Async destinations: Better logging when we fail to parse an AirbyteMessage | -| 0.15.0 | 2024-01-23 | [\#34441](https://github.com/airbytehq/airbyte/pull/34441) | Removed connector registry and micronaut dependencies. 
| -| 0.14.2 | 2024-01-24 | [\#34458](https://github.com/airbytehq/airbyte/pull/34458) | Handle case-sensitivity in sentry error grouping | -| 0.14.1 | 2024-01-24 | [\#34468](https://github.com/airbytehq/airbyte/pull/34468) | Add wait for process to be done before ending sync in destination BaseTDTest | -| 0.14.0 | 2024-01-23 | [\#34461](https://github.com/airbytehq/airbyte/pull/34461) | Revert non backward compatible signature changes from 0.13.1 | -| 0.13.3 | 2024-01-23 | [\#34077](https://github.com/airbytehq/airbyte/pull/34077) | Denote if destinations fully support Destinations V2 | -| 0.13.2 | 2024-01-18 | [\#34364](https://github.com/airbytehq/airbyte/pull/34364) | Better logging in mongo db source connector | -| 0.13.1 | 2024-01-18 | [\#34236](https://github.com/airbytehq/airbyte/pull/34236) | Add postCreateTable hook in destination JdbcSqlGenerator | -| 0.13.0 | 2024-01-16 | [\#34177](https://github.com/airbytehq/airbyte/pull/34177) | Add `useExpensiveSafeCasting` param in JdbcSqlGenerator methods; add JdbcTypingDedupingTest fixture; other DV2-related changes | -| 0.12.1 | 2024-01-11 | [\#34186](https://github.com/airbytehq/airbyte/pull/34186) | Add hook for additional destination specific checks to JDBC destination check method | -| 0.12.0 | 2024-01-10 | [\#33875](https://github.com/airbytehq/airbyte/pull/33875) | Upgrade sshd-mina to 2.11.1 | -| 0.11.5 | 2024-01-10 | [\#34119](https://github.com/airbytehq/airbyte/pull/34119) | Remove wal2json support for postgres+debezium. | -| 0.11.4 | 2024-01-09 | [\#33305](https://github.com/airbytehq/airbyte/pull/33305) | Source stats in incremental syncs | -| 0.11.3 | 2023-01-09 | [\#33658](https://github.com/airbytehq/airbyte/pull/33658) | Always fail when debezium fails, even if it happened during the setup phase. 
| -| 0.11.2 | 2024-01-09 | [\#33969](https://github.com/airbytehq/airbyte/pull/33969) | Destination state stats implementation | -| 0.11.1 | 2024-01-04 | [\#33727](https://github.com/airbytehq/airbyte/pull/33727) | SSH bastion heartbeats for Destinations | -| 0.11.0 | 2024-01-04 | [\#33730](https://github.com/airbytehq/airbyte/pull/33730) | DV2 T+D uses Sql struct to represent transactions; other T+D-related changes | -| 0.10.4 | 2023-12-20 | [\#33071](https://github.com/airbytehq/airbyte/pull/33071) | Add the ability to parse JDBC parameters with another delimiter than '&' | -| 0.10.3 | 2024-01-03 | [\#33312](https://github.com/airbytehq/airbyte/pull/33312) | Send out count in AirbyteStateMessage | -| 0.10.1 | 2023-12-21 | [\#33723](https://github.com/airbytehq/airbyte/pull/33723) | Make memory-manager log message less scary | -| 0.10.0 | 2023-12-20 | [\#33704](https://github.com/airbytehq/airbyte/pull/33704) | JdbcDestinationHandler now properly implements `getInitialRawTableState`; reenable SqlGenerator test | -| 0.9.0 | 2023-12-18 | [\#33124](https://github.com/airbytehq/airbyte/pull/33124) | Make Schema Creation Separate from Table Creation, exclude the T&D module from the CDK | -| 0.8.0 | 2023-12-18 | [\#33506](https://github.com/airbytehq/airbyte/pull/33506) | Improve async destination shutdown logic; more JDBC async migration work; improve DAT test schema handling | -| 0.7.9 | 2023-12-18 | [\#33549](https://github.com/airbytehq/airbyte/pull/33549) | Improve MongoDB logging. | -| 0.7.8 | 2023-12-18 | [\#33365](https://github.com/airbytehq/airbyte/pull/33365) | Emit stream statuses more consistently | -| 0.7.7 | 2023-12-18 | [\#33434](https://github.com/airbytehq/airbyte/pull/33307) | Remove LEGACY state | -| 0.7.6 | 2023-12-14 | [\#32328](https://github.com/airbytehq/airbyte/pull/33307) | Add schema less mode for mongodb CDC. Fixes for non standard mongodb id type. 
| -| 0.7.4 | 2023-12-13 | [\#33232](https://github.com/airbytehq/airbyte/pull/33232) | Track stream record count during sync; only run T+D if a stream had nonzero records or the previous sync left unprocessed records. | -| 0.7.3 | 2023-12-13 | [\#33369](https://github.com/airbytehq/airbyte/pull/33369) | Extract shared JDBC T+D code. | -| 0.7.2 | 2023-12-11 | [\#33307](https://github.com/airbytehq/airbyte/pull/33307) | Fix DV2 JDBC type mappings (code changes in [\#33307](https://github.com/airbytehq/airbyte/pull/33307)). | -| 0.7.1 | 2023-12-01 | [\#33027](https://github.com/airbytehq/airbyte/pull/33027) | Add the abstract DB source debugger. | -| 0.7.0 | 2023-12-07 | [\#32326](https://github.com/airbytehq/airbyte/pull/32326) | Destinations V2 changes for JDBC destinations | -| 0.6.4 | 2023-12-06 | [\#33082](https://github.com/airbytehq/airbyte/pull/33082) | Improvements to schema snapshot error handling + schema snapshot history scope (scoped to configured DB). | -| 0.6.2 | 2023-11-30 | [\#32573](https://github.com/airbytehq/airbyte/pull/32573) | Update MSSQLConverter to enforce 6-digit microsecond precision for timestamp fields | -| 0.6.1 | 2023-11-30 | [\#32610](https://github.com/airbytehq/airbyte/pull/32610) | Support DB initial sync using binary as primary key. | -| 0.6.0 | 2023-11-30 | [\#32888](https://github.com/airbytehq/airbyte/pull/32888) | JDBC destinations now use the async framework | -| 0.5.3 | 2023-11-28 | [\#32686](https://github.com/airbytehq/airbyte/pull/32686) | Better attribution of debezium engine shutdown due to heartbeat. | -| 0.5.1 | 2023-11-27 | [\#32662](https://github.com/airbytehq/airbyte/pull/32662) | Debezium initialization wait time will now read from initial setup time. | -| 0.5.0 | 2023-11-22 | [\#32656](https://github.com/airbytehq/airbyte/pull/32656) | Introduce TestDatabase test fixture, refactor database source test base classes. 
| -| 0.4.11 | 2023-11-14 | [\#32526](https://github.com/airbytehq/airbyte/pull/32526) | Clean up memory manager logs. | -| 0.4.10 | 2023-11-13 | [\#32285](https://github.com/airbytehq/airbyte/pull/32285) | Fix UUID codec ordering for MongoDB connector | -| 0.4.9 | 2023-11-13 | [\#32468](https://github.com/airbytehq/airbyte/pull/32468) | Further error grouping improvements for DV2 connectors | -| 0.4.8 | 2023-11-09 | [\#32377](https://github.com/airbytehq/airbyte/pull/32377) | source-postgres tests: skip dropping database | -| 0.4.7 | 2023-11-08 | [\#31856](https://github.com/airbytehq/airbyte/pull/31856) | source-postgres: support for infinity date and timestamps | -| 0.4.5 | 2023-11-07 | [\#32112](https://github.com/airbytehq/airbyte/pull/32112) | Async destinations framework: Allow configuring the queue flush threshold | -| 0.4.4 | 2023-11-06 | [\#32119](https://github.com/airbytehq/airbyte/pull/32119) | Add STANDARD UUID codec to MongoDB debezium handler | -| 0.4.2 | 2023-11-06 | [\#32190](https://github.com/airbytehq/airbyte/pull/32190) | Improve error deinterpolation | -| 0.4.1 | 2023-11-02 | [\#32192](https://github.com/airbytehq/airbyte/pull/32192) | Add 's3-destinations' CDK module. | -| 0.4.0 | 2023-11-02 | [\#32050](https://github.com/airbytehq/airbyte/pull/32050) | Fix compiler warnings. | -| 0.3.0 | 2023-11-02 | [\#31983](https://github.com/airbytehq/airbyte/pull/31983) | Add deinterpolation feature to AirbyteExceptionHandler. | -| 0.2.4 | 2023-10-31 | [\#31807](https://github.com/airbytehq/airbyte/pull/31807) | Handle case of debezium update and delete of records in mongodb. | -| 0.2.3 | 2023-10-31 | [\#32022](https://github.com/airbytehq/airbyte/pull/32022) | Update Debezium version from 2.20 -> 2.4.0. | -| 0.2.2 | 2023-10-31 | [\#31976](https://github.com/airbytehq/airbyte/pull/31976) | Debezium tweaks to make tests run faster. 
| -| 0.2.0 | 2023-10-30 | [\#31960](https://github.com/airbytehq/airbyte/pull/31960) | Hoist top-level gradle subprojects into CDK. | -| 0.1.12 | 2023-10-24 | [\#31674](https://github.com/airbytehq/airbyte/pull/31674) | Fail sync when Debezium does not shut down properly. | -| 0.1.11 | 2023-10-18 | [\#31486](https://github.com/airbytehq/airbyte/pull/31486) | Update constants in AdaptiveSourceRunner. | -| 0.1.9 | 2023-10-12 | [\#31309](https://github.com/airbytehq/airbyte/pull/31309) | Use toPlainString() when handling BigDecimals in PostgresConverter | -| 0.1.8 | 2023-10-11 | [\#31322](https://github.com/airbytehq/airbyte/pull/31322) | Cap log line length to 32KB to prevent loss of records | -| 0.1.7 | 2023-10-10 | [\#31194](https://github.com/airbytehq/airbyte/pull/31194) | Deallocate unused per stream buffer memory when empty | -| 0.1.6 | 2023-10-10 | [\#31083](https://github.com/airbytehq/airbyte/pull/31083) | Fix precision of numeric values in async destinations | -| 0.1.5 | 2023-10-09 | [\#31196](https://github.com/airbytehq/airbyte/pull/31196) | Update typo in CDK (CDN_LSN -> CDC_LSN) | -| 0.1.4 | 2023-10-06 | [\#31139](https://github.com/airbytehq/airbyte/pull/31139) | Reduce async buffer | -| 0.1.1 | 2023-09-28 | [\#30835](https://github.com/airbytehq/airbyte/pull/30835) | JDBC destinations now avoid staging area name collisions by using the raw table name as the stage name. (previously we used the stream name as the stage name) | -| 0.1.0 | 2023-09-27 | [\#30445](https://github.com/airbytehq/airbyte/pull/30445) | First launch, including shared classes for all connectors. | -| 0.0.2 | 2023-08-21 | [\#28687](https://github.com/airbytehq/airbyte/pull/28687) | Version bump only (no other changes). | -| 0.0.1 | 2023-08-08 | [\#28687](https://github.com/airbytehq/airbyte/pull/28687) | Initial release for testing. 
| +| 0.43.4 | 2024-07-28 | [\#42839](https://github.com/airbytehq/airbyte/pull/42839) | Fix error translation framework to not rethrow ConfigErrorException and TransientErrorException. | +| 0.43.3 | 2024-07-22 | [\#42417](https://github.com/airbytehq/airbyte/pull/42417) | Handle null exception message in ConnectorExceptionHandler. | +| 0.43.2 | 2024-07-22 | [\#42431](https://github.com/airbytehq/airbyte/pull/42431) | Filter out debezium message change events | +| 0.43.1 | 2024-07-22 | [\#41622](https://github.com/airbytehq/airbyte/pull/41622) | Fix null safety bug in debezium event processing | +| 0.43.0 | 2024-07-17 | [\#41954](https://github.com/airbytehq/airbyte/pull/41954) | fix refreshes for connectors using the old SqlOperations | +| 0.43.0 | 2024-07-17 | [\#42017](https://github.com/airbytehq/airbyte/pull/42017) | bump postgres-jdbc version | +| 0.43.0 | 2024-07-17 | [\#42015](https://github.com/airbytehq/airbyte/pull/42015) | wait until migration before creating the Writeconfig objects | +| 0.43.0 | 2024-07-17 | [\#41953](https://github.com/airbytehq/airbyte/pull/41953) | add generationId and syncId to SqlOperations functions | +| 0.43.0 | 2024-07-17 | [\#41952](https://github.com/airbytehq/airbyte/pull/41952) | rename and add fields in WriteConfig | +| 0.43.0 | 2024-07-17 | [\#41951](https://github.com/airbytehq/airbyte/pull/41951) | remove nullables in JdbcBufferedConsumerFactory | +| 0.43.0 | 2024-07-17 | [\#41950](https://github.com/airbytehq/airbyte/pull/41950) | remove unused classes | +| 0.42.2 | 2024-07-21 | [\#42122](https://github.com/airbytehq/airbyte/pull/42122) | Support for Debezium resync and shutdown scenarios. | +| 0.42.2 | 2024-07-04 | [\#40208](https://github.com/airbytehq/airbyte/pull/40208) | Implement a new connector error handling and translation framework | +| 0.41.8 | 2024-07-18 | [\#42068](https://github.com/airbytehq/airbyte/pull/42068) | Add analytics message for WASS occurrence. 
| +| 0.41.7 | 2024-07-17 | [\#42055](https://github.com/airbytehq/airbyte/pull/42055) | Add debezium heartbeat timeout back to shutdown debezium. | +| 0.41.6 | 2024-07-17 | [\#41996](https://github.com/airbytehq/airbyte/pull/41996) | Fix java interop compilation issue in Config/TransientErrorException. | +| 0.41.5 | 2024-07-16 | [\#42011](https://github.com/airbytehq/airbyte/pull/42011) | Async consumer accepts null default namespace | +| 0.41.4 | 2024-07-15 | [\#41959](https://github.com/airbytehq/airbyte/pull/41959) | Allow setting `internal_message` in Config/TransientErrorException. Destinations: shorten error message for INCOMPLETE stream status. | +| 0.41.3 | 2024-07-15 | [\#41680](https://github.com/airbytehq/airbyte/pull/41680) | Fix: CompletableFutures.allOf now handles empty list and `Throwable` | +| 0.41.2 | 2024-07-12 | [\#40567](https://github.com/airbytehq/airbyte/pull/40567) | Fix BaseSqlGenerator test case (generation_id support); update minimum platform version for refreshes support. | +| 0.41.1 | 2024-07-11 | [\#41212](https://github.com/airbytehq/airbyte/pull/41212) | Improve debezium logging.
| +| 0.41.0 | 2024-07-11 | [\#38240](https://github.com/airbytehq/airbyte/pull/38240) | Sources : Changes in CDC interfaces to support WASS algorithm | +| 0.40.11 | 2024-07-08 | [\#41041](https://github.com/airbytehq/airbyte/pull/41041) | Destinations: Fix truncate refreshes incorrectly discarding data if successful attempt had 0 records | +| 0.40.10 | 2024-07-05 | [\#40719](https://github.com/airbytehq/airbyte/pull/40719) | Update test to reflect isResumable field in catalog | +| 0.40.9 | 2024-07-01 | [\#39473](https://github.com/airbytehq/airbyte/pull/39473) | minor changes around error logging and testing | +| 0.40.8 | 2024-07-01 | [\#40499](https://github.com/airbytehq/airbyte/pull/40499) | Make JdbcDatabase SQL statement logging optional; add generation_id support to JdbcSqlGenerator | +| 0.40.7 | 2024-07-01 | [\#40516](https://github.com/airbytehq/airbyte/pull/40516) | Remove dbz heartbeat. | +| ~~0.40.6~~ | | | (this version does not exist) | +| 0.40.5 | 2024-06-26 | [\#40517](https://github.com/airbytehq/airbyte/pull/40517) | JdbcDatabase.executeWithinTransaction allows disabling SQL statement logging | +| 0.40.4 | 2024-06-18 | [\#40254](https://github.com/airbytehq/airbyte/pull/40254) | Destinations: Do not throw on unrecognized airbyte message type (ignore message instead) | +| 0.40.3 | 2024-06-18 | [\#39526](https://github.com/airbytehq/airbyte/pull/39526) | Destinations: INCOMPLETE stream status is a TRANSIENT error rather than SYSTEM | +| 0.40.2 | 2024-06-18 | [\#39552](https://github.com/airbytehq/airbyte/pull/39552) | Destinations: Throw error if the ConfiguredCatalog has no streams | +| 0.40.1 | 2024-06-14 | [\#39349](https://github.com/airbytehq/airbyte/pull/39349) | Source stats for full refresh streams | +| 0.40.0 | 2024-06-17 | [\#38622](https://github.com/airbytehq/airbyte/pull/38622) | Destinations: Implement refreshes logic in AbstractStreamOperation | +| 0.39.0 | 2024-06-17 | [\#38067](https://github.com/airbytehq/airbyte/pull/38067) |
Destinations: Breaking changes for refreshes (fail on INCOMPLETE stream status; ignore OVERWRITE sync mode) | +| 0.38.3 | 2024-06-25 | [\#40499](https://github.com/airbytehq/airbyte/pull/40499) | (backport) Make JdbcDatabase SQL statement logging optional; add generation_id support to JdbcSqlGenerator | +| 0.38.2 | 2024-06-14 | [\#39460](https://github.com/airbytehq/airbyte/pull/39460) | Bump postgres JDBC driver version | +| 0.38.1 | 2024-06-13 | [\#39445](https://github.com/airbytehq/airbyte/pull/39445) | Sources: More CDK changes to handle big initial snapshots. | +| 0.38.0 | 2024-06-11 | [\#39405](https://github.com/airbytehq/airbyte/pull/39405) | Sources: Debezium properties manager interface changed to accept a list of streams to scope to | +| 0.37.1 | 2024-06-10 | [\#38075](https://github.com/airbytehq/airbyte/pull/38075) | Destinations: Track stream statuses in async framework | +| 0.37.0 | 2024-06-10 | [\#38121](https://github.com/airbytehq/airbyte/pull/38121) | Destinations: Set default namespace via CatalogParser | +| 0.36.8 | 2024-06-07 | [\#38763](https://github.com/airbytehq/airbyte/pull/38763) | Increase Jackson message length limit | +| 0.36.7 | 2024-06-06 | [\#39220](https://github.com/airbytehq/airbyte/pull/39220) | Handle null messages in ConnectorExceptionUtil | +| 0.36.6 | 2024-06-05 | [\#39106](https://github.com/airbytehq/airbyte/pull/39106) | Skip write to storage with 0 byte file | +| 0.36.5 | 2024-06-01 | [\#38792](https://github.com/airbytehq/airbyte/pull/38792) | Throw config exception if no selectable table exists in user provided schemas | +| 0.36.4 | 2024-05-31 | [\#38824](https://github.com/airbytehq/airbyte/pull/38824) | Param marked as non-null to nullable in JdbcDestinationHandler for NPE fix | +| 0.36.2 | 2024-05-29 | [\#38538](https://github.com/airbytehq/airbyte/pull/38357) | Exit connector when encountering a config error. 
| +| 0.36.0 | 2024-05-29 | [\#38358](https://github.com/airbytehq/airbyte/pull/38358) | Plumb generation_id / sync_id to destinations code | +| 0.35.16 | 2024-06-25 | [\#40517](https://github.com/airbytehq/airbyte/pull/40517) | (backport) JdbcDatabase.executeWithinTransaction allows disabling SQL statement logging | +| 0.35.15 | 2024-05-31 | [\#38824](https://github.com/airbytehq/airbyte/pull/38824) | Param marked as non-null to nullable in JdbcDestinationHandler for NPE fix | +| 0.35.14 | 2024-05-28 | [\#38738](https://github.com/airbytehq/airbyte/pull/38738) | make ThreadCreationInfo cast as nullable | +| 0.35.13 | 2024-05-28 | [\#38632](https://github.com/airbytehq/airbyte/pull/38632) | minor changes to allow conversion of snowflake tests to kotlin | +| 0.35.12 | 2024-05-23 | [\#38638](https://github.com/airbytehq/airbyte/pull/38638) | Minor change to support Snowflake conversion to Kotlin | +| 0.35.11 | 2024-05-23 | [\#38357](https://github.com/airbytehq/airbyte/pull/38357) | This release fixes an error on the previous release. | +| 0.35.10 | 2024-05-23 | [\#38357](https://github.com/airbytehq/airbyte/pull/38357) | Add shared code for db sources stream status trace messages and testing. 
| +| 0.35.9 | 2024-05-23 | [\#38586](https://github.com/airbytehq/airbyte/pull/38586) | code cleanup | +| 0.35.9 | 2024-05-23 | [\#37583](https://github.com/airbytehq/airbyte/pull/37583) | code cleanup | +| 0.35.9 | 2024-05-23 | [\#37555](https://github.com/airbytehq/airbyte/pull/37555) | code cleanup | +| 0.35.9 | 2024-05-23 | [\#37540](https://github.com/airbytehq/airbyte/pull/37540) | code cleanup | +| 0.35.9 | 2024-05-23 | [\#37539](https://github.com/airbytehq/airbyte/pull/37539) | code cleanup | +| 0.35.9 | 2024-05-23 | [\#37538](https://github.com/airbytehq/airbyte/pull/37538) | code cleanup | +| 0.35.9 | 2024-05-23 | [\#37537](https://github.com/airbytehq/airbyte/pull/37537) | code cleanup | +| 0.35.9 | 2024-05-23 | [\#37518](https://github.com/airbytehq/airbyte/pull/37518) | code cleanup | +| 0.35.8 | 2024-05-22 | [\#38572](https://github.com/airbytehq/airbyte/pull/38572) | Add a temporary static method to decouple SnowflakeDestination from AbstractJdbcDestination | +| 0.35.7 | 2024-05-20 | [\#38357](https://github.com/airbytehq/airbyte/pull/38357) | Decouple create namespace from per stream operation interface. 
| +| 0.35.6 | 2024-05-17 | [\#38107](https://github.com/airbytehq/airbyte/pull/38107) | New interfaces for Destination connectors to plug into AsyncStreamConsumer | +| 0.35.5 | 2024-05-17 | [\#38204](https://github.com/airbytehq/airbyte/pull/38204) | add assume-role authentication to s3 | +| 0.35.2 | 2024-05-13 | [\#38104](https://github.com/airbytehq/airbyte/pull/38104) | Handle transient error messages | +| 0.35.0 | 2024-05-13 | [\#38127](https://github.com/airbytehq/airbyte/pull/38127) | Destinations: Populate generation/sync ID on StreamConfig | +| 0.34.4 | 2024-05-10 | [\#37712](https://github.com/airbytehq/airbyte/pull/37712) | make sure the exceptionHandler always terminates | +| 0.34.3 | 2024-05-10 | [\#38095](https://github.com/airbytehq/airbyte/pull/38095) | Minor changes for databricks connector | +| 0.34.1 | 2024-05-07 | [\#38030](https://github.com/airbytehq/airbyte/pull/38030) | Add support for transient errors | +| 0.34.0 | 2024-05-01 | [\#37712](https://github.com/airbytehq/airbyte/pull/37712) | Destinations: Remove incremental T+D | +| 0.33.2 | 2024-05-03 | [\#37824](https://github.com/airbytehq/airbyte/pull/37824) | improve source acceptance tests | +| 0.33.1 | 2024-05-03 | [\#37824](https://github.com/airbytehq/airbyte/pull/37824) | Add a unit test for cursor based sync | +| 0.33.0 | 2024-05-03 | [\#36935](https://github.com/airbytehq/airbyte/pull/36935) | Destinations: Enable non-safe-casting DV2 tests | +| 0.32.0 | 2024-05-03 | [\#36929](https://github.com/airbytehq/airbyte/pull/36929) | Destinations: Assorted DV2 changes for mysql | +| 0.31.7 | 2024-05-02 | [\#36910](https://github.com/airbytehq/airbyte/pull/36910) | changes for destination-snowflake | +| 0.31.6 | 2024-05-02 | [\#37746](https://github.com/airbytehq/airbyte/pull/37746) | debuggability improvements. 
| +| 0.31.5 | 2024-04-30 | [\#37758](https://github.com/airbytehq/airbyte/pull/37758) | Set debezium max retries to zero | +| 0.31.4 | 2024-04-30 | [\#37754](https://github.com/airbytehq/airbyte/pull/37754) | Add DebeziumEngine notification log | +| 0.31.3 | 2024-04-30 | [\#37726](https://github.com/airbytehq/airbyte/pull/37726) | Remove debezium retries | +| 0.31.2 | 2024-04-30 | [\#37507](https://github.com/airbytehq/airbyte/pull/37507) | Better error messages when switching between global/per-stream modes. | +| 0.31.0 | 2024-04-26 | [\#37584](https://github.com/airbytehq/airbyte/pull/37584) | Update S3 destination deps to exclude zookeeper and hadoop-yarn-common | +| 0.30.11 | 2024-04-25 | [\#36899](https://github.com/airbytehq/airbyte/pull/36899) | changes for bigQuery destination. | +| 0.30.10 | 2024-04-24 | [\#37541](https://github.com/airbytehq/airbyte/pull/37541) | remove excessive logging | +| 0.30.9 | 2024-04-24 | [\#37477](https://github.com/airbytehq/airbyte/pull/37477) | remove unnecessary logs | +| 0.30.7 | 2024-04-23 | [\#37477](https://github.com/airbytehq/airbyte/pull/37477) | fix kotlin warnings in core CDK submodule | +| 0.30.7 | 2024-04-23 | [\#37484](https://github.com/airbytehq/airbyte/pull/37484) | fix kotlin warnings in dependencies CDK submodule | +| 0.30.7 | 2024-04-23 | [\#37479](https://github.com/airbytehq/airbyte/pull/37479) | fix kotlin warnings in azure-destination, datastore-{bigquery,mongo,postgres} CDK submodules | +| 0.30.7 | 2024-04-23 | [\#37481](https://github.com/airbytehq/airbyte/pull/37481) | fix kotlin warnings in destination CDK submodules | +| 0.30.7 | 2024-04-23 | [\#37482](https://github.com/airbytehq/airbyte/pull/37482) | fix kotlin warnings in db-sources CDK submodule | +| 0.30.6 | 2024-04-19 | [\#37442](https://github.com/airbytehq/airbyte/pull/37442) | Destinations: Rename File format related classes to be agnostic of S3 | +| 0.30.3 | 2024-04-12 | [\#37106](https://github.com/airbytehq/airbyte/pull/37106) | 
Destinations: Simplify constructors in `AsyncStreamConsumer` | +| 0.30.2 | 2024-04-12 | [\#36926](https://github.com/airbytehq/airbyte/pull/36926) | Destinations: Remove `JdbcSqlOperations#formatData`; misc changes for java interop | +| 0.30.1 | 2024-04-11 | [\#36919](https://github.com/airbytehq/airbyte/pull/36919) | Fix regression in sources conversion of null values | +| 0.30.0 | 2024-04-11 | [\#36974](https://github.com/airbytehq/airbyte/pull/36974) | Destinations: Pass config to jdbc sqlgenerator; allow cascade drop | +| 0.29.13 | 2024-04-10 | [\#36981](https://github.com/airbytehq/airbyte/pull/36981) | DB sources : Emit analytics for data type serialization errors. | +| 0.29.12 | 2024-04-10 | [\#36973](https://github.com/airbytehq/airbyte/pull/36973) | Destinations: Make flush batch size configurable for JdbcInsertFlush | +| 0.29.11 | 2024-04-10 | [\#36865](https://github.com/airbytehq/airbyte/pull/36865) | Sources : Remove noisy log line. | +| 0.29.10 | 2024-04-10 | [\#36805](https://github.com/airbytehq/airbyte/pull/36805) | Destinations: Enhance CatalogParser name collision handling; add DV2 tests for long identifiers | +| 0.29.9 | 2024-04-09 | [\#36047](https://github.com/airbytehq/airbyte/pull/36047) | Destinations: CDK updates for raw-only destinations | +| 0.29.8 | 2024-04-08 | [\#36868](https://github.com/airbytehq/airbyte/pull/36868) | Destinations: s3-destinations Compilation fixes for connector | +| 0.29.7 | 2024-04-08 | [\#36768](https://github.com/airbytehq/airbyte/pull/36768) | Destinations: Make destination state fetch/commit logic more resilient to errors | +| 0.29.6 | 2024-04-05 | [\#36577](https://github.com/airbytehq/airbyte/pull/36577) | Do not send system_error trace message for config exceptions. 
| +| 0.29.5 | 2024-04-05 | [\#36620](https://github.com/airbytehq/airbyte/pull/36620) | Missed changes - open for extension for destination-postgres | +| 0.29.3 | 2024-04-04 | [\#36759](https://github.com/airbytehq/airbyte/pull/36759) | Minor fixes. | +| 0.29.3 | 2024-04-04 | [\#36706](https://github.com/airbytehq/airbyte/pull/36706) | Enabling spotbugs for s3-destination. | +| 0.29.3 | 2024-04-03 | [\#36705](https://github.com/airbytehq/airbyte/pull/36705) | Enabling spotbugs for db-sources. | +| 0.29.3 | 2024-04-03 | [\#36704](https://github.com/airbytehq/airbyte/pull/36704) | Enabling spotbugs for datastore-postgres. | +| 0.29.3 | 2024-04-03 | [\#36703](https://github.com/airbytehq/airbyte/pull/36703) | Enabling spotbugs for gcs-destination. | +| 0.29.3 | 2024-04-03 | [\#36702](https://github.com/airbytehq/airbyte/pull/36702) | Enabling spotbugs for db-destinations. | +| 0.29.3 | 2024-04-03 | [\#36701](https://github.com/airbytehq/airbyte/pull/36701) | Enabling spotbugs for typing_and_deduping. | +| 0.29.3 | 2024-04-03 | [\#36612](https://github.com/airbytehq/airbyte/pull/36612) | Enabling spotbugs for dependencies. | +| 0.29.5 | 2024-04-05 | [\#36577](https://github.com/airbytehq/airbyte/pull/36577) | Do not send system_error trace message for config exceptions. | +| 0.29.3 | 2024-04-04 | [\#36759](https://github.com/airbytehq/airbyte/pull/36759) | Minor fixes. | +| 0.29.3 | 2024-04-04 | [\#36706](https://github.com/airbytehq/airbyte/pull/36706) | Enabling spotbugs for s3-destination. | +| 0.29.3 | 2024-04-03 | [\#36705](https://github.com/airbytehq/airbyte/pull/36705) | Enabling spotbugs for db-sources. | +| 0.29.3 | 2024-04-03 | [\#36704](https://github.com/airbytehq/airbyte/pull/36704) | Enabling spotbugs for datastore-postgres. | +| 0.29.3 | 2024-04-03 | [\#36703](https://github.com/airbytehq/airbyte/pull/36703) | Enabling spotbugs for gcs-destination. 
| +| 0.29.3 | 2024-04-03 | [\#36702](https://github.com/airbytehq/airbyte/pull/36702) | Enabling spotbugs for db-destinations. | +| 0.29.3 | 2024-04-03 | [\#36701](https://github.com/airbytehq/airbyte/pull/36701) | Enabling spotbugs for typing_and_deduping. | +| 0.29.3 | 2024-04-03 | [\#36612](https://github.com/airbytehq/airbyte/pull/36612) | Enabling spotbugs for dependencies. | +| 0.29.2 | 2024-04-04 | [\#36845](https://github.com/airbytehq/airbyte/pull/36772) | Changes to make source-mongo compileable | +| 0.29.1 | 2024-04-03 | [\#36772](https://github.com/airbytehq/airbyte/pull/36772) | Changes to make source-mssql compileable | +| 0.29.0 | 2024-04-02 | [\#36759](https://github.com/airbytehq/airbyte/pull/36759) | Build artifact publication changes and fixes. | +| 0.28.21 | 2024-04-02 | [\#36673](https://github.com/airbytehq/airbyte/pull/36673) | Change the destination message parsing to use standard java/kotlin classes. Adds logging to catch empty lines. | +| 0.28.20 | 2024-04-01 | [\#36584](https://github.com/airbytehq/airbyte/pull/36584) | Changes to make source-postgres compileable | +| 0.28.19 | 2024-03-29 | [\#36619](https://github.com/airbytehq/airbyte/pull/36619) | Changes to make destination-postgres compileable | +| 0.28.19 | 2024-03-29 | [\#36588](https://github.com/airbytehq/airbyte/pull/36588) | Changes to make destination-redshift compileable | +| 0.28.19 | 2024-03-29 | [\#36610](https://github.com/airbytehq/airbyte/pull/36610) | remove airbyte-api generation, pull dependency jars instead | +| 0.28.19 | 2024-03-29 | [\#36611](https://github.com/airbytehq/airbyte/pull/36611) | disable spotbugs for CDK test and testFixtures tasks | +| 0.28.18 | 2024-03-28 | [\#36606](https://github.com/airbytehq/airbyte/pull/36574) | disable spotbugs for CDK test and testFixtures tasks | +| 0.28.18 | 2024-03-28 | [\#36574](https://github.com/airbytehq/airbyte/pull/36574) | Fix ContainerFactory | +| 0.28.18 | 2024-03-27 |
[\#36570](https://github.com/airbytehq/airbyte/pull/36570) | Convert missing s3-destinations tests to Kotlin | +| 0.28.18 | 2024-03-27 | [\#36446](https://github.com/airbytehq/airbyte/pull/36446) | Convert dependencies submodule to Kotlin | +| 0.28.18 | 2024-03-27 | [\#36445](https://github.com/airbytehq/airbyte/pull/36445) | Convert functional out Checked interfaces to Kotlin | +| 0.28.18 | 2024-03-27 | [\#36444](https://github.com/airbytehq/airbyte/pull/36444) | Use apache-commons classes in our Checked functional interfaces | +| 0.28.18 | 2024-03-27 | [\#36467](https://github.com/airbytehq/airbyte/pull/36467) | Convert #36465 to Kotlin | +| 0.28.18 | 2024-03-27 | [\#36473](https://github.com/airbytehq/airbyte/pull/36473) | Convert #36396 to Kotlin | +| 0.28.18 | 2024-03-27 | [\#36439](https://github.com/airbytehq/airbyte/pull/36439) | Convert db-destinations submodule to Kotlin | +| 0.28.18 | 2024-03-27 | [\#36438](https://github.com/airbytehq/airbyte/pull/36438) | Convert db-sources submodule to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36437](https://github.com/airbytehq/airbyte/pull/36437) | Convert gcs submodule to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36421](https://github.com/airbytehq/airbyte/pull/36421) | Convert typing-deduping submodule to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36420](https://github.com/airbytehq/airbyte/pull/36420) | Convert s3-destinations submodule to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36419](https://github.com/airbytehq/airbyte/pull/36419) | Convert azure submodule to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36413](https://github.com/airbytehq/airbyte/pull/36413) | Convert postgres submodule to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36412](https://github.com/airbytehq/airbyte/pull/36412) | Convert mongodb submodule to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36411](https://github.com/airbytehq/airbyte/pull/36411) | Convert datastore-bigquery submodule to Kotlin | +| 0.28.18 | 2024-03-26 |
[\#36205](https://github.com/airbytehq/airbyte/pull/36205) | Convert core/main to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36204](https://github.com/airbytehq/airbyte/pull/36204) | Convert core/test to Kotlin | +| 0.28.18 | 2024-03-26 | [\#36190](https://github.com/airbytehq/airbyte/pull/36190) | Convert core/testFixtures to Kotlin | +| 0.28.0 | 2024-03-26 | [\#36514](https://github.com/airbytehq/airbyte/pull/36514) | Bump CDK version to 0.28.0 | +| 0.27.7 | 2024-03-26 | [\#36466](https://github.com/airbytehq/airbyte/pull/36466) | Destinations: fix support for case-sensitive fields in destination state. | +| 0.27.6 | 2024-03-26 | [\#36432](https://github.com/airbytehq/airbyte/pull/36432) | Sources support for AirbyteRecordMessageMeta during reading source data types. | +| 0.27.5 | 2024-03-25 | [\#36461](https://github.com/airbytehq/airbyte/pull/36461) | Destinations: Handle case-sensitive columns in destination state handling. | +| 0.27.4 | 2024-03-25 | [\#36333](https://github.com/airbytehq/airbyte/pull/36333) | Sunset DebeziumSourceDecoratingIterator. | +| 0.27.1 | 2024-03-22 | [\#36296](https://github.com/airbytehq/airbyte/pull/36296) | Destinations: (async framework) Do not log invalid message data. | +| 0.27.0 | 2024-03-21 | [\#36364](https://github.com/airbytehq/airbyte/pull/36364) | Sources: Increase debezium initial record wait time to 40 minutes. | +| 0.26.1 | 2024-03-19 | [\#35599](https://github.com/airbytehq/airbyte/pull/35599) | Sunset SourceDecoratingIterator. | +| 0.26.0 | 2024-03-19 | [\#36263](https://github.com/airbytehq/airbyte/pull/36263) | Improve conversion of debezium Date type for some edge cases in mssql.
| +| 0.25.0 | 2024-03-18 | [\#36203](https://github.com/airbytehq/airbyte/pull/36203) | Wiring of Transformer to StagingConsumerFactory and JdbcBufferedConsumerFactory; import changes for Kotlin conversion; State message logs to debug | +| 0.24.1 | 2024-03-13 | [\#36022](https://github.com/airbytehq/airbyte/pull/36022) | Move log4j2-test.xml to test fixtures, away from runtime classpath. | +| 0.24.0 | 2024-03-13 | [\#35944](https://github.com/airbytehq/airbyte/pull/35944) | Add `_airbyte_meta` in raw table and test fixture updates | +| 0.23.20 | 2024-03-12 | [\#36011](https://github.com/airbytehq/airbyte/pull/36011) | Debezium configuration for conversion of null value on a column with default value. | +| 0.23.19 | 2024-03-11 | [\#35904](https://github.com/airbytehq/airbyte/pull/35904) | Add retries to the debezium engine. | +| 0.23.18 | 2024-03-07 | [\#35899](https://github.com/airbytehq/airbyte/pull/35899) | Null check when retrieving destination state | +| 0.23.16 | 2024-03-06 | [\#35842](https://github.com/airbytehq/airbyte/pull/35842) | Improve logging in debezium processing. | +| 0.23.15 | 2024-03-05 | [\#35827](https://github.com/airbytehq/airbyte/pull/35827) | improving the Junit interceptor. | +| 0.23.14 | 2024-03-05 | [\#35739](https://github.com/airbytehq/airbyte/pull/35739) | Add logging to the CDC queue size. Fix the ContainerFactory. | +| 0.23.13 | 2024-03-04 | [\#35774](https://github.com/airbytehq/airbyte/pull/35774) | minor changes to the CDK test fixtures. | +| 0.23.12 | 2024-03-01 | [\#35767](https://github.com/airbytehq/airbyte/pull/35767) | introducing a timeout for java tests. 
| +| 0.23.11 | 2024-03-01 | [\#35313](https://github.com/airbytehq/airbyte/pull/35313) | Preserve timezone offset in CSV writer for destinations | +| 0.23.10 | 2024-03-01 | [\#35303](https://github.com/airbytehq/airbyte/pull/35303) | Migration framework with DestinationState for softReset | +| 0.23.9 | 2024-02-29 | [\#35720](https://github.com/airbytehq/airbyte/pull/35720) | various improvements for tests TestDataHolder | +| 0.23.8 | 2024-02-28 | [\#35529](https://github.com/airbytehq/airbyte/pull/35529) | Refactor on state iterators | +| 0.23.7 | 2024-02-28 | [\#35376](https://github.com/airbytehq/airbyte/pull/35376) | Extract typerdeduper migrations to separate method | +| 0.23.6 | 2024-02-26 | [\#35647](https://github.com/airbytehq/airbyte/pull/35647) | Add a getNamespace into TestDataHolder | +| 0.23.5 | 2024-02-26 | [\#35512](https://github.com/airbytehq/airbyte/pull/35512) | Remove @DisplayName from all CDK tests. | +| 0.23.4 | 2024-02-26 | [\#35507](https://github.com/airbytehq/airbyte/pull/35507) | Add more logs into TestDatabase. | +| 0.23.3 | 2024-02-26 | [\#35495](https://github.com/airbytehq/airbyte/pull/35495) | Fix Junit Interceptor to print better stacktraces | +| 0.23.2 | 2024-02-22 | [\#35385](https://github.com/airbytehq/airbyte/pull/35342) | Bugfix: inverted logic of disableTypeDedupe flag | +| 0.23.1 | 2024-02-22 | [\#35527](https://github.com/airbytehq/airbyte/pull/35527) | reduce shutdown timeouts | +| 0.23.0 | 2024-02-22 | [\#35342](https://github.com/airbytehq/airbyte/pull/35342) | Consolidate and perform upfront gathering of DB metadata state | +| 0.21.4 | 2024-02-21 | [\#35511](https://github.com/airbytehq/airbyte/pull/35511) | Reduce CDC state compression limit to 1MB | +| 0.21.3 | 2024-02-20 | [\#35394](https://github.com/airbytehq/airbyte/pull/35394) | Add Junit progress information to the test logs | +| 0.21.2 | 2024-02-20 | [\#34978](https://github.com/airbytehq/airbyte/pull/34978) | Reduce log noise in NormalizationLogParser.
| +| 0.21.1 | 2024-02-20 | [\#35199](https://github.com/airbytehq/airbyte/pull/35199) | Add thread names to the logs. | +| 0.21.0 | 2024-02-16 | [\#35314](https://github.com/airbytehq/airbyte/pull/35314) | Delete S3StreamCopier classes. These have been superseded by the async destinations framework. | +| 0.20.9 | 2024-02-15 | [\#35240](https://github.com/airbytehq/airbyte/pull/35240) | Make state emission to platform inside state manager itself. | +| 0.20.8 | 2024-02-15 | [\#35285](https://github.com/airbytehq/airbyte/pull/35285) | Improve blobstore module structure. | +| 0.20.7 | 2024-02-13 | [\#35236](https://github.com/airbytehq/airbyte/pull/35236) | output logs to files in addition to stdout when running tests | +| 0.20.6 | 2024-02-12 | [\#35036](https://github.com/airbytehq/airbyte/pull/35036) | Add trace utility to emit analytics messages. | +| 0.20.5 | 2024-02-13 | [\#34869](https://github.com/airbytehq/airbyte/pull/34869) | Don't emit final state in SourceStateIterator if there is an underlying stream failure. | +| 0.20.4 | 2024-02-12 | [\#35042](https://github.com/airbytehq/airbyte/pull/35042) | Use delegate's isDestinationV2 invocation in SshWrappedDestination. | +| 0.20.3 | 2024-02-09 | [\#34580](https://github.com/airbytehq/airbyte/pull/34580) | Support special chars in mysql/mssql database name. | +| 0.20.2 | 2024-02-12 | [\#35111](https://github.com/airbytehq/airbyte/pull/35144) | Make state emission from async framework synchronized. | +| 0.20.1 | 2024-02-11 | [\#35111](https://github.com/airbytehq/airbyte/pull/35111) | Fix GlobalAsyncStateManager stats counting logic. | +| 0.20.0 | 2024-02-09 | [\#34562](https://github.com/airbytehq/airbyte/pull/34562) | Add new test cases to BaseTypingDedupingTest to exercise special characters. | +| 0.19.0 | 2024-02-01 | [\#34745](https://github.com/airbytehq/airbyte/pull/34745) | Reorganize CDK module structure.
| +| 0.18.0 | 2024-02-08 | [\#33606](https://github.com/airbytehq/airbyte/pull/33606) | Add updated Initial and Incremental Stream State definitions for DB Sources. | +| 0.17.1 | 2024-02-08 | [\#35027](https://github.com/airbytehq/airbyte/pull/35027) | Make state handling thread safe in async destination framework. | +| 0.17.0 | 2024-02-08 | [\#34502](https://github.com/airbytehq/airbyte/pull/34502) | Enable configuring async destination batch size. | +| 0.16.6 | 2024-02-07 | [\#34892](https://github.com/airbytehq/airbyte/pull/34892) | Improved testcontainers logging and support for unshared containers. | +| 0.16.5 | 2024-02-07 | [\#34948](https://github.com/airbytehq/airbyte/pull/34948) | Fix source state stats counting logic | +| 0.16.4 | 2024-02-01 | [\#34727](https://github.com/airbytehq/airbyte/pull/34727) | Add future based stdout consumer in BaseTypingDedupingTest | +| 0.16.3 | 2024-01-30 | [\#34669](https://github.com/airbytehq/airbyte/pull/34669) | Fix org.apache.logging.log4j:log4j-slf4j-impl version conflicts. | +| 0.16.2 | 2024-01-29 | [\#34630](https://github.com/airbytehq/airbyte/pull/34630) | expose NamingTransformer to sub-classes in destinations JdbcSqlGenerator. | +| 0.16.1 | 2024-01-29 | [\#34533](https://github.com/airbytehq/airbyte/pull/34533) | Add a safe method to execute DatabaseMetadata's Resultset returning queries. | +| 0.16.0 | 2024-01-26 | [\#34573](https://github.com/airbytehq/airbyte/pull/34573) | Untangle Debezium harness dependencies. | +| 0.15.2 | 2024-01-25 | [\#34441](https://github.com/airbytehq/airbyte/pull/34441) | Improve airbyte-api build performance. | +| 0.15.1 | 2024-01-25 | [\#34451](https://github.com/airbytehq/airbyte/pull/34451) | Async destinations: Better logging when we fail to parse an AirbyteMessage | +| 0.15.0 | 2024-01-23 | [\#34441](https://github.com/airbytehq/airbyte/pull/34441) | Removed connector registry and micronaut dependencies. 
| +| 0.14.2 | 2024-01-24 | [\#34458](https://github.com/airbytehq/airbyte/pull/34458) | Handle case-sensitivity in sentry error grouping | +| 0.14.1 | 2024-01-24 | [\#34468](https://github.com/airbytehq/airbyte/pull/34468) | Add wait for process to be done before ending sync in destination BaseTDTest | +| 0.14.0 | 2024-01-23 | [\#34461](https://github.com/airbytehq/airbyte/pull/34461) | Revert non backward compatible signature changes from 0.13.1 | +| 0.13.3 | 2024-01-23 | [\#34077](https://github.com/airbytehq/airbyte/pull/34077) | Denote if destinations fully support Destinations V2 | +| 0.13.2 | 2024-01-18 | [\#34364](https://github.com/airbytehq/airbyte/pull/34364) | Better logging in mongo db source connector | +| 0.13.1 | 2024-01-18 | [\#34236](https://github.com/airbytehq/airbyte/pull/34236) | Add postCreateTable hook in destination JdbcSqlGenerator | +| 0.13.0 | 2024-01-16 | [\#34177](https://github.com/airbytehq/airbyte/pull/34177) | Add `useExpensiveSafeCasting` param in JdbcSqlGenerator methods; add JdbcTypingDedupingTest fixture; other DV2-related changes | +| 0.12.1 | 2024-01-11 | [\#34186](https://github.com/airbytehq/airbyte/pull/34186) | Add hook for additional destination specific checks to JDBC destination check method | +| 0.12.0 | 2024-01-10 | [\#33875](https://github.com/airbytehq/airbyte/pull/33875) | Upgrade sshd-mina to 2.11.1 | +| 0.11.5 | 2024-01-10 | [\#34119](https://github.com/airbytehq/airbyte/pull/34119) | Remove wal2json support for postgres+debezium. | +| 0.11.4 | 2024-01-09 | [\#33305](https://github.com/airbytehq/airbyte/pull/33305) | Source stats in incremental syncs | +| 0.11.3 | 2024-01-09 | [\#33658](https://github.com/airbytehq/airbyte/pull/33658) | Always fail when debezium fails, even if it happened during the setup phase.
| +| 0.11.2 | 2024-01-09 | [\#33969](https://github.com/airbytehq/airbyte/pull/33969) | Destination state stats implementation | +| 0.11.1 | 2024-01-04 | [\#33727](https://github.com/airbytehq/airbyte/pull/33727) | SSH bastion heartbeats for Destinations | +| 0.11.0 | 2024-01-04 | [\#33730](https://github.com/airbytehq/airbyte/pull/33730) | DV2 T+D uses Sql struct to represent transactions; other T+D-related changes | +| 0.10.4 | 2023-12-20 | [\#33071](https://github.com/airbytehq/airbyte/pull/33071) | Add the ability to parse JDBC parameters with another delimiter than '&' | +| 0.10.3 | 2024-01-03 | [\#33312](https://github.com/airbytehq/airbyte/pull/33312) | Send out count in AirbyteStateMessage | +| 0.10.1 | 2023-12-21 | [\#33723](https://github.com/airbytehq/airbyte/pull/33723) | Make memory-manager log message less scary | +| 0.10.0 | 2023-12-20 | [\#33704](https://github.com/airbytehq/airbyte/pull/33704) | JdbcDestinationHandler now properly implements `getInitialRawTableState`; reenable SqlGenerator test | +| 0.9.0 | 2023-12-18 | [\#33124](https://github.com/airbytehq/airbyte/pull/33124) | Make Schema Creation Separate from Table Creation, exclude the T&D module from the CDK | +| 0.8.0 | 2023-12-18 | [\#33506](https://github.com/airbytehq/airbyte/pull/33506) | Improve async destination shutdown logic; more JDBC async migration work; improve DAT test schema handling | +| 0.7.9 | 2023-12-18 | [\#33549](https://github.com/airbytehq/airbyte/pull/33549) | Improve MongoDB logging. | +| 0.7.8 | 2023-12-18 | [\#33365](https://github.com/airbytehq/airbyte/pull/33365) | Emit stream statuses more consistently | +| 0.7.7 | 2023-12-18 | [\#33434](https://github.com/airbytehq/airbyte/pull/33307) | Remove LEGACY state | +| 0.7.6 | 2023-12-14 | [\#32328](https://github.com/airbytehq/airbyte/pull/33307) | Add schema less mode for mongodb CDC. Fixes for non standard mongodb id type. 
| +| 0.7.4 | 2023-12-13 | [\#33232](https://github.com/airbytehq/airbyte/pull/33232) | Track stream record count during sync; only run T+D if a stream had nonzero records or the previous sync left unprocessed records. | +| 0.7.3 | 2023-12-13 | [\#33369](https://github.com/airbytehq/airbyte/pull/33369) | Extract shared JDBC T+D code. | +| 0.7.2 | 2023-12-11 | [\#33307](https://github.com/airbytehq/airbyte/pull/33307) | Fix DV2 JDBC type mappings (code changes in [\#33307](https://github.com/airbytehq/airbyte/pull/33307)). | +| 0.7.1 | 2023-12-01 | [\#33027](https://github.com/airbytehq/airbyte/pull/33027) | Add the abstract DB source debugger. | +| 0.7.0 | 2023-12-07 | [\#32326](https://github.com/airbytehq/airbyte/pull/32326) | Destinations V2 changes for JDBC destinations | +| 0.6.4 | 2023-12-06 | [\#33082](https://github.com/airbytehq/airbyte/pull/33082) | Improvements to schema snapshot error handling + schema snapshot history scope (scoped to configured DB). | +| 0.6.2 | 2023-11-30 | [\#32573](https://github.com/airbytehq/airbyte/pull/32573) | Update MSSQLConverter to enforce 6-digit microsecond precision for timestamp fields | +| 0.6.1 | 2023-11-30 | [\#32610](https://github.com/airbytehq/airbyte/pull/32610) | Support DB initial sync using binary as primary key. | +| 0.6.0 | 2023-11-30 | [\#32888](https://github.com/airbytehq/airbyte/pull/32888) | JDBC destinations now use the async framework | +| 0.5.3 | 2023-11-28 | [\#32686](https://github.com/airbytehq/airbyte/pull/32686) | Better attribution of debezium engine shutdown due to heartbeat. | +| 0.5.1 | 2023-11-27 | [\#32662](https://github.com/airbytehq/airbyte/pull/32662) | Debezium initialization wait time will now read from initial setup time. | +| 0.5.0 | 2023-11-22 | [\#32656](https://github.com/airbytehq/airbyte/pull/32656) | Introduce TestDatabase test fixture, refactor database source test base classes. 
| +| 0.4.11 | 2023-11-14 | [\#32526](https://github.com/airbytehq/airbyte/pull/32526) | Clean up memory manager logs. | +| 0.4.10 | 2023-11-13 | [\#32285](https://github.com/airbytehq/airbyte/pull/32285) | Fix UUID codec ordering for MongoDB connector | +| 0.4.9 | 2023-11-13 | [\#32468](https://github.com/airbytehq/airbyte/pull/32468) | Further error grouping improvements for DV2 connectors | +| 0.4.8 | 2023-11-09 | [\#32377](https://github.com/airbytehq/airbyte/pull/32377) | source-postgres tests: skip dropping database | +| 0.4.7 | 2023-11-08 | [\#31856](https://github.com/airbytehq/airbyte/pull/31856) | source-postgres: support for infinity date and timestamps | +| 0.4.5 | 2023-11-07 | [\#32112](https://github.com/airbytehq/airbyte/pull/32112) | Async destinations framework: Allow configuring the queue flush threshold | +| 0.4.4 | 2023-11-06 | [\#32119](https://github.com/airbytehq/airbyte/pull/32119) | Add STANDARD UUID codec to MongoDB debezium handler | +| 0.4.2 | 2023-11-06 | [\#32190](https://github.com/airbytehq/airbyte/pull/32190) | Improve error deinterpolation | +| 0.4.1 | 2023-11-02 | [\#32192](https://github.com/airbytehq/airbyte/pull/32192) | Add 's3-destinations' CDK module. | +| 0.4.0 | 2023-11-02 | [\#32050](https://github.com/airbytehq/airbyte/pull/32050) | Fix compiler warnings. | +| 0.3.0 | 2023-11-02 | [\#31983](https://github.com/airbytehq/airbyte/pull/31983) | Add deinterpolation feature to AirbyteExceptionHandler. | +| 0.2.4 | 2023-10-31 | [\#31807](https://github.com/airbytehq/airbyte/pull/31807) | Handle case of debezium update and delete of records in mongodb. | +| 0.2.3 | 2023-10-31 | [\#32022](https://github.com/airbytehq/airbyte/pull/32022) | Update Debezium version from 2.20 -> 2.4.0. | +| 0.2.2 | 2023-10-31 | [\#31976](https://github.com/airbytehq/airbyte/pull/31976) | Debezium tweaks to make tests run faster. 
| +| 0.2.0 | 2023-10-30 | [\#31960](https://github.com/airbytehq/airbyte/pull/31960) | Hoist top-level gradle subprojects into CDK. | +| 0.1.12 | 2023-10-24 | [\#31674](https://github.com/airbytehq/airbyte/pull/31674) | Fail sync when Debezium does not shut down properly. | +| 0.1.11 | 2023-10-18 | [\#31486](https://github.com/airbytehq/airbyte/pull/31486) | Update constants in AdaptiveSourceRunner. | +| 0.1.9 | 2023-10-12 | [\#31309](https://github.com/airbytehq/airbyte/pull/31309) | Use toPlainString() when handling BigDecimals in PostgresConverter | +| 0.1.8 | 2023-10-11 | [\#31322](https://github.com/airbytehq/airbyte/pull/31322) | Cap log line length to 32KB to prevent loss of records | +| 0.1.7 | 2023-10-10 | [\#31194](https://github.com/airbytehq/airbyte/pull/31194) | Deallocate unused per stream buffer memory when empty | +| 0.1.6 | 2023-10-10 | [\#31083](https://github.com/airbytehq/airbyte/pull/31083) | Fix precision of numeric values in async destinations | +| 0.1.5 | 2023-10-09 | [\#31196](https://github.com/airbytehq/airbyte/pull/31196) | Update typo in CDK (CDN_LSN -> CDC_LSN) | +| 0.1.4 | 2023-10-06 | [\#31139](https://github.com/airbytehq/airbyte/pull/31139) | Reduce async buffer | +| 0.1.1 | 2023-09-28 | [\#30835](https://github.com/airbytehq/airbyte/pull/30835) | JDBC destinations now avoid staging area name collisions by using the raw table name as the stage name. (previously we used the stream name as the stage name) | +| 0.1.0 | 2023-09-27 | [\#30445](https://github.com/airbytehq/airbyte/pull/30445) | First launch, including shared classes for all connectors. | +| 0.0.2 | 2023-08-21 | [\#28687](https://github.com/airbytehq/airbyte/pull/28687) | Version bump only (no other changes). | +| 0.0.1 | 2023-08-08 | [\#28687](https://github.com/airbytehq/airbyte/pull/28687) | Initial release for testing. 
| diff --git a/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/BufferingStrategy.kt b/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/BufferingStrategy.kt index 2ad4a3d65285..52196376b8ad 100644 --- a/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/BufferingStrategy.kt +++ b/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/BufferingStrategy.kt @@ -32,9 +32,8 @@ interface BufferingStrategy : AutoCloseable { message: AirbyteMessage ): Optional - /** Flush buffered messages in a buffer from a particular stream */ - @Throws(Exception::class) - fun flushSingleBuffer(stream: AirbyteStreamNameNamespacePair, buffer: SerializableBuffer) + /** Flush the buffered messages from a single stream */ + @Throws(Exception::class) fun flushSingleStream(stream: AirbyteStreamNameNamespacePair) /** Flush all buffers that were buffering message data so far. 
*/ @Throws(Exception::class) fun flushAllBuffers() diff --git a/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/InMemoryRecordBufferingStrategy.kt b/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/InMemoryRecordBufferingStrategy.kt index 4b5e59b7feb8..40b159ed2f01 100644 --- a/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/InMemoryRecordBufferingStrategy.kt +++ b/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/InMemoryRecordBufferingStrategy.kt @@ -62,10 +62,7 @@ class InMemoryRecordBufferingStrategy( } @Throws(Exception::class) - override fun flushSingleBuffer( - stream: AirbyteStreamNameNamespacePair, - buffer: SerializableBuffer - ) { + override fun flushSingleStream(stream: AirbyteStreamNameNamespacePair) { LOGGER.info { "Flushing single stream ${stream.name}: ${streamBuffer[stream]!!.size} records" } diff --git a/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/SerializedBufferingStrategy.kt b/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/SerializedBufferingStrategy.kt index 2d50f34e21cc..b05b6a4b446f 100644 --- a/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/SerializedBufferingStrategy.kt +++ b/airbyte-cdk/java/airbyte-cdk/core/src/main/kotlin/io/airbyte/cdk/integrations/destination/record_buffer/SerializedBufferingStrategy.kt @@ -46,7 +46,7 @@ class SerializedBufferingStrategy @Throws(Exception::class) override fun addRecord( stream: AirbyteStreamNameNamespacePair, - message: AirbyteMessage + message: AirbyteMessage, ): Optional { var flushed: Optional = Optional.empty() @@ -102,8 +102,7 @@ class SerializedBufferingStrategy } } - @Throws(Exception::class) - override fun 
flushSingleBuffer( + private fun flushSingleBuffer( stream: AirbyteStreamNameNamespacePair, buffer: SerializableBuffer ) { @@ -116,6 +115,11 @@ class SerializedBufferingStrategy LOGGER.info { "Flushing completed for ${stream.name}" } } + @Throws(Exception::class) + override fun flushSingleStream(stream: AirbyteStreamNameNamespacePair) { + allBuffers[stream]?.let { flushSingleBuffer(stream, it) } + } + @Throws(Exception::class) override fun flushAllBuffers() { LOGGER.info { diff --git a/airbyte-cdk/java/airbyte-cdk/core/src/main/resources/version.properties b/airbyte-cdk/java/airbyte-cdk/core/src/main/resources/version.properties index 1ef1b95223c2..9ca7ff4f0992 100644 --- a/airbyte-cdk/java/airbyte-cdk/core/src/main/resources/version.properties +++ b/airbyte-cdk/java/airbyte-cdk/core/src/main/resources/version.properties @@ -1 +1 @@ -version=0.43.6 +version=0.44.0 diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/standardtest/destination/DestinationAcceptanceTest.kt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/standardtest/destination/DestinationAcceptanceTest.kt index fd5f9e967692..4f6707864370 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/standardtest/destination/DestinationAcceptanceTest.kt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/standardtest/destination/DestinationAcceptanceTest.kt @@ -33,11 +33,15 @@ import io.airbyte.protocol.models.v0.AirbyteMessage import io.airbyte.protocol.models.v0.AirbyteMessage.Type import io.airbyte.protocol.models.v0.AirbyteRecordMessage import io.airbyte.protocol.models.v0.AirbyteStateMessage +import io.airbyte.protocol.models.v0.AirbyteStateStats import io.airbyte.protocol.models.v0.AirbyteStream +import io.airbyte.protocol.models.v0.AirbyteStreamStatusTraceMessage +import 
io.airbyte.protocol.models.v0.AirbyteTraceMessage import io.airbyte.protocol.models.v0.CatalogHelpers import io.airbyte.protocol.models.v0.ConfiguredAirbyteCatalog import io.airbyte.protocol.models.v0.ConnectorSpecification import io.airbyte.protocol.models.v0.DestinationSyncMode +import io.airbyte.protocol.models.v0.StreamDescriptor import io.airbyte.protocol.models.v0.SyncMode import io.airbyte.workers.exception.TestHarnessException import io.airbyte.workers.general.DbtTransformationRunner @@ -75,7 +79,10 @@ import org.mockito.Mockito private val LOGGER = KotlinLogging.logger {} -abstract class DestinationAcceptanceTest { +abstract class DestinationAcceptanceTest( + // If false, ignore counts and only verify the final state message. + private val verifyIndividualStateAndCounts: Boolean = false +) { protected var testSchemas: HashSet = HashSet() private lateinit var testEnv: TestDestinationEnv @@ -88,7 +95,7 @@ abstract class DestinationAcceptanceTest { open protected var _testDataComparator: TestDataComparator = getTestDataComparator() protected open fun getTestDataComparator(): TestDataComparator { - return BasicTestDataComparator { @Suppress("deprecated") this.resolveIdentifier(it) } + return BasicTestDataComparator { @Suppress("deprecation") this.resolveIdentifier(it) } } protected abstract val imageName: String @@ -198,7 +205,7 @@ abstract class DestinationAcceptanceTest { return null } val schema = config["schema"].asText() - testSchemas!!.add(schema) + testSchemas.add(schema) return schema } @@ -354,8 +361,8 @@ abstract class DestinationAcceptanceTest { val workspaceRoot = Files.createTempDirectory(testDir, "test") jobRoot = Files.createDirectories(Path.of(workspaceRoot.toString(), "job")) localRoot = Files.createTempDirectory(testDir, "output") - LOGGER.info("jobRoot: {}", jobRoot) - LOGGER.info("localRoot: {}", localRoot) + LOGGER.info { "jobRoot: $jobRoot" } + LOGGER.info { "localRoot: $localRoot" } testEnv = TestDestinationEnv(localRoot)
mConnectorConfigUpdater = Mockito.mock(ConnectorConfigUpdater::class.java) testSchemas = HashSet() @@ -428,9 +435,12 @@ abstract class DestinationAcceptanceTest { AirbyteCatalog::class.java ) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) - val messages: List = + configuredCatalog.streams.forEach { + it.withSyncId(42).withGenerationId(12).withMinimumGenerationId(12) + } + val messages: List = MoreResources.readResource(messagesFilename).trim().lines().map { - Jsons.deserialize(it, io.airbyte.protocol.models.v0.AirbyteMessage::class.java) + Jsons.deserialize(it, AirbyteMessage::class.java) } val config = getConfig() @@ -454,30 +464,25 @@ abstract class DestinationAcceptanceTest { AirbyteCatalog::class.java ) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) - val messages: List = + configuredCatalog.streams.forEach { + it.withSyncId(42).withGenerationId(12).withMinimumGenerationId(12) + } + val messages: List = MoreResources.readResource(messagesFilename).trim().lines().map { - Jsons.deserialize(it, io.airbyte.protocol.models.v0.AirbyteMessage::class.java) + Jsons.deserialize(it, AirbyteMessage::class.java) } - val largeNumberRecords = - Collections.nCopies(400, messages) - .flatten() // regroup messages per stream - .sortedWith( - Comparator.comparing { obj: io.airbyte.protocol.models.v0.AirbyteMessage -> - obj.type - } - .thenComparing { message: io.airbyte.protocol.models.v0.AirbyteMessage -> - if ( - message.type == - io.airbyte.protocol.models.v0.AirbyteMessage.Type.RECORD - ) - message.record.stream - else message.toString() - } - ) + /* Replicate the runs of messages and state hundreds of times, but keep trace messages at the end. 
*/ + val lotsOfRecordAndStateBlocks = + Collections.nCopies( + 400, + messages.filter { it.type == Type.RECORD || it.type == Type.STATE } + ) + val traceMessages = messages.filter { it.type == Type.TRACE } + val concatenated = lotsOfRecordAndStateBlocks.flatten() + traceMessages val config = getConfig() - runSyncAndVerifyStateOutput(config, largeNumberRecords, configuredCatalog, false) + runSyncAndVerifyStateOutput(config, concatenated, configuredCatalog, false) } /** Verify that the integration overwrites the first sync with the second sync. */ @@ -485,7 +490,7 @@ abstract class DestinationAcceptanceTest { @Throws(Exception::class) fun testSecondSync() { if (!implementsOverwrite()) { - LOGGER.info("Destination's spec.json does not support overwrite sync mode.") + LOGGER.info { "Destination's spec.json does not support overwrite sync mode." } return } @@ -499,8 +504,10 @@ abstract class DestinationAcceptanceTest { AirbyteCatalog::class.java ) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) - - val firstSyncMessages: List = + configuredCatalog.streams.forEach { + it.withSyncId(42).withGenerationId(12).withMinimumGenerationId(12) + } + val firstSyncMessages: List = MoreResources.readResource( DataArgumentsProvider.Companion.EXCHANGE_RATE_CONFIG.getMessageFileVersion( getProtocolVersion() @@ -508,12 +515,7 @@ abstract class DestinationAcceptanceTest { ) .trim() .lines() - .map { - Jsons.deserialize( - it, - io.airbyte.protocol.models.v0.AirbyteMessage::class.java - ) - } + .map { Jsons.deserialize(it, AirbyteMessage::class.java) } val config = getConfig() runSyncAndVerifyStateOutput(config, firstSyncMessages, configuredCatalog, false) @@ -533,23 +535,31 @@ abstract class DestinationAcceptanceTest { ) dummyCatalog.streams[0].name = DUMMY_CATALOG_NAME val configuredDummyCatalog = CatalogHelpers.toDefaultConfiguredCatalog(dummyCatalog) + configuredDummyCatalog.streams.forEach { + 
it.withSyncId(42).withGenerationId(20).withMinimumGenerationId(20) + } // update messages to set new dummy stream name firstSyncMessages - .filter { message: io.airbyte.protocol.models.v0.AirbyteMessage -> - message.record != null - } - .forEach { message: io.airbyte.protocol.models.v0.AirbyteMessage -> - message.record.stream = DUMMY_CATALOG_NAME + .filter { message: AirbyteMessage -> message.record != null } + .forEach { message: AirbyteMessage -> message.record.stream = DUMMY_CATALOG_NAME } + firstSyncMessages + .filter { message: AirbyteMessage -> message.type == Type.TRACE } + .forEach { message: AirbyteMessage -> + message.trace.streamStatus.streamDescriptor.name = DUMMY_CATALOG_NAME } // sync dummy data runSyncAndVerifyStateOutput(config, firstSyncMessages, configuredDummyCatalog, false) // Run second sync - val secondSyncMessages: List = + val configuredCatalog2 = CatalogHelpers.toDefaultConfiguredCatalog(catalog) + configuredCatalog2.streams.forEach { + it.withSyncId(43).withGenerationId(13).withMinimumGenerationId(13) + } + val descriptor = StreamDescriptor().withName(catalog.streams[0].name) + val secondSyncMessages: List = Lists.newArrayList( - io.airbyte.protocol.models.v0 - .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.RECORD) + AirbyteMessage() + .withType(Type.RECORD) .withRecord( AirbyteRecordMessage() .withStream(catalog.streams[0].name) @@ -571,16 +581,28 @@ abstract class DestinationAcceptanceTest { ) ) ), - io.airbyte.protocol.models.v0 - .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.STATE) + AirbyteMessage() + .withType(Type.STATE) .withState( AirbyteStateMessage() .withData(Jsons.jsonNode(ImmutableMap.of("checkpoint", 2))) + ), + AirbyteMessage() + .withType(Type.TRACE) + .withTrace( + AirbyteTraceMessage() + .withType(AirbyteTraceMessage.Type.STREAM_STATUS) + .withStreamStatus( + AirbyteStreamStatusTraceMessage() + .withStreamDescriptor(descriptor) + .withStatus( + 
AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE + ) + ) ) ) - runSyncAndVerifyStateOutput(config, secondSyncMessages, configuredCatalog, false) + runSyncAndVerifyStateOutput(config, secondSyncMessages, configuredCatalog2, false) val defaultSchema = getDefaultSchema(config) retrieveRawRecordsAndAssertSameMessages(catalog, secondSyncMessages, defaultSchema) @@ -607,13 +629,15 @@ abstract class DestinationAcceptanceTest { AirbyteCatalog::class.java ) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) + configuredCatalog.streams.forEach { + it.withSyncId(42).withGenerationId(12).withMinimumGenerationId(12) + } val config = getConfig() - val secondSyncMessages: List = + val secondSyncMessages: List = Lists.newArrayList( - io.airbyte.protocol.models.v0 - .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.RECORD) + AirbyteMessage() + .withType(Type.RECORD) .withRecord( AirbyteRecordMessage() .withStream(catalog.streams[0].name) @@ -635,12 +659,26 @@ abstract class DestinationAcceptanceTest { ) ) ), - io.airbyte.protocol.models.v0 - .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.STATE) + AirbyteMessage() + .withType(Type.STATE) .withState( AirbyteStateMessage() .withData(Jsons.jsonNode(ImmutableMap.of("checkpoint", 2))) + ), + AirbyteMessage() + .withType(Type.TRACE) + .withTrace( + AirbyteTraceMessage() + .withType(AirbyteTraceMessage.Type.STREAM_STATUS) + .withStreamStatus( + AirbyteStreamStatusTraceMessage() + .withStreamDescriptor( + StreamDescriptor().withName(catalog.streams[0].name) + ) + .withStatus( + AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE + ) + ) ) ) @@ -678,7 +716,9 @@ abstract class DestinationAcceptanceTest { @Throws(Exception::class) fun testIncrementalSync() { if (!implementsAppend()) { - LOGGER.info("Destination's spec.json does not include '\"supportsIncremental\" ; true'") + LOGGER.info { + "Destination's spec.json does not include 
'\"supportsIncremental\" ; true'" + } return } @@ -692,12 +732,15 @@ abstract class DestinationAcceptanceTest { AirbyteCatalog::class.java ) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) - configuredCatalog.streams.forEach { s -> - s.withSyncMode(SyncMode.INCREMENTAL) - s.withDestinationSyncMode(DestinationSyncMode.APPEND) + configuredCatalog.streams.forEach { + it.withSyncMode(SyncMode.INCREMENTAL) + .withDestinationSyncMode(DestinationSyncMode.APPEND) + .withSyncId(42) + .withGenerationId(12) + .withMinimumGenerationId(12) } - val firstSyncMessages: List = + val firstSyncMessages: List = MoreResources.readResource( DataArgumentsProvider.Companion.EXCHANGE_RATE_CONFIG.getMessageFileVersion( getProtocolVersion() @@ -709,11 +752,12 @@ abstract class DestinationAcceptanceTest { val config = getConfig() runSyncAndVerifyStateOutput(config, firstSyncMessages, configuredCatalog, false) - val secondSyncMessages: List = + val descriptor = StreamDescriptor() + descriptor.name = catalog.streams[0].name + val secondSyncMessages: List = Lists.newArrayList( - io.airbyte.protocol.models.v0 - .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.RECORD) + AirbyteMessage() + .withType(Type.RECORD) .withRecord( AirbyteRecordMessage() .withStream(catalog.streams[0].name) @@ -735,19 +779,30 @@ abstract class DestinationAcceptanceTest { ) ) ), - io.airbyte.protocol.models.v0 - .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.STATE) + AirbyteMessage() + .withType(Type.STATE) .withState( AirbyteStateMessage() .withData(Jsons.jsonNode(ImmutableMap.of("checkpoint", 2))) + ), + AirbyteMessage() + .withType(Type.TRACE) + .withTrace( + AirbyteTraceMessage() + .withType(AirbyteTraceMessage.Type.STREAM_STATUS) + .withStreamStatus( + AirbyteStreamStatusTraceMessage() + .withStreamDescriptor(descriptor) + .withStatus( + AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE + ) + ) ) ) + 
runSyncAndVerifyStateOutput(config, secondSyncMessages, configuredCatalog, false) - val expectedMessagesAfterSecondSync: - MutableList = - ArrayList() + val expectedMessagesAfterSecondSync: MutableList = ArrayList() expectedMessagesAfterSecondSync.addAll(firstSyncMessages) expectedMessagesAfterSecondSync.addAll(secondSyncMessages) @@ -829,7 +884,7 @@ abstract class DestinationAcceptanceTest { messages.addLast( Jsons.deserialize( "{\"type\": \"RECORD\", \"record\": {\"stream\": \"exchange_rate\", \"emitted_at\": 1602637989500, \"data\": { \"id\": 2, \"currency\": \"EUR\", \"date\": \"2020-09-02T00:00:00Z\", \"NZD\": 1.14, \"USD\": 10.16}}}\n", - io.airbyte.protocol.models.v0.AirbyteMessage::class.java + AirbyteMessage::class.java ) ) @@ -840,7 +895,7 @@ abstract class DestinationAcceptanceTest { // We expect all the of messages to be missing the removed column after normalization. val expectedMessages = - messages.map { message: io.airbyte.protocol.models.v0.AirbyteMessage -> + messages.map { message: AirbyteMessage -> if (message.record != null) { (message.record.data as ObjectNode).remove("HKD") } @@ -894,9 +949,9 @@ abstract class DestinationAcceptanceTest { @Throws(Exception::class) open fun testIncrementalDedupeSync() { if (!implementsAppendDedup()) { - LOGGER.info( + LOGGER.info { "Destination's spec.json does not include 'append_dedupe' in its '\"supportedDestinationSyncModes\"'" - ) + } return } @@ -936,11 +991,11 @@ abstract class DestinationAcceptanceTest { supportsNormalization() ) - val secondSyncMessages: List = + val secondSyncMessages: List = Lists.newArrayList( io.airbyte.protocol.models.v0 .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.RECORD) + .withType(AirbyteMessage.Type.RECORD) .withRecord( AirbyteRecordMessage() .withStream(catalog.streams[0].name) @@ -959,7 +1014,7 @@ abstract class DestinationAcceptanceTest { ), io.airbyte.protocol.models.v0 .AirbyteMessage() - 
.withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.RECORD) + .withType(AirbyteMessage.Type.RECORD) .withRecord( AirbyteRecordMessage() .withStream(catalog.streams[0].name) @@ -978,7 +1033,7 @@ abstract class DestinationAcceptanceTest { ), io.airbyte.protocol.models.v0 .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.STATE) + .withType(AirbyteMessage.Type.STATE) .withState( AirbyteStateMessage() .withData(Jsons.jsonNode(ImmutableMap.of("checkpoint", 2))) @@ -986,9 +1041,7 @@ abstract class DestinationAcceptanceTest { ) runSyncAndVerifyStateOutput(config, secondSyncMessages, configuredCatalog, false) - val expectedMessagesAfterSecondSync: - MutableList = - ArrayList() + val expectedMessagesAfterSecondSync: MutableList = ArrayList() expectedMessagesAfterSecondSync.addAll(firstSyncMessages) expectedMessagesAfterSecondSync.addAll(secondSyncMessages) @@ -1064,7 +1117,7 @@ abstract class DestinationAcceptanceTest { ) ) runner.start() - val transformationRoot = Files.createDirectories(jobRoot!!.resolve("transform")) + val transformationRoot = Files.createDirectories(jobRoot.resolve("transform")) val dbtConfig = OperatorDbt() // Forked from https://github.com/dbt-labs/jaffle_shop because they made a // change that would have @@ -1153,7 +1206,7 @@ abstract class DestinationAcceptanceTest { ) ) runner.start() - val transformationRoot = Files.createDirectories(jobRoot!!.resolve("transform")) + val transformationRoot = Files.createDirectories(jobRoot.resolve("transform")) val dbtConfig = OperatorDbt() .withGitRepoUrl("https://github.com/fishtown-analytics/dbt-learn-demo.git") @@ -1191,7 +1244,7 @@ abstract class DestinationAcceptanceTest { ) // A unique namespace is required to avoid test isolation problems. 
val namespace = TestingNamespaces.generate("source_namespace") - testSchemas!!.add(namespace) + testSchemas.add(namespace) catalog.streams.forEach(Consumer { stream: AirbyteStream -> stream.namespace = namespace }) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) @@ -1232,12 +1285,12 @@ abstract class DestinationAcceptanceTest { AirbyteCatalog::class.java ) val namespace1 = TestingNamespaces.generate("source_namespace") - testSchemas!!.add(namespace1) + testSchemas.add(namespace1) catalog.streams.forEach(Consumer { stream: AirbyteStream -> stream.namespace = namespace1 }) val diffNamespaceStreams = ArrayList() val namespace2 = TestingNamespaces.generate("diff_source_namespace") - testSchemas!!.add(namespace2) + testSchemas.add(namespace2) val mapper = MoreMappers.initMapper() for (stream in catalog.streams) { val clonedStream = @@ -1255,7 +1308,7 @@ abstract class DestinationAcceptanceTest { Jsons.deserialize(it, AirbyteMessage::class.java) } val ns1MessagesAtNamespace1 = getRecordMessagesWithNewNamespace(ns1Messages, namespace1) - val ns2Messages: List = + val ns2Messages: List = MoreResources.readResource(messageFile).trim().lines().map { Jsons.deserialize(it, AirbyteMessage::class.java) } @@ -1295,7 +1348,7 @@ abstract class DestinationAcceptanceTest { assertNamespaceNormalization( testCaseId, namespaceInDst, - namingConventionTransformer.getNamespace(namespaceInCatalog!!) + namingConventionTransformer.getNamespace(namespaceInCatalog) ) } @@ -1332,7 +1385,7 @@ abstract class DestinationAcceptanceTest { try { runSyncAndVerifyStateOutput(config, messagesWithNewNamespace, configuredCatalog, false) // Add to the list of schemas to clean up. 
- testSchemas!!.add(namespaceInCatalog) + testSchemas.add(namespaceInCatalog) } catch (e: Exception) { throw IOException( String.format( @@ -1376,7 +1429,7 @@ abstract class DestinationAcceptanceTest { @Throws(Exception::class) fun testSyncNotFailsWithNewFields() { if (!implementsOverwrite()) { - LOGGER.info("Destination's spec.json does not support overwrite sync mode.") + LOGGER.info { "Destination's spec.json does not support overwrite sync mode." } return } @@ -1390,6 +1443,9 @@ abstract class DestinationAcceptanceTest { AirbyteCatalog::class.java ) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) + configuredCatalog.streams.forEach { + it.withSyncId(42).withGenerationId(12).withMinimumGenerationId(12) + } val firstSyncMessages = MoreResources.readResource( @@ -1405,11 +1461,15 @@ abstract class DestinationAcceptanceTest { val stream = catalog.streams[0] // Run second sync with new fields on the message - val secondSyncMessagesWithNewFields: - MutableList = + val configuredCatalog2 = CatalogHelpers.toDefaultConfiguredCatalog(catalog) + configuredCatalog2.streams.forEach { + it.withSyncId(43).withGenerationId(13).withMinimumGenerationId(13) + } + val descriptor = StreamDescriptor() + descriptor.name = stream.name + val secondSyncMessagesWithNewFields: MutableList = Lists.newArrayList( - io.airbyte.protocol.models.v0 - .AirbyteMessage() + AirbyteMessage() .withType(Type.RECORD) .withRecord( AirbyteRecordMessage() @@ -1429,12 +1489,24 @@ abstract class DestinationAcceptanceTest { ) ) ), - io.airbyte.protocol.models.v0 - .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.STATE) + AirbyteMessage() + .withType(Type.STATE) .withState( AirbyteStateMessage() .withData(Jsons.jsonNode(ImmutableMap.of("checkpoint", 2))) + ), + AirbyteMessage() + .withType(Type.TRACE) + .withTrace( + AirbyteTraceMessage() + .withType(AirbyteTraceMessage.Type.STREAM_STATUS) + .withStreamStatus( + AirbyteStreamStatusTraceMessage() + 
.withStreamDescriptor(descriptor) + .withStatus( + AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE + ) + ) ) ) @@ -1442,15 +1514,14 @@ abstract class DestinationAcceptanceTest { runSyncAndVerifyStateOutput( config, secondSyncMessagesWithNewFields, - configuredCatalog, + configuredCatalog2, false ) val destinationOutput = retrieveRecords(testEnv, stream.name, getDefaultSchema(config)!!, stream.jsonSchema) // Remove state message - secondSyncMessagesWithNewFields.removeIf { - airbyteMessage: io.airbyte.protocol.models.v0.AirbyteMessage -> - airbyteMessage.type == io.airbyte.protocol.models.v0.AirbyteMessage.Type.STATE + secondSyncMessagesWithNewFields.removeIf { airbyteMessage: AirbyteMessage -> + airbyteMessage.type == Type.STATE || airbyteMessage.type == Type.TRACE } Assertions.assertEquals(secondSyncMessagesWithNewFields.size, destinationOutput.size) } @@ -1552,7 +1623,7 @@ abstract class DestinationAcceptanceTest { .checkConnection return standardCheckConnectionOutput.status } catch (e: Exception) { - LOGGER.error("Failed to check connection:" + e.message) + LOGGER.error { "Failed to check connection:" + e.message } } return StandardCheckConnectionOutput.Status.FAILED } @@ -1574,28 +1645,86 @@ abstract class DestinationAcceptanceTest { ) } + private fun getDestination(imageName: String): AirbyteDestination { + return DefaultAirbyteDestination( + integrationLauncher = + AirbyteIntegrationLauncher( + JOB_ID, + JOB_ATTEMPT, + imageName, + processFactory, + null, + null, + false, + EnvVariableFeatureFlags() + ) + ) + } + + protected fun runSyncAndVerifyStateOutput( + config: JsonNode, + messages: List, + catalog: ConfiguredAirbyteCatalog, + runNormalization: Boolean, + ) { + runSyncAndVerifyStateOutput( + config, + messages, + catalog, + runNormalization, + imageName, + verifyIndividualStateAndCounts + ) + } + @Throws(Exception::class) protected fun runSyncAndVerifyStateOutput( config: JsonNode, - messages: List, - catalog: 
io.airbyte.protocol.models.v0.ConfiguredAirbyteCatalog, - runNormalization: Boolean + messages: List, + catalog: ConfiguredAirbyteCatalog, + runNormalization: Boolean, + imageName: String, + verifyIndividualStateAndCounts: Boolean ) { - val destinationOutput = runSync(config, messages, catalog, runNormalization) + val destinationOutput = runSync(config, messages, catalog, runNormalization, imageName) + + var expected = messages.filter { it.type == Type.STATE } + var actual = destinationOutput.filter { it.type == Type.STATE } + + if (verifyIndividualStateAndCounts) { + /* Collect the counts and add them to each expected state message */ + val stateToCount = mutableMapOf() + messages.fold(0) { acc, message -> + if (message.type == Type.STATE) { + stateToCount[message.state.data] = acc + 0 + } else { + acc + 1 + } + } - val expectedStateMessage = - reversed(messages).firstOrNull { m: AirbyteMessage -> m.type == Type.STATE } - ?: throw IllegalArgumentException( - "All message sets used for testing should include a state record" - ) + expected.forEach { message -> + val clone = message.state + clone.destinationStats = + AirbyteStateStats().withRecordCount(stateToCount[clone.data]!!.toDouble()) + message.state = clone + } + } else { + /* Null the states and collect only the final messages */ + val finalActual = + actual.lastOrNull() + ?: throw IllegalArgumentException( + "All message sets used for testing should include a state record" + ) + val clone = finalActual.state + clone.destinationStats = null + finalActual.state = clone - Collections.reverse(destinationOutput) - val actualStateMessage = destinationOutput.filter { it.type == Type.STATE }.first() - val clone = actualStateMessage.state - clone.destinationStats = null - actualStateMessage.state = clone + expected = listOf(expected.last()) + actual = listOf(finalActual) + } - Assertions.assertEquals(expectedStateMessage, actualStateMessage) + Assertions.assertEquals(expected, actual) } @Throws(Exception::class) @@ 
-1603,7 +1732,8 @@ abstract class DestinationAcceptanceTest { config: JsonNode, messages: List, catalog: ConfiguredAirbyteCatalog, - runNormalization: Boolean + runNormalization: Boolean, + imageName: String, ): List { val destinationConfig = WorkerDestinationConfig() @@ -1616,7 +1746,7 @@ abstract class DestinationAcceptanceTest { ) .withDestinationConnectionConfiguration(config) - val destination = destination + val destination = getDestination(imageName) destination.start( destinationConfig, @@ -1624,7 +1754,7 @@ abstract class DestinationAcceptanceTest { inDestinationNormalizationFlags(runNormalization) ) messages.forEach( - Consumer { message: io.airbyte.protocol.models.v0.AirbyteMessage -> + Consumer { message: AirbyteMessage -> Exceptions.toRuntime { destination.accept( convertProtocolObject( @@ -1657,7 +1787,7 @@ abstract class DestinationAcceptanceTest { getNormalizationIntegrationType() ) runner.start() - val normalizationRoot = Files.createDirectories(jobRoot!!.resolve("normalize")) + val normalizationRoot = Files.createDirectories(jobRoot.resolve("normalize")) if ( !runner.normalize( JOB_ID, @@ -1677,7 +1807,7 @@ abstract class DestinationAcceptanceTest { @Throws(Exception::class) protected fun retrieveRawRecordsAndAssertSameMessages( catalog: AirbyteCatalog, - messages: List, + messages: List, defaultSchema: String? ) { val actualMessages: MutableList = ArrayList() @@ -1701,16 +1831,14 @@ abstract class DestinationAcceptanceTest { // ignores emitted at. 
protected fun assertSameMessages( - expected: List, + expected: List, actual: List, pruneAirbyteInternalFields: Boolean ) { val expectedProcessed = expected - .filter { message: io.airbyte.protocol.models.v0.AirbyteMessage -> - message.type == io.airbyte.protocol.models.v0.AirbyteMessage.Type.RECORD - } - .map { obj: io.airbyte.protocol.models.v0.AirbyteMessage -> obj.record } + .filter { message: AirbyteMessage -> message.type == AirbyteMessage.Type.RECORD } + .map { obj: AirbyteMessage -> obj.record } .onEach { recordMessage: AirbyteRecordMessage -> recordMessage.emittedAt = null } .map { recordMessage: AirbyteRecordMessage -> if (pruneAirbyteInternalFields) safePrune(recordMessage) else recordMessage @@ -1832,14 +1960,14 @@ abstract class DestinationAcceptanceTest { // iterate through streams for (streamCounter in 0 until streamsSize) { - LOGGER.info("Started new stream processing with #$streamCounter") + LOGGER.info { "Started new stream processing with #$streamCounter" } // iterate through messages inside a particular stream // Generate messages and put them into the stream for (msgCounter in 0 until messagesNumber) { val msg = io.airbyte.protocol.models.v0 .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.RECORD) + .withType(AirbyteMessage.Type.RECORD) .withRecord( AirbyteRecordMessage() .withStream(USERS_STREAM_NAME + streamCounter) @@ -1861,7 +1989,7 @@ abstract class DestinationAcceptanceTest { ) ) } catch (e: Exception) { - LOGGER.error("Failed to write a RECORD message: $e") + LOGGER.error { "Failed to write a RECORD message: $e" } throw RuntimeException(e) } @@ -1872,7 +2000,7 @@ abstract class DestinationAcceptanceTest { val msgState = io.airbyte.protocol.models.v0 .AirbyteMessage() - .withType(io.airbyte.protocol.models.v0.AirbyteMessage.Type.STATE) + .withType(AirbyteMessage.Type.STATE) .withState( AirbyteStateMessage() .withData( @@ -1891,20 +2019,20 @@ abstract class DestinationAcceptanceTest { ) ) } catch (e: Exception) { -
LOGGER.error("Failed to write a STATE message: $e") + LOGGER.error { "Failed to write a STATE message: $e" } throw RuntimeException(e) } currentStreamNumber.set(streamCounter) } - LOGGER.info( + LOGGER.info { String.format( "Added %s messages to each of %s streams", currentRecordNumberForStream, currentStreamNumber ) - ) + } // Close destination destination.notifyEndOfInput() } @@ -1959,6 +2087,9 @@ abstract class DestinationAcceptanceTest { val catalog = readCatalogFromFile(catalogFilename) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) + configuredCatalog.streams.forEach { + it.withSyncId(42).withGenerationId(12).withMinimumGenerationId(12) + } val messages = readMessagesFromFile(messagesFilename) runAndCheck(catalog, configuredCatalog, messages) @@ -1988,7 +2119,6 @@ abstract class DestinationAcceptanceTest { ) ) val config = getConfig() - val defaultSchema = getDefaultSchema(config) runAndCheck(catalog, configuredCatalog, messages) } @@ -2017,7 +2147,6 @@ abstract class DestinationAcceptanceTest { ) ) val config = getConfig() - val defaultSchema = getDefaultSchema(config) runAndCheck(catalog, configuredCatalog, messages) } @@ -2048,7 +2177,6 @@ abstract class DestinationAcceptanceTest { ) ) val config = getConfig() - val defaultSchema = getDefaultSchema(config) runAndCheck(catalog, configuredCatalog, messages) } @@ -2080,7 +2208,6 @@ abstract class DestinationAcceptanceTest { ) ) val config = getConfig() - val defaultSchema = getDefaultSchema(config) runAndCheck(catalog, configuredCatalog, messages) } @@ -2088,22 +2215,22 @@ abstract class DestinationAcceptanceTest { @Throws(Exception::class) private fun runAndCheck( catalog: AirbyteCatalog, - configuredCatalog: io.airbyte.protocol.models.v0.ConfiguredAirbyteCatalog, - messages: List + configuredCatalog: ConfiguredAirbyteCatalog, + messages: List ) { if (normalizationFromDefinition()) { - LOGGER.info("Normalization is supported! 
Run test with normalization.") + LOGGER.info { "Normalization is supported! Run test with normalization." } runAndCheckWithNormalization(messages, configuredCatalog, catalog) } else { - LOGGER.info("Normalization is not supported! Run test without normalization.") + LOGGER.info { "Normalization is not supported! Run test without normalization." } runAndCheckWithoutNormalization(messages, configuredCatalog, catalog) } } @Throws(Exception::class) private fun runAndCheckWithNormalization( - messages: List, - configuredCatalog: io.airbyte.protocol.models.v0.ConfiguredAirbyteCatalog, + messages: List, + configuredCatalog: ConfiguredAirbyteCatalog, catalog: AirbyteCatalog ) { val config = getConfig() @@ -2115,8 +2242,8 @@ abstract class DestinationAcceptanceTest { @Throws(Exception::class) private fun runAndCheckWithoutNormalization( - messages: List, - configuredCatalog: io.airbyte.protocol.models.v0.ConfiguredAirbyteCatalog, + messages: List, + configuredCatalog: ConfiguredAirbyteCatalog, catalog: AirbyteCatalog ) { val config = getConfig() @@ -2292,9 +2419,7 @@ abstract class DestinationAcceptanceTest { } @Throws(IOException::class) - private fun readMessagesFromFile( - messagesFilename: String - ): List { + private fun readMessagesFromFile(messagesFilename: String): List { return MoreResources.readResource(messagesFilename).trim().lines().map { Jsons.deserialize(it, AirbyteMessage::class.java) } @@ -2302,11 +2427,11 @@ abstract class DestinationAcceptanceTest { /** Mutate the input airbyte record message namespace. */ private fun getRecordMessagesWithNewNamespace( - airbyteMessages: List, + airbyteMessages: List, namespace: String? 
- ): List { + ): List { airbyteMessages.forEach( - Consumer { message: io.airbyte.protocol.models.v0.AirbyteMessage -> + Consumer { message: AirbyteMessage -> if (message.record != null) { message.record.namespace = namespace } diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_array_object_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_array_object_test_messages.txt index 88acd94fbc6f..fbcb5ccbf0d5 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_array_object_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_array_object_test_messages.txt @@ -1,2 +1,3 @@ {"type": "RECORD", "record": {"stream": "object_array_test_1", "emitted_at": 1602637589100, "data": { "property_string" : "qqq", "property_array" : [ { "property_string": "foo bar", "property_date": "2021-01-23", "property_timestamp_with_timezone": "2022-11-22T01:23:45+00:00", "property_timestamp_without_timezone": "2022-11-22T01:23:45", "property_number": 56.78, "property_big_number": "100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.1234", "property_integer": 42, "property_boolean": true } ] }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "object_array_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} diff --git 
a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_array_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_array_test_messages.txt index a39ce73ac873..257ef56ee496 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_array_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_array_test_messages.txt @@ -1,2 +1,3 @@ {"type": "RECORD", "record": {"stream": "array_test_1", "emitted_at": 1602637589100, "data": { "string_array" : ["foo bar", "some random special characters: ࠈൡሗ"], "array_date" : ["2021-01-23", "1504-02-29"], "array_timestamp_with_timezone" : ["2022-11-22T01:23:45+05:00", "9999-12-21T01:23:45-05:00"], "array_timestamp_without_timezone" : ["2022-11-22T01:23:45", "1504-02-29T01:23:45"], "array_number" : [56.78, 0, -12345.678], "array_big_numberarray_integer" : [42, 0, 12345], "array_boolean" : [true, false] }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "array_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_basic_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_basic_test_messages.txt index ebcee0c19123..95203671eaf6 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_basic_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_basic_test_messages.txt @@ -21,3 +21,12 @@ {"type": "RECORD", "record": {"stream": "integer_test_1", "emitted_at": 1602637589300, "data": { "data" : -12345 }}} {"type": "RECORD", "record": {"stream": "boolean_test_1", "emitted_at": 1602637589100, "data": { 
"data" : true }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "string_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "date_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "datetime_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "datetime_test_2"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "number_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "bignumber_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "integer_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "boolean_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} + diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_object_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_object_test_messages.txt index c92ed533f0c7..6ee65aa2cfe4 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_object_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/data_type_object_test_messages.txt @@ 
-1,2 +1,3 @@ {"type": "RECORD", "record": {"stream": "object_test_1", "emitted_at": 1602637589100, "data": {"property_string": "foo bar", "property_date": "2021-01-23", "property_timestamp_with_timezone": "2022-11-22T01:23:45+00:00", "property_timestamp_without_timezone": "2022-11-22T01:23:45", "property_number": 56.78, "property_big_number": "100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.1234", "property_integer": 42, "property_boolean": true }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "object_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/edge_case_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/edge_case_messages.txt index 5f2adf5a21b4..7aa369a37727 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/edge_case_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/edge_case_messages.txt @@ -16,3 +16,15 @@ {"type": "RECORD", "record": {"stream": "stream_name_next", "emitted_at": 1602637589500, "data": { "some_id" : 203 }}} {"type": "RECORD", "record": {"stream": "stream_with_binary_data", "emitted_at": 1602637589500, "data": { "some_id" : 303, "binary_field_name":"dGVzdA==" }}} {"type": "STATE", "state": { "data": {"start_date": "2020-09-02"}}} +{"type": "TRACE", 
"trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "streamWithCamelCase"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_with_underscores"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_with_edge_case_field_names"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream-with:spécial:character_names"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "CapitalCase"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "reserved_keywords"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "groups"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "ProperCase"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_name"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_name_next"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_with_binary_data"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": { "type": 
"STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "STREAM_WITH_ALL_CAPS"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/exchange_rate_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/exchange_rate_messages.txt index 176d3461b616..37a1e6c463aa 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/exchange_rate_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v0/exchange_rate_messages.txt @@ -6,3 +6,4 @@ {"type": "RECORD", "record": {"stream": "exchange_rate", "emitted_at": 1602637889300, "data": { "id": 2, "currency": "EUR", "date": "2020-08-31T00:00:00Z", "NZD": 1.14, "HKD": 7.99, "USD": 10.99}}} {"type": "RECORD", "record": {"stream": "exchange_rate", "emitted_at": 1602637989400, "data": { "id": 2, "currency": "EUR", "date": "2020-09-01T00:00:00Z", "NZD": 1.14, "HKD": 7.15, "USD": 10.16}}} {"type": "STATE", "state": { "data": {"start_date": "2020-09-02"}}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "exchange_rate"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_array_object_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_array_object_test_messages.txt index 1f217a5dc1bc..20f49b274070 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_array_object_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_array_object_test_messages.txt @@ -1,2 +1,3 @@ {"type": "RECORD", "record": {"stream": "object_array_test_1", "emitted_at": 1602637589100, "data": { "property_string" : "qqq", "property_array" : [ { "property_string": "foo 
bar", "property_date": "2021-01-23", "property_timestamp_with_timezone": "2022-11-22T01:23:45+00:00", "property_timestamp_without_timezone": "2022-11-22T01:23:45", "property_number": "56.78", "property_integer": "42", "property_boolean": true, "property_binary_data" : "dGVzdA==" } ] }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "object_array_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} \ No newline at end of file diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_array_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_array_test_messages.txt index ecf027b74ac1..499f275d2293 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_array_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_array_test_messages.txt @@ -1,2 +1,3 @@ {"type": "RECORD", "record": {"stream": "array_test_1", "emitted_at": 1602637589100, "data": { "string_array" : ["foo bar", "some random special characters: ࠈൡሗ"], "array_date" : ["2021-01-23", "1504-02-29"], "array_timestamp_with_timezone" : ["2022-11-22T01:23:45+05:00", "9999-12-21T01:23:45-05:00"], "array_timestamp_without_timezone" : ["2022-11-22T01:23:45", "1504-02-29T01:23:45"], "array_number" : ["56.78", "0", "-12345.678"], "array_integer" : ["42", "0", "12345"], "array_boolean" : [true, false], "array_binary_data" : ["dGVzdA=="] }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "array_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} \ No newline at end of file diff --git 
a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_basic_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_basic_test_messages.txt index 2b75c2dde733..b2f5b2819337 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_basic_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_basic_test_messages.txt @@ -22,3 +22,13 @@ {"type": "RECORD", "record": {"stream": "boolean_test_1", "emitted_at": 1602637589200, "data": { "data" : true }}} {"type": "RECORD", "record": {"stream": "binary_test_1", "emitted_at": 1602637589300, "data": { "data" : "dGVzdA==" }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "string_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "date_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "datetime_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "datetime_test_2"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "time_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "time_test_2"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "number_test_1"}, "status": "COMPLETE"}, 
"emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "integer_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "boolean_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "binary_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} \ No newline at end of file diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_object_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_object_test_messages.txt index c2284f74ab6e..d09fb2903698 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_object_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/data_type_object_test_messages.txt @@ -1,2 +1,3 @@ {"type": "RECORD", "record": {"stream": "object_test_1", "emitted_at": 1602637589100, "data": {"property_string": "foo bar", "property_date": "2021-01-23", "property_timestamp_with_timezone": "2022-11-22T01:23:45+00:00", "property_timestamp_without_timezone": "2022-11-22T01:23:45", "property_number": "56.78", "property_integer": "42", "property_boolean": true, "property_binary_data" : "dGVzdA==" }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "object_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} \ No newline at end of file diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/edge_case_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/edge_case_messages.txt index 
80039c75c62b..11fe9b7d38d9 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/edge_case_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/edge_case_messages.txt @@ -16,3 +16,15 @@ {"type": "RECORD", "record": {"stream": "stream_name_next", "emitted_at": 1602637589500, "data": { "some_id" : "203" }}} {"type": "RECORD", "record": {"stream": "stream_with_binary_data", "emitted_at": 1602637589500, "data": { "some_id" : "303", "binary_field_name":"dGVzdA==" }}} {"type": "STATE", "state": { "data": {"start_date": "2020-09-02"}}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "streamWithCamelCase"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_with_underscores"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_with_edge_case_field_names"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream-with:spécial:character_names"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "CapitalCase"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "reserved_keywords"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "groups"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "ProperCase"}, 
"status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_name"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_name_next"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "stream_with_binary_data"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} +{"type": "TRACE", "trace": {"type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "STREAM_WITH_ALL_CAPS"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} \ No newline at end of file diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/exchange_rate_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/exchange_rate_messages.txt index 7eddc0d31ad3..5c15c041f22f 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/exchange_rate_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/exchange_rate_messages.txt @@ -6,3 +6,4 @@ {"type": "RECORD", "record": {"stream": "exchange_rate", "emitted_at": 1602637889300, "data": { "id": "2", "currency": "EUR", "date": "2020-08-31T00:00:00Z", "NZD": "1.14", "HKD": "7.99", "USD": "10.99"}}} {"type": "RECORD", "record": {"stream": "exchange_rate", "emitted_at": 1602637989400, "data": { "id": "2", "currency": "EUR", "date": "2020-09-01T00:00:00Z", "NZD": "1.14", "HKD": "7.15", "USD": "10.16"}}} {"type": "STATE", "state": { "data": {"start_date": "2020-09-02"}}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "exchange_rate"}, "status": "COMPLETE"}, "emitted_at": 1602637589101}} \ No newline at end of file diff --git 
a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/number_data_type_array_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/number_data_type_array_test_messages.txt index 3b981b9d2a86..c8e6efff6256 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/number_data_type_array_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/number_data_type_array_test_messages.txt @@ -1,2 +1,3 @@ {"type": "RECORD", "record": {"stream": "array_test_1", "emitted_at": 1602637589100, "data": { "array_number" : ["-12345.678", "100000000000000000.1234"],"array_float" : ["-12345.678", "0", "1000000000000000000000000000000000000000000000000000.1234"], "array_integer" : ["42", "0", "12345"]}}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "array_test_1"}, "status": "COMPLETE"}, "emitted_at": 1602637589301}} diff --git a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/number_data_type_test_messages.txt b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/number_data_type_test_messages.txt index 1b17eb58284a..a1a68b77a193 100644 --- a/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/number_data_type_test_messages.txt +++ b/airbyte-cdk/java/airbyte-cdk/db-destinations/src/testFixtures/resources/v1/number_data_type_test_messages.txt @@ -8,3 +8,6 @@ {"type": "RECORD", "record": {"stream": "default_number_test", "emitted_at": 1602637589200, "data": { "data" : "0" }}} {"type": "RECORD", "record": {"stream": "default_number_test", "emitted_at": 1602637589300, "data": { "data" : "-12345.678" }}} {"type": "STATE", "state": { "data": {"start_date": "2022-02-14"}}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": 
{"stream_descriptor": {"name": "int_test"}, "status": "COMPLETE"}, "emitted_at": 1602637589301}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "float_test"}, "status": "COMPLETE"}, "emitted_at": 1602637589301}} +{"type": "TRACE", "trace": { "type": "STREAM_STATUS", "stream_status": {"stream_descriptor": {"name": "default_number_test"}, "status": "COMPLETE"}, "emitted_at": 1602637589301}} \ No newline at end of file diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/BaseS3Destination.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/BaseS3Destination.kt index 731747034d21..8aaca609840e 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/BaseS3Destination.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/BaseS3Destination.kt @@ -7,17 +7,14 @@ import com.fasterxml.jackson.databind.JsonNode import io.airbyte.cdk.integrations.BaseConnector import io.airbyte.cdk.integrations.base.AirbyteMessageConsumer import io.airbyte.cdk.integrations.base.Destination +import io.airbyte.cdk.integrations.base.SerializedAirbyteMessageConsumer import io.airbyte.cdk.integrations.destination.NamingConventionTransformer -import io.airbyte.cdk.integrations.destination.record_buffer.BufferStorage -import io.airbyte.cdk.integrations.destination.record_buffer.FileBuffer -import io.airbyte.cdk.integrations.destination.s3.SerializedBufferFactory.Companion.getCreateFunction import io.airbyte.cdk.integrations.destination.s3.util.S3NameTransformer import io.airbyte.protocol.models.v0.AirbyteConnectionStatus import io.airbyte.protocol.models.v0.AirbyteMessage import io.airbyte.protocol.models.v0.ConfiguredAirbyteCatalog import io.github.oshai.kotlinlogging.KotlinLogging import 
java.util.function.Consumer -import java.util.function.Function private val LOGGER = KotlinLogging.logger {} @@ -63,17 +60,19 @@ protected constructor( catalog: ConfiguredAirbyteCatalog, outputRecordCollector: Consumer ): AirbyteMessageConsumer? { + throw UnsupportedOperationException("getConsumer is not supported in S3 async destinations") + } + + override fun getSerializedMessageConsumer( + config: JsonNode, + catalog: ConfiguredAirbyteCatalog, + outputRecordCollector: Consumer + ): SerializedAirbyteMessageConsumer? { val s3Config = configFactory.getS3DestinationConfig(config, storageProvider(), environment) return S3ConsumerFactory() - .create( + .createAsync( outputRecordCollector, S3StorageOperations(nameTransformer, s3Config.getS3Client(), s3Config), - getCreateFunction( - s3Config, - Function { fileExtension: String -> - FileBuffer(fileExtension) - } - ), s3Config, catalog ) diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/BlobStorageOperations.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/BlobStorageOperations.kt index 9bea16421537..8a4c9c99865e 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/BlobStorageOperations.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/BlobStorageOperations.kt @@ -30,7 +30,8 @@ abstract class BlobStorageOperations protected constructor() { abstract fun uploadRecordsToBucket( recordsData: SerializableBuffer, namespace: String?, - objectPath: String + objectPath: String, + generationId: Long, ): String? 
/** Remove files that were just stored in the bucket */ @@ -61,4 +62,17 @@ abstract class BlobStorageOperations protected constructor() { fun addBlobDecorator(blobDecorator: BlobDecorator) { blobDecorators.add(blobDecorator) } + + /** + * Provides the generationId from the last written object's metadata. If there are no objects in + * the given path format, returns null. + */ + open fun getStageGeneration( + namespace: String?, + streamName: String, + objectPath: String, + pathFormat: String + ): Long? { + return null + } } diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3ConsumerFactory.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3ConsumerFactory.kt index ae0f121758d2..9d32bf7b1b68 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3ConsumerFactory.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3ConsumerFactory.kt @@ -6,14 +6,21 @@ package io.airbyte.cdk.integrations.destination.s3 import com.fasterxml.jackson.databind.JsonNode import com.google.common.base.Preconditions import io.airbyte.cdk.integrations.base.AirbyteMessageConsumer +import io.airbyte.cdk.integrations.base.SerializedAirbyteMessageConsumer import io.airbyte.cdk.integrations.destination.StreamSyncSummary +import io.airbyte.cdk.integrations.destination.async.AsyncStreamConsumer +import io.airbyte.cdk.integrations.destination.async.buffers.BufferManager import io.airbyte.cdk.integrations.destination.buffered_stream_consumer.BufferedStreamConsumer import io.airbyte.cdk.integrations.destination.buffered_stream_consumer.OnCloseFunction import io.airbyte.cdk.integrations.destination.buffered_stream_consumer.OnStartFunction import io.airbyte.cdk.integrations.destination.record_buffer.BufferCreateFunction +import 
io.airbyte.cdk.integrations.destination.record_buffer.BufferStorage +import io.airbyte.cdk.integrations.destination.record_buffer.FileBuffer import io.airbyte.cdk.integrations.destination.record_buffer.FlushBufferFunction import io.airbyte.cdk.integrations.destination.record_buffer.SerializableBuffer import io.airbyte.cdk.integrations.destination.record_buffer.SerializedBufferingStrategy +import io.airbyte.cdk.integrations.destination.s3.SerializedBufferFactory.Companion.getCreateFunction +import io.airbyte.commons.exceptions.ConfigErrorException import io.airbyte.commons.json.Jsons import io.airbyte.protocol.models.v0.* import io.github.oshai.kotlinlogging.KotlinLogging @@ -57,11 +64,11 @@ class S3ConsumerFactory { "Preparing bucket in destination started for ${writeConfigs.size} streams" } for (writeConfig in writeConfigs) { - if (writeConfig.syncMode == DestinationSyncMode.OVERWRITE) { - val namespace = writeConfig.namespace - val stream = writeConfig.streamName - val outputBucketPath = writeConfig.outputBucketPath - val pathFormat = writeConfig.pathFormat + val namespace = writeConfig.namespace + val stream = writeConfig.streamName + val outputBucketPath = writeConfig.outputBucketPath + val pathFormat = writeConfig.pathFormat + if (mustCleanUpExistingObjects(writeConfig, storageOperations)) { LOGGER.info { "Clearing storage area in destination started for namespace $namespace " + "stream $stream bucketObject $outputBucketPath pathFormat $pathFormat" @@ -75,6 +82,11 @@ class S3ConsumerFactory { LOGGER.info { "Clearing storage area in destination completed for namespace $namespace stream $stream bucketObject $outputBucketPath" } + } else { + LOGGER.info { + "Skipping clearing of storage area in destination for namespace $namespace " + + "stream $stream bucketObject $outputBucketPath pathFormat $pathFormat" + } } } LOGGER.info { "Preparing storage area in destination completed." 
} @@ -84,7 +96,7 @@ class S3ConsumerFactory { private fun flushBufferFunction( storageOperations: BlobStorageOperations, writeConfigs: List, - catalog: ConfiguredAirbyteCatalog? + catalog: ConfiguredAirbyteCatalog ): FlushBufferFunction { val pairToWriteConfig = writeConfigs.associateBy { toNameNamespacePair(it) } @@ -96,8 +108,9 @@ class S3ConsumerFactory { } require(pairToWriteConfig.containsKey(pair)) { String.format( - "Message contained record from a stream %s that was not in the catalog. \ncatalog: %s", - pair, + "Message contained record from a stream [namespace=\"%s\", name=\"%s\"] that was not in the catalog. \ncatalog: %s", + pair.namespace, + pair.name, Jsons.serialize(catalog) ) } @@ -110,7 +123,8 @@ class S3ConsumerFactory { storageOperations.uploadRecordsToBucket( writer, writeConfig.namespace, - writeConfig.fullOutputPath + writeConfig.fullOutputPath, + writeConfig.generationId )!! ) } @@ -140,6 +154,98 @@ class S3ConsumerFactory { } } + fun createAsync( + outputRecordCollector: Consumer, + storageOps: S3StorageOperations, + s3Config: S3DestinationConfig, + catalog: ConfiguredAirbyteCatalog + ): SerializedAirbyteMessageConsumer { + val writeConfigs = createWriteConfigs(storageOps, s3Config, catalog) + // Buffer creation function: yields a file buffer that converts + // incoming data to the correct format for the destination. + val createFunction = + getCreateFunction( + s3Config, + Function { fileExtension: String -> + FileBuffer(fileExtension) + } + ) + return AsyncStreamConsumer( + outputRecordCollector, + onStartFunction(storageOps, writeConfigs), + onCloseFunction(storageOps, writeConfigs), + S3DestinationFlushFunction( + // Ensure the file buffer is always larger than the memory buffer, + // as the file buffer will be flushed at the end of the memory flush. + optimalBatchSizeBytes = (FileBuffer.MAX_PER_STREAM_BUFFER_SIZE_BYTES * 0.9).toLong() + ) { + // Yield a new BufferingStrategy every time we flush (for thread-safety). 
+ SerializedBufferingStrategy( + createFunction, + catalog, + flushBufferFunction(storageOps, writeConfigs, catalog) + ) + }, + catalog, + // S3 has no concept of a default namespace. + // In the "namespace from destination" case, the namespace + // is simply omitted from the path. + BufferManager(defaultNamespace = null) + ) + } + + private fun mustCleanUpExistingObjects( + writeConfig: WriteConfig, + storageOperations: BlobStorageOperations + ): Boolean { + return when (writeConfig.minimumGenerationId) { + // Additional safety check that this really is OVERWRITE + // mode; it avoids bad outcomes such as deleting all objects + // in APPEND mode. + 0L -> writeConfig.syncMode == DestinationSyncMode.OVERWRITE + writeConfig.generationId -> { + // This is a truncate sync; try to determine whether the current generation's + // data is already present + val namespace = writeConfig.namespace + val stream = writeConfig.streamName + val outputBucketPath = writeConfig.outputBucketPath + val pathFormat = writeConfig.pathFormat + // If the generationId is missing, assume the last sync ran in non-resumable refresh + // mode + // and clean up the files + val currentGenerationId = + storageOperations.getStageGeneration( + namespace, + stream, + outputBucketPath, + pathFormat + ) + if (currentGenerationId == null) { + LOGGER.info { + "Missing generationId from the lastModified object, proceeding with cleanup for stream ${writeConfig.streamName}" + } + return true + } + // If minGen == gen == retrievedGen, skip the cleanup + val hasDataFromCurrentGeneration = currentGenerationId == writeConfig.generationId + if (hasDataFromCurrentGeneration) { + LOGGER.info { + "Preserving data from previous sync for stream ${writeConfig.streamName} since it matches the current generation ${writeConfig.generationId}" + } + } else { + LOGGER.info { + "No data exists from previous sync for stream ${writeConfig.streamName} from current generation ${writeConfig.generationId}, " + + "proceeding to clean 
up existing data" + } + } + return !hasDataFromCurrentGeneration + } + else -> { + throw IllegalArgumentException("Hybrid refreshes are not yet supported.") + } + } + } + companion object { private val SYNC_DATETIME: DateTime = DateTime.now(DateTimeZone.UTC) @@ -161,6 +267,11 @@ class S3ConsumerFactory { stream.destinationSyncMode, "Undefined destination sync mode" ) + if (stream.generationId == null || stream.minimumGenerationId == null) { + throw ConfigErrorException( + "You must upgrade your platform version to use this connector version. Either downgrade your connector or upgrade platform to 0.63.7" + ) + } val abStream = stream.stream val namespace: String? = abStream.namespace val streamName = abStream.name @@ -181,7 +292,9 @@ class S3ConsumerFactory { bucketPath!!, customOutputFormat, fullOutputPath!!, - syncMode + syncMode, + stream.generationId, + stream.minimumGenerationId ) LOGGER.info { "Write config: $writeConfig" } writeConfig diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3DestinationFlushFunction.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3DestinationFlushFunction.kt new file mode 100644 index 000000000000..2a25bf346960 --- /dev/null +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3DestinationFlushFunction.kt @@ -0,0 +1,54 @@ +/* + * Copyright (c) 2024 Airbyte, Inc., all rights reserved. 
+ */ + +package io.airbyte.cdk.integrations.destination.s3 + +import io.airbyte.cdk.integrations.destination.async.function.DestinationFlushFunction +import io.airbyte.cdk.integrations.destination.async.model.PartialAirbyteMessage +import io.airbyte.cdk.integrations.destination.record_buffer.BufferingStrategy +import io.airbyte.commons.json.Jsons +import io.airbyte.protocol.models.v0.AirbyteMessage +import io.airbyte.protocol.models.v0.AirbyteRecordMessage +import io.airbyte.protocol.models.v0.AirbyteRecordMessageMeta +import io.airbyte.protocol.models.v0.AirbyteStreamNameNamespacePair +import io.airbyte.protocol.models.v0.StreamDescriptor +import java.util.stream.Stream + +class S3DestinationFlushFunction( + override val optimalBatchSizeBytes: Long, + private val strategyProvider: () -> BufferingStrategy +) : DestinationFlushFunction { + + override fun flush(streamDescriptor: StreamDescriptor, stream: Stream) { + val nameAndNamespace = + AirbyteStreamNameNamespacePair(streamDescriptor.name, streamDescriptor.namespace) + strategyProvider().use { strategy -> + for (partialMessage in stream) { + val partialRecord = partialMessage.record!! + val data = + /** + * This should always be null, but if something changes upstream to trigger a clone + * of the record, then `null` becomes `JsonNull` and `data == null` goes from `true` + * to `false` + */ + if (partialRecord.data == null || partialRecord.data!!.isNull) { + Jsons.deserialize(partialMessage.serialized) + } else { + partialRecord.data + } + val completeRecord = + AirbyteRecordMessage() + .withEmittedAt(partialRecord.emittedAt) + .withMeta(partialRecord.meta ?: AirbyteRecordMessageMeta()) + .withNamespace(partialRecord.namespace) + .withStream(partialRecord.stream!!) 
+ .withData(data) + val completeMessage = + AirbyteMessage().withType(AirbyteMessage.Type.RECORD).withRecord(completeRecord) + strategy.addRecord(nameAndNamespace, completeMessage) + } + strategy.flushSingleStream(nameAndNamespace) + } + } +} diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3StorageOperations.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3StorageOperations.kt index 07b3d83db9b7..6bd55554688b 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3StorageOperations.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/S3StorageOperations.kt @@ -29,6 +29,7 @@ import java.util.concurrent.ConcurrentHashMap import java.util.concurrent.ConcurrentMap import java.util.concurrent.atomic.AtomicInteger import java.util.regex.Pattern +import kotlin.Comparator import org.apache.commons.io.FilenameUtils import org.joda.time.DateTime @@ -113,7 +114,8 @@ open class S3StorageOperations( override fun uploadRecordsToBucket( recordsData: SerializableBuffer, namespace: String?, - objectPath: String + objectPath: String, + generationId: Long, ): String { val exceptionsThrown: MutableList = ArrayList() while (exceptionsThrown.size < UPLOAD_RETRY_LIMIT) { @@ -126,7 +128,7 @@ open class S3StorageOperations( } try { - val fileName: String = loadDataIntoBucket(objectPath, recordsData) + val fileName: String = loadDataIntoBucket(objectPath, recordsData, generationId) logger.info { "Successfully loaded records to stage $objectPath with ${exceptionsThrown.size} re-attempt(s)" } @@ -164,7 +166,11 @@ open class S3StorageOperations( * */ @Throws(IOException::class) - private fun loadDataIntoBucket(objectPath: String, recordsData: SerializableBuffer): String { + private fun loadDataIntoBucket( + objectPath: String, + recordsData: 
SerializableBuffer, + generationId: Long + ): String { val partSize: Long = DEFAULT_PART_SIZE.toLong() val bucket: String? = s3Config.bucketName val partId: String = getPartId(objectPath) @@ -187,6 +193,9 @@ open class S3StorageOperations( for (blobDecorator: BlobDecorator in blobDecorators) { blobDecorator.updateMetadata(metadata, getMetadataMapping()) } + // Note: when viewed on the S3 object, the metadata key is prefixed with x-amz-meta-; + // when retrieving it, the SDK takes care of removing the prefix + metadata[GENERATION_ID_USER_META_KEY] = generationId.toString() val uploadManager: StreamTransferManager = StreamTransferManagerFactory.create( bucket, @@ -286,14 +295,9 @@ open class S3StorageOperations( cleanUpBucketObject(objectPath, listOf()) } - override fun cleanUpBucketObject( - namespace: String?, - streamName: String, - objectPath: String, - pathFormat: String - ) { - val bucket: String? = s3Config.bucketName - var objects: ObjectListing = + private fun listObjects(objectPath: String): ObjectListing { + val bucket: String = s3Config.bucketName!! + val objects: ObjectListing = s3Client.listObjects( ListObjectsRequest() .withBucketName(bucket) @@ -303,6 +307,17 @@ open class S3StorageOperations( // so we need to recursively list them and filter files matching the pathFormat .withDelimiter(""), ) + return objects + } + + override fun cleanUpBucketObject( + namespace: String?, + streamName: String, + objectPath: String, + pathFormat: String + ) { + val bucket: String = s3Config.bucketName!! + var objects: ObjectListing = listObjects(objectPath) val regexFormat: Pattern = Pattern.compile(getRegexFormat(namespace, streamName, pathFormat)) while (objects.objectSummaries.size > 0) { @@ -333,6 +348,67 @@ open class S3StorageOperations( } } + override fun getStageGeneration( + namespace: String?, + streamName: String, + objectPath: String, + pathFormat: String + ): Long? { + val bucket: String = s3Config.bucketName!! 
+ var objects: ObjectListing = listObjects(objectPath) + val regexFormat: Pattern = + Pattern.compile(getRegexFormat(namespace, streamName, pathFormat)) + val descendingComparator: Comparator = + Comparator.comparingLong { o: S3ObjectSummary -> o.lastModified.time }.reversed() + var lastModifiedObject: S3ObjectSummary? = null + + // We could be retrieving multiple pages of results based on when the last sync ran spanning + // across multiple + // date boundaries of object path format patterns. + // Maintaining a local maxima across pages and sorting at the end to get global maxima + // of last modified object to retrieve the object metadata header. + // Note: This logic will fall apart if the path format is changed between syncs + while (objects.objectSummaries.size > 0) { + val matchedObjects = + objects.objectSummaries + .filter { obj: S3ObjectSummary -> regexFormat.matcher(obj.key).matches() } + .sortedWith(descendingComparator) + if (matchedObjects.isNotEmpty()) { + val localMaximaLastModified: S3ObjectSummary = matchedObjects.first() + if ( + lastModifiedObject == null || + descendingComparator.compare(lastModifiedObject, localMaximaLastModified) > + 0 + ) { + lastModifiedObject = localMaximaLastModified + } + } + if (objects.isTruncated) { + objects = s3Client.listNextBatchOfObjects(objects) + } else { + break + } + } + if (lastModifiedObject == null) { + // Nothing to retrieve, fallback to null genId behavior + return null + } + // val lastModifiedObject = maxLastModifiedObjects.sortedWith(descendingComparator).first() + val objectMetadata = s3Client.getObjectMetadata(bucket, lastModifiedObject.key) + try { + val generationId = objectMetadata.getUserMetaDataOf(GENERATION_ID_USER_META_KEY) + if (!generationId.isNullOrBlank()) { + return generationId.toLong() + } + } catch (nfe: NumberFormatException) { + logger.warn { + "$GENERATION_ID_USER_META_KEY object metadata found in object ${lastModifiedObject.key} is not a number" + } + } + // If genId is missing or 
not parseable we return null + return null + } + fun getRegexFormat(namespace: String?, streamName: String, pathFormat: String): String { val namespaceStr: String = nameTransformer.getNamespace(namespace ?: "") val streamNameStr: String = nameTransformer.getIdentifier(streamName) @@ -433,6 +509,7 @@ open class S3StorageOperations( private const val FORMAT_VARIABLE_EPOCH: String = "\${EPOCH}" private const val FORMAT_VARIABLE_UUID: String = "\${UUID}" private const val GZ_FILE_EXTENSION: String = "gz" + private const val GENERATION_ID_USER_META_KEY = "ab-generation-id" @VisibleForTesting @JvmStatic fun getFilename(fullPath: String): String { diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/SerializedBufferFactory.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/SerializedBufferFactory.kt index 5a0def51c336..ab33c2c75e32 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/SerializedBufferFactory.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/SerializedBufferFactory.kt @@ -58,7 +58,7 @@ class SerializedBufferFactory { } return AvroSerializedBuffer.createFunction( formatConfig as UploadAvroFormatConfig, - createStorageFunctionWithExtension, + createStorageFunctionWithExtension ) } FileUploadFormat.CSV -> { @@ -69,7 +69,7 @@ class SerializedBufferFactory { } return CsvSerializedBuffer.createFunction( formatConfig as UploadCsvFormatConfig, - createStorageFunctionWithExtension, + createStorageFunctionWithExtension ) } FileUploadFormat.JSONL -> { diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/WriteConfig.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/WriteConfig.kt index 
c0170e24a8de..d9124831e2a2 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/WriteConfig.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/WriteConfig.kt @@ -15,6 +15,8 @@ constructor( val pathFormat: String, val fullOutputPath: String, val syncMode: DestinationSyncMode, + val generationId: Long, + val minimumGenerationId: Long, val storedFiles: MutableList = arrayListOf(), ) { diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/csv/CsvSerializedBuffer.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/csv/CsvSerializedBuffer.kt index ae8939cee23d..7c295c0c1ac0 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/csv/CsvSerializedBuffer.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/s3/csv/CsvSerializedBuffer.kt @@ -115,7 +115,7 @@ class CsvSerializedBuffer( @Suppress("DEPRECATION") fun createFunction( config: UploadCsvFormatConfig?, - createStorageFunction: Callable + createStorageFunction: Callable, ): BufferCreateFunction { return BufferCreateFunction { stream: AirbyteStreamNameNamespacePair, diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/test/kotlin/io/airbyte/cdk/integrations/destination/s3/csv/CsvSerializedBufferTest.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/test/kotlin/io/airbyte/cdk/integrations/destination/s3/csv/CsvSerializedBufferTest.kt index 14de224b3cd8..7ccf1375d961 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/test/kotlin/io/airbyte/cdk/integrations/destination/s3/csv/CsvSerializedBufferTest.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/test/kotlin/io/airbyte/cdk/integrations/destination/s3/csv/CsvSerializedBufferTest.kt 
@@ -166,6 +166,7 @@ class CsvSerializedBufferTest { expectedData: String ) { val outputFile = buffer.file + val defaultNamespace = "" (CsvSerializedBuffer.createFunction(config) { buffer } .apply( streamPair, diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/destination/s3/S3AvroParquetDestinationAcceptanceTest.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/destination/s3/S3AvroParquetDestinationAcceptanceTest.kt index 94a8a5960b45..8616b35c4cfe 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/destination/s3/S3AvroParquetDestinationAcceptanceTest.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/destination/s3/S3AvroParquetDestinationAcceptanceTest.kt @@ -33,6 +33,9 @@ protected constructor(fileUploadFormat: FileUploadFormat) : val config = this.getConfig() val defaultSchema = getDefaultSchema(config) val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog) + configuredCatalog.streams.forEach { + it.withSyncId(42).withGenerationId(12).withMinimumGenerationId(12) + } runSyncAndVerifyStateOutput(config, messages, configuredCatalog, false) for (stream in catalog.streams) { diff --git a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/destination/s3/S3DestinationAcceptanceTest.kt b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/destination/s3/S3DestinationAcceptanceTest.kt index c9bb00f694bf..b2d855da5f8b 100644 --- a/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/destination/s3/S3DestinationAcceptanceTest.kt +++ b/airbyte-cdk/java/airbyte-cdk/s3-destinations/src/testFixtures/kotlin/io/airbyte/cdk/integrations/destination/s3/S3DestinationAcceptanceTest.kt @@ -9,20 +9,39 @@ 
 import com.amazonaws.services.s3.model.S3ObjectSummary
 import com.fasterxml.jackson.databind.JsonNode
 import com.fasterxml.jackson.databind.ObjectMapper
 import com.fasterxml.jackson.databind.node.ObjectNode
+import com.google.common.collect.ImmutableMap
 import io.airbyte.cdk.integrations.destination.NamingConventionTransformer
 import io.airbyte.cdk.integrations.destination.s3.util.S3NameTransformer
 import io.airbyte.cdk.integrations.standardtest.destination.DestinationAcceptanceTest
+import io.airbyte.cdk.integrations.standardtest.destination.argproviders.DataArgumentsProvider
 import io.airbyte.cdk.integrations.standardtest.destination.comparator.AdvancedTestDataComparator
 import io.airbyte.cdk.integrations.standardtest.destination.comparator.TestDataComparator
 import io.airbyte.commons.io.IOs
 import io.airbyte.commons.jackson.MoreMappers
 import io.airbyte.commons.json.Jsons
+import io.airbyte.commons.resources.MoreResources
+import io.airbyte.protocol.models.v0.AirbyteCatalog
+import io.airbyte.protocol.models.v0.AirbyteMessage
+import io.airbyte.protocol.models.v0.AirbyteRecordMessage
+import io.airbyte.protocol.models.v0.AirbyteStateMessage
+import io.airbyte.protocol.models.v0.AirbyteStreamStatusTraceMessage
+import io.airbyte.protocol.models.v0.AirbyteTraceMessage
+import io.airbyte.protocol.models.v0.CatalogHelpers
+import io.airbyte.protocol.models.v0.ConfiguredAirbyteCatalog
+import io.airbyte.protocol.models.v0.DestinationSyncMode
+import io.airbyte.protocol.models.v0.StreamDescriptor
+import io.airbyte.protocol.models.v0.SyncMode
+import io.airbyte.workers.exception.TestHarnessException
 import io.github.oshai.kotlinlogging.KotlinLogging
 import java.nio.file.Path
+import java.time.Instant
 import java.util.*
 import org.apache.commons.lang3.RandomStringUtils
 import org.joda.time.DateTime
 import org.joda.time.DateTimeZone
+import org.junit.jupiter.api.Assumptions.*
+import org.junit.jupiter.api.Test
+import org.junit.jupiter.api.fail
 import org.mockito.Mockito.mock
 private val LOGGER = KotlinLogging.logger {}
@@ -37,7 +56,8 @@ private val LOGGER = KotlinLogging.logger {}
  * * Get the format config from [.getFormatConfig]
  */
 abstract class S3DestinationAcceptanceTest
-protected constructor(protected val outputFormat: FileUploadFormat) : DestinationAcceptanceTest() {
+protected constructor(protected val outputFormat: FileUploadFormat) :
+    DestinationAcceptanceTest(verifyIndividualStateAndCounts = true) {
     protected val secretFilePath: String = "secrets/config.json"
     protected var configJson: JsonNode? = null
     protected var s3DestinationConfig: S3DestinationConfig = mock()
@@ -92,12 +112,13 @@ protected constructor(protected val outputFormat: FileUploadFormat) : Destinatio
                 .filter { o: S3ObjectSummary -> o.key.contains("$streamNameStr/") }
                 .sortedWith(Comparator.comparingLong { o: S3ObjectSummary -> o.lastModified.time })
-        LOGGER.info(
-            "All objects: {}",
-            objectSummaries.map { o: S3ObjectSummary ->
-                String.format("%s/%s", o.bucketName, o.key)
-            },
-        )
+        LOGGER.info {
+            "${"All objects: {}"} ${
+                objectSummaries.map { o: S3ObjectSummary ->
+                    String.format("%s/%s", o.bucketName, o.key)
+                }
+            }"
+        }
         return objectSummaries
     }
@@ -129,11 +150,9 @@ protected constructor(protected val outputFormat: FileUploadFormat) : Destinatio
                 storageProvider(),
                 getConnectorEnv()
             )
-        LOGGER.info(
-            "Test full path: {}/{}",
-            s3DestinationConfig.bucketName,
-            s3DestinationConfig.bucketPath,
-        )
+        LOGGER.info {
+            "${"Test full path: {}/{}"} ${s3DestinationConfig.bucketName} ${s3DestinationConfig.bucketPath}"
+        }
         this.s3Client = s3DestinationConfig.getS3Client()
         this.s3nameTransformer = S3NameTransformer()
@@ -153,16 +172,14 @@ protected constructor(protected val outputFormat: FileUploadFormat) : Destinatio
         }
         if (keysToDelete.size > 0) {
-            LOGGER.info(
-                "Tearing down test bucket path: {}/{}",
-                s3DestinationConfig.bucketName,
-                s3DestinationConfig.bucketPath,
-            )
+            LOGGER.info {
+                "${"Tearing down test bucket path: {}/{}"} ${s3DestinationConfig.bucketName} ${s3DestinationConfig.bucketPath}"
+            }
             val result =
                 s3Client!!.deleteObjects(
                     DeleteObjectsRequest(s3DestinationConfig.bucketName).withKeys(keysToDelete),
                 )
-            LOGGER.info("Deleted {} file(s).", result.deletedObjects.size)
+            LOGGER.info { "${"Deleted {} file(s)."} ${result.deletedObjects.size}" }
         }
     }
@@ -184,6 +201,416 @@ protected constructor(protected val outputFormat: FileUploadFormat) : Destinatio
         return StorageProvider.AWS_S3
     }
+    private fun getTestCatalog(
+        syncMode: SyncMode,
+        destinationSyncMode: DestinationSyncMode,
+        syncId: Long?,
+        minimumGenerationId: Long?,
+        generationId: Long?
+    ): Pair {
+        val catalog =
+            Jsons.deserialize(
+                MoreResources.readResource(
+                    DataArgumentsProvider.Companion.EXCHANGE_RATE_CONFIG.getCatalogFileVersion(
+                        getProtocolVersion()
+                    )
+                ),
+                AirbyteCatalog::class.java
+            )
+        val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog)
+        configuredCatalog.streams.forEach {
+            it.withSyncMode(syncMode)
+                .withDestinationSyncMode(destinationSyncMode)
+                .withSyncId(syncId)
+                .withGenerationId(generationId)
+                .withMinimumGenerationId(minimumGenerationId)
+        }
+        return Pair(configuredCatalog, catalog)
+    }
+
+    private fun getFirstSyncMessagesFixture1(
+        configuredCatalog: ConfiguredAirbyteCatalog,
+        streamStatus: AirbyteStreamStatusTraceMessage.AirbyteStreamStatus
+    ): List {
+        val descriptor = StreamDescriptor().withName(configuredCatalog.streams[0].stream.name)
+        return listOf(
+            AirbyteMessage()
+                .withType(AirbyteMessage.Type.RECORD)
+                .withRecord(
+                    AirbyteRecordMessage()
+                        .withStream(configuredCatalog.streams[0].stream.name)
+                        .withEmittedAt(Instant.now().toEpochMilli())
+                        .withData(
+                            Jsons.jsonNode(
+                                ImmutableMap.builder()
+                                    .put("id", 1)
+                                    .put("currency", "USD")
+                                    .put("date", "2020-03-31T00:00:00Z")
+                                    .put("HKD", 10.1)
+                                    .put("NZD", 700.1)
+                                    .build(),
+                            ),
+                        ),
+                ),
+            AirbyteMessage()
+                .withType(AirbyteMessage.Type.STATE)
+                .withState(
+                    AirbyteStateMessage()
+                        .withData(Jsons.jsonNode(ImmutableMap.of("checkpoint", 2))),
+                ),
+            AirbyteMessage()
+                .withType(AirbyteMessage.Type.TRACE)
+                .withTrace(
+                    AirbyteTraceMessage()
+                        .withType(AirbyteTraceMessage.Type.STREAM_STATUS)
+                        .withStreamStatus(
+                            AirbyteStreamStatusTraceMessage()
+                                .withStreamDescriptor(descriptor)
+                                .withStatus(streamStatus)
+                        ),
+                ),
+        )
+    }
+
+    private fun getSyncMessagesFixture2(): List {
+        return MoreResources.readResource(
+                DataArgumentsProvider.Companion.EXCHANGE_RATE_CONFIG.getMessageFileVersion(
+                    getProtocolVersion(),
+                ),
+            )
+            .trim()
+            .lines()
+            .map { Jsons.deserialize(it, AirbyteMessage::class.java) }
+    }
+
+    /**
+     * Test 2 runs before refreshes support and after refreshes support in OVERWRITE mode. Verifies
+     * we clean up after ourselves correctly.
+     */
+    @Test
+    fun testOverwriteSyncPreRefreshAndPostSupport() {
+        assumeTrue(
+            implementsOverwrite(),
+            "Destination's spec.json does not support overwrite sync mode."
+        )
+
+        // Run sync with OLD version connector
+        val catalogPair =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.OVERWRITE, 42, null, null)
+        val config = getConfig()
+        val firstSyncMessages =
+            getFirstSyncMessagesFixture1(
+                catalogPair.first,
+                AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE,
+            )
+        // Old connector doesn't have destinationStats so skip checking that.
+        runSyncAndVerifyStateOutput(
+            config,
+            firstSyncMessages,
+            catalogPair.first,
+            runNormalization = false,
+            "airbyte/destination-s3:0.6.4",
+            verifyIndividualStateAndCounts = false,
+        )
+
+        // This simulates first sync after enabling generationId in connector. null -> 1.
+        // legend has it that platform always increments to 1 and sends min and gen id as 1.
+        val catalogPair2 =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.OVERWRITE, 43, 1, 1)
+
+        // Run and verify only second sync messages are present.
+        val secondSyncMessages = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, secondSyncMessages, catalogPair2.first, false)
+
+        val defaultSchema = getDefaultSchema(config)
+        retrieveRawRecordsAndAssertSameMessages(
+            catalogPair2.second,
+            secondSyncMessages,
+            defaultSchema
+        )
+    }
+
+    /**
+     * This test is an impractical case. Running twice in APPEND mode with incrementing
+     * generationIds, switching to OVERWRITE mode without incrementing generationId This verifies
+     * that the previous data (including old generations data) is preserved. We don't know if the
+     * old data is synced in which mode this uses generationId as source of truth to NOT touch
+     * existing data.
+     */
+    @Test
+    fun testSwitchingModesSyncWithPreviousData() {
+        assumeTrue(
+            implementsOverwrite(),
+            "Destination's spec.json does not support overwrite sync mode."
+        )
+
+        // Run sync with some messages and send incomplete status.
+        // This is to simulate crash
+        val catalogPair =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.APPEND, 42, 0, 1)
+        val config = getConfig()
+        val firstSyncMessages = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, firstSyncMessages, catalogPair.first, false)
+
+        // Run second sync, even though the previous one was incomplete, intentionally incrementing
+        // genId and minGenId
+        // to test erratic behavior and we don't accidentally clean up stuff.
+        val catalogPair2 =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.APPEND, 43, 0, 2)
+
+        // Run and verify only second sync messages are present.
+        val secondSyncMessages = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, secondSyncMessages, catalogPair2.first, false)
+
+        // Run third sync.
+        val catalogPair3 =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.OVERWRITE, 44, 2, 2)
+
+        // Run and verify only second sync messages are present.
+        val thirdSyncMessages = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, thirdSyncMessages, catalogPair3.first, false)
+
+        val defaultSchema = getDefaultSchema(config)
+        retrieveRawRecordsAndAssertSameMessages(
+            catalogPair3.second,
+            firstSyncMessages + secondSyncMessages + thirdSyncMessages,
+            defaultSchema
+        )
+    }
+
+    /** Test runs 2 successful overwrite syncs and verifies last sync is preserved */
+    @Test
+    fun testOverwriteSyncSubsequentGenerations() {
+        assumeTrue(
+            implementsOverwrite(),
+            "Destination's spec.json does not support overwrite sync mode."
+        )
+
+        // Run sync with some messages
+        val catalogPair =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.OVERWRITE, 42, 12, 12)
+        val config = getConfig()
+        val firstSyncMessages =
+            getFirstSyncMessagesFixture1(
+                catalogPair.first,
+                AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE,
+            )
+        runSyncAndVerifyStateOutput(config, firstSyncMessages, catalogPair.first, false)
+
+        // Change the generationId, we always assume platform sends a monotonically increasing
+        // number
+        val catalogPair2 =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.OVERWRITE, 43, 13, 13)
+
+        // Run and verify only second sync messages are present.
+        val secondSyncMessages = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, secondSyncMessages, catalogPair2.first, false)
+
+        val defaultSchema = getDefaultSchema(config)
+        retrieveRawRecordsAndAssertSameMessages(
+            catalogPair2.second,
+            secondSyncMessages,
+            defaultSchema
+        )
+    }
+
+    /**
+     * Test runs 1 failed and 1 successful OVERWRITE sync of same generation. Verified data from
+     * both syncs are preserved.
+     */
+    @Test
+    fun testOverwriteSyncFailedResumedGeneration() {
+        assumeTrue(
+            implementsOverwrite(),
+            "Destination's spec.json does not support overwrite sync mode."
+        )
+        val config = getConfig()
+
+        // Run sync with some messages and incomplete stream status
+        val catalogPair =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.OVERWRITE, 42, 12, 12)
+        val firstSyncMessages: List =
+            getFirstSyncMessagesFixture1(
+                catalogPair.first,
+                AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.INCOMPLETE
+            )
+        try {
+            runSyncAndVerifyStateOutput(config, firstSyncMessages, catalogPair.first, false)
+            fail { "Should not succeed the sync when Trace message is INCOMPLETE" }
+        } catch (_: TestHarnessException) {}
+
+        // Run second sync with the same messages from the previous failed sync.
+        val secondSyncMessages = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, secondSyncMessages, catalogPair.first, false)
+
+        // verify records are preserved from first failed sync + second sync.
+        val defaultSchema = getDefaultSchema(config)
+        retrieveRawRecordsAndAssertSameMessages(
+            catalogPair.second,
+            firstSyncMessages + secondSyncMessages,
+            defaultSchema
+        )
+    }
+
+    /**
+     * Test runs 2 successful OVERWRITE syncs but with same generation and a sync to another catalog
+     * with no generationId, this shouldn't happen from platform but acts as a simulation for
+     * failure of first sync. This verifies that data from both syncs are preserved and the
+     * unrelated catalog sync data is untouched too.
+     */
+    @Test
+    fun testOverwriteSyncWithGenerationId() {
+        assumeTrue(
+            implementsOverwrite(),
+            "Destination's spec.json does not support overwrite sync mode."
+        )
+
+        val config = getConfig()
+
+        // First Sync
+        val catalogPair =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.OVERWRITE, 42, 12, 12)
+        val firstSyncMessages: List =
+            getFirstSyncMessagesFixture1(
+                catalogPair.first,
+                AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE
+            )
+        runSyncAndVerifyStateOutput(config, firstSyncMessages, catalogPair.first, false)
+
+        // We need to make sure that other streams\tables\files in the same location will not be
+        // affected\deleted\overridden by our activities during first, second or any future sync.
+        // So let's create a dummy data that will be checked after all sync. It should remain the
+        // same
+        val dummyCatalogStream = "DummyStream"
+        val dummyCatalog =
+            Jsons.deserialize(
+                MoreResources.readResource(
+                    DataArgumentsProvider.Companion.EXCHANGE_RATE_CONFIG.getCatalogFileVersion(
+                        getProtocolVersion()
+                    )
+                ),
+                AirbyteCatalog::class.java
+            )
+        dummyCatalog.streams[0].name = dummyCatalogStream
+        val configuredDummyCatalog = CatalogHelpers.toDefaultConfiguredCatalog(dummyCatalog)
+        configuredDummyCatalog.streams.forEach {
+            it.withSyncId(42).withGenerationId(20).withMinimumGenerationId(20)
+        }
+        // update messages to set new dummy stream name
+        firstSyncMessages
+            .filter { message: AirbyteMessage -> message.record != null }
+            .forEach { message: AirbyteMessage -> message.record.stream = dummyCatalogStream }
+        firstSyncMessages
+            .filter { message: AirbyteMessage -> message.type == AirbyteMessage.Type.TRACE }
+            .forEach { message: AirbyteMessage ->
+                message.trace.streamStatus.streamDescriptor.name = dummyCatalogStream
+            }
+        // sync dummy data
+        runSyncAndVerifyStateOutput(config, firstSyncMessages, configuredDummyCatalog, false)
+
+        // Run second sync
+        val secondSyncMessages: List = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, secondSyncMessages, catalogPair.first, false)
+
+        // Verify records of both syncs are preserved.
+        val defaultSchema = getDefaultSchema(config)
+        retrieveRawRecordsAndAssertSameMessages(
+            catalogPair.second,
+            firstSyncMessages + secondSyncMessages,
+            defaultSchema
+        )
+        // verify that other streams in the same location were not affected. If something fails
+        // here,
+        // then this need to be fixed in connectors logic to override only required streams
+        retrieveRawRecordsAndAssertSameMessages(dummyCatalog, firstSyncMessages, defaultSchema)
+    }
+
+    /**
+     * This test is similar to testIncrementalSync with adding generationId to the ConfiguredCatalog
+     * This verifies that the core behavior of APPEND mode sync is unaltered when the
+     * minimumGenerationId is set to 0
+     */
+    @Test
+    @Throws(Exception::class)
+    fun testIncrementalSyncWithGenerationId() {
+        assumeTrue(
+            implementsAppend(),
+            "Destination's spec.json does not include '\"supportsIncremental\" ; true'"
+        )
+
+        val catalogPair =
+            getTestCatalog(SyncMode.INCREMENTAL, DestinationSyncMode.APPEND, 42, 0, 12)
+        val config = getConfig()
+
+        // First sync
+        val firstSyncMessages: List =
+            getFirstSyncMessagesFixture1(
+                catalogPair.first,
+                AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE,
+            )
+        runSyncAndVerifyStateOutput(config, firstSyncMessages, catalogPair.first, false)
+
+        // Second sync
+        val secondSyncMessages: List = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, secondSyncMessages, catalogPair.first, false)
+
+        // Verify records
+        val defaultSchema = getDefaultSchema(config)
+        retrieveRawRecordsAndAssertSameMessages(
+            catalogPair.second,
+            firstSyncMessages + secondSyncMessages,
+            defaultSchema
+        )
+    }
+
+    /**
+     * Test 2 runs before refreshes support and after refreshes support in APPEND mode. Verifies we
+     * don't accidentally delete any data when generationId is encountered.
+     */
+    @Test
+    fun testAppendSyncPreRefreshAndPostSupport() {
+        assumeTrue(
+            implementsOverwrite(),
+            "Destination's spec.json does not support overwrite sync mode."
+        )
+
+        // Run sync with some messages
+        val catalogPair =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.APPEND, 42, null, null)
+        val config = getConfig()
+        val firstSyncMessages =
+            getFirstSyncMessagesFixture1(
+                catalogPair.first,
+                AirbyteStreamStatusTraceMessage.AirbyteStreamStatus.COMPLETE,
+            )
+        // Old connector doesn't have destinationStats so skip checking that.
+        runSyncAndVerifyStateOutput(
+            config,
+            firstSyncMessages,
+            catalogPair.first,
+            runNormalization = false,
+            "airbyte/destination-s3:0.6.4",
+            verifyIndividualStateAndCounts = false,
+        )
+
+        // This simulates first sync after enabling generationId in connector. null -> 0.
+        // Apparently we encountered a behavior where for APPEND mode min and genID are not
+        // incremented and sent as 0
+        val catalogPair2 =
+            getTestCatalog(SyncMode.FULL_REFRESH, DestinationSyncMode.APPEND, 43, 0, 0)
+
+        // Run and verify only second sync messages are present.
+        val secondSyncMessages = getSyncMessagesFixture2()
+        runSyncAndVerifyStateOutput(config, secondSyncMessages, catalogPair2.first, false)
+
+        val defaultSchema = getDefaultSchema(config)
+        retrieveRawRecordsAndAssertSameMessages(
+            catalogPair2.second,
+            firstSyncMessages + secondSyncMessages,
+            defaultSchema
+        )
+    }
+
     companion object {
         @JvmStatic
         protected val MAPPER: ObjectMapper = MoreMappers.initMapper()
diff --git a/airbyte-integrations/connectors/destination-s3/build.gradle b/airbyte-integrations/connectors/destination-s3/build.gradle
index 5a55de5842a1..dac6e37f252f 100644
--- a/airbyte-integrations/connectors/destination-s3/build.gradle
+++ b/airbyte-integrations/connectors/destination-s3/build.gradle
@@ -4,9 +4,9 @@ plugins {
 }
 airbyteJavaConnector {
-    cdkVersionRequired = '0.35.5'
+    cdkVersionRequired = '0.44.0'
     features = ['db-destinations', 's3-destinations']
-    useLocalCdk = false
+    useLocalCdk = true // TODO: Version CDK, bump required version, and set this to false
 }
 airbyteJavaConnector.addCdkDependencies()
diff --git a/airbyte-integrations/connectors/destination-s3/finalize_build.sh b/airbyte-integrations/connectors/destination-s3/finalize_build.sh
index f6684932e999..0734b4e26c61 100644
--- a/airbyte-integrations/connectors/destination-s3/finalize_build.sh
+++ b/airbyte-integrations/connectors/destination-s3/finalize_build.sh
@@ -4,7 +4,6 @@ set -e
 echo "Running destination-s3 docker custom steps..."
 ARCH=$(uname -m)
-
 if [ "$ARCH" == "x86_64" ] || [ "$ARCH" = "amd64" ]; then
   echo "$ARCH"
   yum install lzop lzo lzo-devel -y
diff --git a/airbyte-integrations/connectors/destination-s3/metadata.yaml b/airbyte-integrations/connectors/destination-s3/metadata.yaml
index 95f2a9f8f6fa..d089c267937d 100644
--- a/airbyte-integrations/connectors/destination-s3/metadata.yaml
+++ b/airbyte-integrations/connectors/destination-s3/metadata.yaml
@@ -2,7 +2,7 @@ data:
   connectorSubtype: file
   connectorType: destination
   definitionId: 4816b78f-1489-44c1-9060-4b19d5fa9362
-  dockerImageTag: 0.6.4
+  dockerImageTag: 0.6.5
   dockerRepository: airbyte/destination-s3
   githubIssueLabel: destination-s3
   icon: s3.svg
@@ -27,6 +27,7 @@ data:
     sl: 300
     ql: 300
   supportLevel: certified
+  supportsRefreshes: true
   connectorTestSuitesOptions:
     - suite: unitTests
     - suite: integrationTests
diff --git a/airbyte-integrations/connectors/destination-s3/src/test-integration/kotlin/io/airbyte/integrations/destination/s3/S3ParquetDestinationAcceptanceTest.kt b/airbyte-integrations/connectors/destination-s3/src/test-integration/kotlin/io/airbyte/integrations/destination/s3/S3ParquetDestinationAcceptanceTest.kt
index f699e247f015..a3c6170ba586 100644
--- a/airbyte-integrations/connectors/destination-s3/src/test-integration/kotlin/io/airbyte/integrations/destination/s3/S3ParquetDestinationAcceptanceTest.kt
+++ b/airbyte-integrations/connectors/destination-s3/src/test-integration/kotlin/io/airbyte/integrations/destination/s3/S3ParquetDestinationAcceptanceTest.kt
@@ -52,6 +52,9 @@ class S3ParquetDestinationAcceptanceTest : S3BaseParquetDestinationAcceptanceTes
                 AirbyteCatalog::class.java
             )
         val configuredCatalog = CatalogHelpers.toDefaultConfiguredCatalog(catalog)
+        configuredCatalog.streams.forEach {
+            it.withSyncId(42).withGenerationId(12).withMinimumGenerationId(12)
+        }
         val messages: List =
             readResource(
                 DataArgumentsProvider.EXCHANGE_RATE_CONFIG.getMessageFileVersion(
diff --git a/docs/integrations/destinations/s3.md b/docs/integrations/destinations/s3.md
index cc76a0f4f264..17c57c80ad58 100644
--- a/docs/integrations/destinations/s3.md
+++ b/docs/integrations/destinations/s3.md
@@ -513,7 +513,8 @@ To see connector limitations, or troubleshoot your S3 connector, see more [in ou
 Expand to review
 | Version | Date | Pull Request | Subject |
-| :------ |:-----------| :--------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------- |
+|:--------|:-----------|:-----------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------|
+| 0.6.5 | 2024-08-01 | [42405](https://github.com/airbytehq/airbyte/pull/42405) | S3 parallelizes workloads, checkpoints, submits counts, support for generationId in metadata for refreshes. |
 | 0.6.4 | 2024-04-16 | [42006](https://github.com/airbytehq/airbyte/pull/42006) | remove unnecessary zookeeper dependency |
 | 0.6.3 | 2024-04-15 | [38204](https://github.com/airbytehq/airbyte/pull/38204) | convert all production code to kotlin |
 | 0.6.2 | 2024-04-15 | [38204](https://github.com/airbytehq/airbyte/pull/38204) | add assume role auth |