jdbc-connectors: remove LEGACY state #33201

Closed
wants to merge 39 commits into from

subodh1810
Contributor

@subodh1810 subodh1810 commented Dec 6, 2023

Issue : #33290

@subodh1810 subodh1810 self-assigned this Dec 6, 2023

Contributor

github-actions bot commented Dec 6, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan.
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • You've updated the connector's metadata.yaml file with any other relevant changes, including a breakingChanges entry for major version bumps. See metadata.yaml docs
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • Migration guide updated in docs/integrations/<source or destination>/<name>-migrations.md with an entry for the new version, if the version is a breaking change. See migration guide example
  • If set, you've ensured the icon is present in the platform-internal repo. (Docs)

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

subodh1810 and others added 10 commits December 8, 2023 13:31
# Conflicts:
#	airbyte-integrations/connectors/source-postgres/src/test/java/io/airbyte/integrations/source/postgres/CdcPostgresSourceTest.java
#	airbyte-integrations/connectors/source-postgres/src/test/java/io/airbyte/integrations/source/postgres/PostgresSourceTest.java
Contributor

github-actions bot commented Dec 12, 2023

Coverage report for source-postgres

File Coverage [88.6%] 🍏
PostgresSource.java 88.6% 🍏
Total Project Coverage 71.69% 🍏

@subodh1810 subodh1810 requested a review from akashkulk December 12, 2023 19:48
Contributor

@postamar postamar left a comment

Sorry for taking a while to review this. There's a lot here! Kudos.

  1. The CDK changes LGTM. Note that there are comments I left in non-CDK files about things which could be in the CDK.
  2. This PR does other things besides removing LEGACY: it bumps the CDK dependency on all Java connectors and tries to get the connector tests to pass, which in some cases hasn't been done in more than a year.
  3. Given that, I suggest you limit this PR to CDK changes only and open separate PRs for each connector. We can then deal with each one independently and on our own schedule, in a way that doesn't involve disabling tests unless we have a good reason to (and then state that reason in the annotation). Those PRs will also be easier to review. I'll be happy to help you do this.

import org.junit.jupiter.api.Test;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;

@Disabled
Contributor

Is this deliberate? I don't understand why this test is disabled. This comment applies elsewhere in this PR as well.


}

}
Contributor

This works. Though, the point of these objects is that many instances of these can share one container. It might be worth adding a // TODO: implement shared container comment or something.
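One way the shared-container idea could look is sketched below in plain Java, without Testcontainers. The class SharedContainerRegistry and its getOrCreate method are hypothetical names, not part of the Airbyte CDK. The assumption is that instances asking for the same key should receive the same container object, created at most once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical registry, not an Airbyte CDK or Testcontainers API.
final class SharedContainerRegistry {

  // One entry per key (for example an image name); values are the shared containers.
  private static final Map<String, Object> CONTAINERS = new ConcurrentHashMap<>();

  private SharedContainerRegistry() {}

  // Returns the container registered under `key`, invoking `factory` only if none exists yet.
  @SuppressWarnings("unchecked")
  static <C> C getOrCreate(final String key, final Supplier<C> factory) {
    return (C) CONTAINERS.computeIfAbsent(key, k -> factory.get());
  }
}
```

In a real test database class the factory would build and start the container, and every instance created for the same image would call getOrCreate instead of constructing its own.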

@@ -57,6 +58,8 @@ public static void setup() {
}

@Test
@Disabled("Flaky on CI, See run https://github.com/airbytehq/airbyte/actions/runs/7126781640/job/19405426141?pr=33201 " +
"org.opentest4j.AssertionFailedError: Expected size between 964 and 985, but actual size was 991 ==> expected: <true> but was: <false>")
Contributor

Urgh, this again. These tests should be rewritten to use deterministic inputs.

Contributor

This is out of scope for this PR, naturally.
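As an illustration of what "deterministic inputs" could mean here, the sketch below generates test payloads from a fixed seed, so the total size is one exact value on every run rather than a range. This is an assumption about the kind of rewrite intended, not the connector's actual test; DeterministicRecords, SEED, generate and totalSize are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the Airbyte code base.
final class DeterministicRecords {

  // A fixed seed makes java.util.Random produce the same sequence on every run.
  static final long SEED = 42L;

  private DeterministicRecords() {}

  // Builds `count` lowercase strings whose lengths fall in [minLen, maxLen].
  static List<String> generate(final long seed, final int count, final int minLen, final int maxLen) {
    final Random random = new Random(seed);
    final List<String> records = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      final int length = minLen + random.nextInt(maxLen - minLen + 1);
      final StringBuilder sb = new StringBuilder(length);
      for (int j = 0; j < length; j++) {
        sb.append((char) ('a' + random.nextInt(26)));
      }
      records.add(sb.toString());
    }
    return records;
  }

  // Total size in characters; identical across runs when the seed is fixed.
  static int totalSize(final List<String> records) {
    return records.stream().mapToInt(String::length).sum();
  }
}
```

With inputs like these, a test can assert an exact expected size instead of a tolerance window.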


@Override
public String getDriverClassName() {
return SNOWFLAKE.getDriverClassName();
Contributor

Why SNOWFLAKE?


import org.testcontainers.containers.JdbcDatabaseContainer;

public class NonContainer extends JdbcDatabaseContainer<NonContainer> {
Contributor

Commentary required. Do I understand correctly that this is a shim for when a test relies on a shared instance somewhere rather than a testcontainer?

public String getDriverClass() {
return TiDBSource.DRIVER_CLASS;
public JsonNode config() {
return Jsons.clone(testdb.configBuilder().build());
Contributor

Cloning the return value of build is not required, it's a fresh JsonNode. This comment applies elsewhere in this PR as well.
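The reasoning can be shown with a plain-Java stand-in. It assumes, as the comment states, that build() returns a new object on every call; ConfigBuilderSketch is a hypothetical class that uses a Map in place of a Jackson JsonNode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a config builder; the real one returns a JsonNode.
final class ConfigBuilderSketch {

  private final Map<String, Object> entries = new LinkedHashMap<>();

  ConfigBuilderSketch with(final String key, final Object value) {
    entries.put(key, value);
    return this;
  }

  // Each call copies the accumulated entries into a brand-new map, so callers
  // never share mutable state with the builder or with each other.
  Map<String, Object> build() {
    return new LinkedHashMap<>(entries);
  }
}
```

Because each result is already independent, mutating one has no effect on the next build() call, and an extra defensive copy adds nothing.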

import java.util.stream.Stream;
import org.jooq.SQLDialect;

public class TeradataTestDatabase extends TestDatabase<NonContainer, TeradataTestDatabase, TeradataTestDatabase.TeradataDbConfigBuilder> {
Contributor

Consider defining a NonContainerTestDatabase class in the CDK which extends TestDatabase<NonContainer, NonContainerTestDatabase, NonContainerTestDatabase.NonContainerTestDatabaseConfigBuilder> and use that here and in other connectors which don't rely on testcontainers. Inject the username, password, sql dialect etc. as constructor args.

public Db2TestDatabase initialized() {
if (!containerStarted) {
container.start();
containerStarted = true;
Contributor

The .start() method is idempotent so containerStarted isn't needed. Plus, it's static for some reason.

Contributor Author

@postamar The start method is called from createTestDatabase within setup, which is executed before each unit test. I wanted to use the same container for all the unit tests rather than have the Docker container start and stop each time. To do that, I had to introduce containerStarted and make it static.

  @BeforeEach
  public void setup() throws Exception {
    customSetup();
    testdb = createTestDatabase();

Contributor

@postamar postamar Dec 13, 2023

I understand. The place to put the .start() call is in the @BeforeAll static method.
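The two points made in this thread, that start() is idempotent and that one-time start-up belongs in class-level setup, can be sketched without JUnit or Testcontainers. FakeContainer and LifecycleSketch are hypothetical stand-ins: beforeAll() plays the role of a @BeforeAll static method, beforeEach() plays the role of the @BeforeEach setup, and the early return in start() is an assumption that mirrors the behaviour described above.

```java
// Hypothetical stand-in for a container whose start() returns early if already running.
final class FakeContainer {

  private int launches;
  private boolean running;

  synchronized void start() {
    if (running) {
      return; // already started, nothing to do
    }
    launches++;
    running = true;
  }

  synchronized int launchCount() {
    return launches;
  }
}

// Simulates the test lifecycle by hand: beforeAll() once, beforeEach() per test.
final class LifecycleSketch {

  static final FakeContainer CONTAINER = new FakeContainer();

  private LifecycleSketch() {}

  // One-time start-up lives here, so no containerStarted flag is needed.
  static void beforeAll() {
    CONTAINER.start();
  }

  // Per-test setup; a repeated start() call is harmless because it is idempotent.
  static void beforeEach() {
    CONTAINER.start();
  }

  // Runs the simulated lifecycle for three tests and reports how often the container launched.
  static int runThreeTests() {
    beforeAll();
    for (int i = 0; i < 3; i++) {
      beforeEach();
    }
    return CONTAINER.launchCount();
  }
}
```

In this model the container launches once no matter how many tests run, which is the outcome the static flag was introduced to achieve.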

- config_path: "secrets/sat-config.json"
# discovery:
# tests:
# - config_path: "secrets/sat-config.json"
Contributor

Is this deliberate?

@subodh1810
Contributor Author

@postamar The aim of this work is to stop emitting LEGACY state. To do that, I need to release these uncertified connectors so that we can be sure nothing emits LEGACY state with the latest code. I would like to go ahead with the PR as it is. Remember, I am not changing anything in the certified connectors; they stay the same. This work targets the uncertified connectors.

  1. Given that, I suggest you limit this PR to CDK changes only and open separate PRs for each connector. We can then deal with each one independently and on our own schedule, in a way that doesn't involve disabling tests unless we have a good reason to (and then state that reason in the annotation). Those PRs will also be easier to review. I'll be happy to help you do this.

@subodh1810 subodh1810 requested a review from postamar December 13, 2023 15:03
Contributor

@postamar postamar left a comment

I understand your aim. I strongly believe that it's a bad idea to merge a PR like this one because the risk of breaking the build for a certified connector is too high. For a start, right now, source-mysql doesn't even compile. At least make separate PRs for certified connectors, please. It's really not that much extra work and being sure that you're not creating more work for others is worth it.

I don't care much about the uncertified connectors but they should at least compile. This can be ascertained by the "Repository Health Check". I'm OK with you keeping those in this PR if you insist, since those changes are going to be merged bypassing the branch protections in any case.

@subodh1810
Contributor Author

Closing in favor of the following PRs:
CDK changes : #33434
Postgres : #33437
MySQL : #33436
Mongo : #33438
MSSQL : #33481
Bigquery + Redshift + Snowflake : https://github.com/airbytehq/airbyte/pull/33484/files
Remaining ones : #33485

@subodh1810 subodh1810 closed this Dec 14, 2023