[airbyte-cdk] Promote low-code types and cursor interface into Python CDK #38077

Merged
brianjlai merged 8 commits into master from brian/move_cursors_and_stream_slice_to_python
May 13, 2024

brianjlai

What

As I was trying to implement Resumable Full Refresh (RFR) for low-code, a consistent theme was that I was writing boilerplate and making some questionable decisions about how to format and read state from checkpoint_reader.py. The reality is that a number of these concepts, like a StreamSlice that has separate cursor and partition values and the Cursor interface, belong in the topmost CDK, and using them here makes the RFR flow much easier to understand.

I decided to just yolo this thing out cuz why not. Looks like a lot of changes, but assuming we can get CATs and unit tests to pass, I have pretty high confidence because if I refactored the references wrong it would fail very loudly.

I also don't plan to republish connectors.

How

Moved the types.py from /sources/declarative to /sources

  • One thing to note here is that I kept the existing types.py. I migrated all our repos over, but we might have customers who've implemented custom low-code connectors with custom components, and this avoids breaking them.
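
To illustrate the kind of type being promoted, here is a minimal sketch of a slice that keeps partition and cursor values separate while still behaving like a plain mapping. It is a simplified stand-in written for this note, not the CDK's actual StreamSlice; the property names `partition` and `cursor_slice` and the overlap check are assumptions.

```python
from typing import Any, Dict, Iterator, Mapping


class StreamSlice(Mapping[str, Any]):
    """Hypothetical stand-in: a slice with separate partition and cursor values."""

    def __init__(self, *, partition: Mapping[str, Any], cursor_slice: Mapping[str, Any]) -> None:
        # Assumption: a key may not appear on both sides, so lookups stay unambiguous.
        overlap = set(partition) & set(cursor_slice)
        if overlap:
            raise ValueError(f"Keys present in both partition and cursor_slice: {sorted(overlap)}")
        self._partition: Dict[str, Any] = dict(partition)
        self._cursor_slice: Dict[str, Any] = dict(cursor_slice)

    @property
    def partition(self) -> Mapping[str, Any]:
        return self._partition

    @property
    def cursor_slice(self) -> Mapping[str, Any]:
        return self._cursor_slice

    # Mapping protocol: callers that treat the slice as a flat dict keep working.
    def __getitem__(self, key: str) -> Any:
        if key in self._cursor_slice:
            return self._cursor_slice[key]
        return self._partition[key]

    def __iter__(self) -> Iterator[str]:
        yield from self._partition
        yield from self._cursor_slice

    def __len__(self) -> int:
        return len(self._partition) + len(self._cursor_slice)
```

Code that only needs the flat view can call `dict(stream_slice)`, while checkpointing code can read `stream_slice.partition` and `stream_slice.cursor_slice` separately instead of guessing which keys belong to which.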

Moved the Cursor to /sources/streams/checkpoint

  • The promoted Cursor interface no longer inherits from StreamSlicer which is a low-code concept
  • We have a new DeclarativeCursor which implements Cursor + StreamSlicer which actually makes more sense because a generic cursor as a concept doesn't actually need to be responsible for injecting values
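
The split described above can be sketched roughly as follows. The method names here (`observe`, `get_stream_state`, `stream_slices`, `get_request_params`) are simplified placeholders and `UpdatedAtCursor` is a hypothetical example class; the real interfaces in the CDK have more methods and different signatures.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable, Mapping


class Cursor(ABC):
    """Generic checkpointing concept: tracks state, knows nothing about requests."""

    @abstractmethod
    def observe(self, record: Mapping[str, Any]) -> None:
        """Update internal state from a record that was read."""

    @abstractmethod
    def get_stream_state(self) -> Mapping[str, Any]:
        """Return the state to checkpoint."""


class StreamSlicer(ABC):
    """Low-code concept: produces slices and injects values into requests."""

    @abstractmethod
    def stream_slices(self) -> Iterable[Mapping[str, Any]]:
        """Yield the slices to read."""

    @abstractmethod
    def get_request_params(self, stream_slice: Mapping[str, Any]) -> Mapping[str, Any]:
        """Return query parameters for a given slice."""


class DeclarativeCursor(Cursor, StreamSlicer):
    """Low-code cursors implement both interfaces; Cursor itself stays generic."""


class UpdatedAtCursor(DeclarativeCursor):
    """Hypothetical cursor over an ISO-8601 `updated_at` field (illustration only)."""

    def __init__(self, start: str) -> None:
        self._latest = start

    def observe(self, record: Mapping[str, Any]) -> None:
        value = record.get("updated_at")
        # ISO-8601 date strings of equal format compare correctly as strings.
        if value is not None and value > self._latest:
            self._latest = value

    def get_stream_state(self) -> Mapping[str, Any]:
        return {"updated_at": self._latest}

    def stream_slices(self) -> Iterable[Mapping[str, Any]]:
        yield {"start": self._latest}

    def get_request_params(self, stream_slice: Mapping[str, Any]) -> Mapping[str, Any]:
        return {"updated_since": stream_slice["start"]}
```

The point of the design is that a pure Python CDK stream can depend on `Cursor` alone, while only declarative components need the request-injection side that comes from `StreamSlicer`.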

Used IntelliJ/Copilot magic to refactor everything to reference the appropriate types.

Review guide

  1. /sources/types.py
  2. /sources/streams/checkpoint/cursor.py

User Impact

Should be non-breaking as mentioned above.

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

@brianjlai brianjlai requested a review from a team as a code owner May 9, 2024 00:03
@brianjlai brianjlai added area/connectors Connector related issues and removed connectors/source/alpha-vantage labels May 9, 2024
@brianjlai brianjlai requested a review from girarda May 10, 2024 00:31
@girarda girarda left a comment

<3

@brianjlai brianjlai merged commit 8fdd981 into master May 13, 2024
32 checks passed
@brianjlai brianjlai deleted the brian/move_cursors_and_stream_slice_to_python branch May 13, 2024 19:51
@assaadhjb
Hello @brianjlai @girarda, if I understood correctly, DeclarativeCursor would allow us to leverage stream slicers, for example for an incremental sync where no cursor field can be selected?
If that is the case, is there any documentation on how to use DeclarativeCursor? I am interested in its usage! Thanks 🙏

@brianjlai
Hi @assaadhjb, assuming that your connector is written in the low-code framework, you don't actually need to define the DeclarativeCursor within the manifest file itself. Instead, if a stream defines certain attributes, such as a paginator, and does not define an incremental field, then Resumable Full Refresh will be applied automatically during a sync. If you want to read more about this, you can refer to our blog post on its usage: https://airbyte.com/blog/resumable-full-refresh-building-resilient-systems-for-syncing-data#leveraging-the-power-of-the-low-code-framework

If your custom connector is written using the Python CDK, then we do have some docs here that talk about synthetic cursors.
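
For readers unfamiliar with the idea, the resumable behavior mentioned above can be sketched in a few lines. This is only an illustration of checkpointing a pagination position so that a failed full refresh can pick up where it stopped; `read_resumable`, `fetch_page`, and the `next_page` state key are hypothetical names, not the CDK's API or its state format.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

# A page is (records, next_page_number), with None meaning "no more pages".
Page = Tuple[List[Dict[str, Any]], Optional[int]]


def read_resumable(
    fetch_page: Callable[[int], Page],
    state: Dict[str, Any],
) -> Iterator[Dict[str, Any]]:
    """Yield records page by page, checkpointing the next page into `state`."""
    # Resume from the checkpointed page if a previous attempt saved one.
    page: Optional[int] = state.get("next_page", 1)
    while page is not None:
        records, next_page = fetch_page(page)
        yield from records
        # Checkpoint only after the whole page has been emitted.
        state["next_page"] = next_page
        page = next_page
```

If `fetch_page` raises partway through, `state` still points at the page that failed, so calling `read_resumable` again with the same `state` continues from there rather than from page 1.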

Labels
CDK Connector Development Kit