Improve write operations to be closer to target-postgres #13

Merged: amotl merged 1 commit into main from write-strategy on Dec 21, 2023

@amotl amotl commented Dec 19, 2023

Introduction

Meltano's target-postgres first receives data into a temporary table, and
then updates the actual target table from it.

CrateDB's target-cratedb additionally offers the option to write directly into
the target table, yielding speed improvements, which may be important in certain
situations.

Details

The environment variable MELTANO_CRATEDB_STRATEGY_DIRECT controls that behavior.

  • MELTANO_CRATEDB_STRATEGY_DIRECT=true: Directly write to the target table.
  • MELTANO_CRATEDB_STRATEGY_DIRECT=false: Use a temporary table to stage updates.

Note: The current default value is true, which bypasses Meltano's native way
of handling database updates through a temporary table. The reason is that the
native way does not yet satisfy all test cases.
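
As an illustration of the switch described above, here is a minimal sketch of how such a flag could be read. The helper name `use_direct_strategy` and the set of accepted truthy spellings are assumptions made for this example; the actual parsing in target-cratedb may differ.

```python
import os

# Assumption: which spellings count as "true" is not stated in this PR.
TRUTHY = {"1", "true", "yes", "on"}


def use_direct_strategy(environ=None) -> bool:
    """Return True when records should be written directly to the target table.

    Hypothetical helper, not part of target-cratedb. The default of "true"
    mirrors the default described in the note above.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("MELTANO_CRATEDB_STRATEGY_DIRECT", "true")
    return raw.strip().lower() in TRUTHY
```

With the variable unset, the sketch selects the direct strategy; setting `MELTANO_CRATEDB_STRATEGY_DIRECT=false` selects the temporary-table strategy.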

References

The last step of the upsert procedure uses an UPDATE ... FROM
statement, which needs to be emulated in Python code, because CrateDB
does not support that statement yet.
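
To make the idea of emulating UPDATE ... FROM concrete, the following sketch reads the rows from a staging table and issues one plain UPDATE per row against the target table. It is not the code from this pull request: it uses Python's built-in `sqlite3` module as a stand-in database, and the function name `emulate_update_from` as well as the table and column names are hypothetical.

```python
import sqlite3


def emulate_update_from(conn, target, staging, key, columns):
    """Copy `columns` from `staging` into matching rows of `target`.

    Emulates `UPDATE target SET ... FROM staging WHERE target.key = staging.key`
    with one parameterized UPDATE per staging row. Identifiers are interpolated
    into the SQL text, so they must come from trusted code, not user input.
    Returns the number of staging rows processed.
    """
    select_sql = f"SELECT {key}, {', '.join(columns)} FROM {staging}"
    assignments = ", ".join(f"{col} = ?" for col in columns)
    update_sql = f"UPDATE {target} SET {assignments} WHERE {key} = ?"

    rows = conn.execute(select_sql).fetchall()
    for row in rows:
        key_value, values = row[0], row[1:]
        conn.execute(update_sql, (*values, key_value))
    return len(rows)


# Usage example with hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "old"), (2, "keep")])
conn.executemany("INSERT INTO staging VALUES (?, ?)", [(1, "new")])

emulate_update_from(conn, "target", "staging", "id", ["name"])
result = conn.execute("SELECT id, name FROM target ORDER BY id").fetchall()
```

A row-by-row loop like this trades a single set-based statement for many small ones, so it is presumably slower on large batches; that cost is one plausible reason why writing directly to the target table can be attractive.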

@amotl amotl force-pushed the write-strategy branch 6 times, most recently from 2d56d58 to 2765f24 Compare December 19, 2023 03:00
codecov bot commented Dec 19, 2023

Codecov Report

Attention: 57 lines in your changes are missing coverage. Please review.

❗ No coverage uploaded for pull request base (main@14cf7c2). Click here to learn what that means.

❗ Current head 4c8c2c4 differs from pull request most recent head 9226d65. Consider uploading reports for the commit 9226d65 to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| target_cratedb/sqlalchemy/vector.py | 51.85% | 26 Missing ⚠️ |
| target_cratedb/tests/test_standard_target.py | 84.34% | 18 Missing ⚠️ |
| target_cratedb/connector.py | 77.77% | 8 Missing ⚠️ |
| target_cratedb/sinks.py | 92.98% | 4 Missing ⚠️ |
| target_cratedb/sqlalchemy/patch.py | 94.73% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #13   +/-   ##
=======================================
  Coverage        ?   83.47%           
=======================================
  Files           ?        7           
  Lines           ?      714           
  Branches        ?        0           
=======================================
  Hits            ?      596           
  Misses          ?      118           
  Partials        ?        0           
Flag Coverage Δ
main 83.47% <79.78%> (?)


@amotl amotl requested review from seut, matriv and surister December 19, 2023 03:03
@amotl amotl marked this pull request as ready for review December 19, 2023 03:03
@amotl amotl force-pushed the harmonize-sqlalchemy branch from 74296f7 to ab563dc Compare December 21, 2023 19:37
Base automatically changed from harmonize-sqlalchemy to main December 21, 2023 19:46
@amotl amotl merged commit 9db4bed into main Dec 21, 2023
2 checks passed
@amotl amotl deleted the write-strategy branch December 21, 2023 19:55