fix: allow overwriting delta lake entries with same timestamp #363

Merged
mikix merged 1 commit into main from mikix/timestamp-equal
Nov 19, 2024

Conversation

@mikix (Contributor) commented Nov 18, 2024

Without allowing this, we can't meaningfully update rows as ETL evolves to allow more content (_data and _url fields recently, maybe future allowed extensions, that sort of thing).

This does allow more data churn, but correctness takes priority. We could maybe get both if we inserted the ETL version into every row? But for now, this is an easy tweak.
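
To illustrate the change in comparison semantics, here is a minimal sketch in plain Python. It is not the code from cumulus_etl/formats/deltalake.py: the function name `should_overwrite` and the `allow_equal` flag are hypothetical, and it assumes timestamps are ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime


def should_overwrite(existing_ts: str, incoming_ts: str, *, allow_equal: bool = True) -> bool:
    """Decide whether an incoming row may replace a stored row with the same id.

    Hypothetical helper for illustration only. Both arguments are assumed to be
    ISO 8601 timestamps that carry a UTC offset, e.g. "2024-11-18T21:00:00+00:00".
    """
    existing = datetime.fromisoformat(existing_ts)
    incoming = datetime.fromisoformat(incoming_ts)
    if allow_equal:
        # Behavior described in this PR: an equal timestamp still overwrites,
        # so a re-run of a newer ETL can add fields to an unchanged resource.
        return incoming >= existing
    # Stricter behavior: only a strictly newer timestamp overwrites.
    return incoming > existing
```

With `allow_equal=True`, re-processing the same source data always rewrites the matching rows, which is the extra churn mentioned above. With `allow_equal=False`, those rows would be skipped and any newly supported fields would never reach the table.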

Checklist

  • Consider if documentation (like in docs/) needs to be updated
  • Consider if tests should be added

@mikix mikix force-pushed the mikix/timestamp-equal branch from 1f45159 to 4edad9c on November 18, 2024 21:00

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|-------|---------|----------|-----------|--------|
| 3607  | 3546    | 98%      | 98%       | 🟢     |

New Files

No new covered files...

Modified Files

| File                             | Coverage | Status |
|----------------------------------|----------|--------|
| cumulus_etl/formats/deltalake.py | 100%     | 🟢     |
| TOTAL                            | 100%     | 🟢     |

updated for commit: 4edad9c by action🐍

@mikix mikix marked this pull request as ready for review November 19, 2024 14:26
Comment on lines +212 to +213
# If we eventually decide that sub-second updates are a real concern, we can additionally
# compare versionId. But I don't know how you extracted both versions so quickly. :)
Contributor

yeah i don't think we're going to have this problem any time soon

Contributor Author

But correctness Maaaatt!
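
For reference, the versionId fallback mentioned in the code comment could look roughly like the sketch below. Nothing here is from the PR: `may_replace` and the dict layout are hypothetical, and it assumes versionId values are integer-like strings, which is not guaranteed for every server.

```python
from datetime import datetime


def may_replace(existing_meta: dict, incoming_meta: dict) -> bool:
    """Compare lastUpdated first and fall back to versionId on a tie.

    Hypothetical helper. Assumes both dicts have a "lastUpdated" ISO 8601
    timestamp with a UTC offset and a "versionId" that parses as an integer.
    """

    def sort_key(meta: dict) -> tuple:
        return (datetime.fromisoformat(meta["lastUpdated"]), int(meta["versionId"]))

    # Tuples compare element by element, so versionId only matters when the
    # timestamps are identical. Fully equal keys still allow the overwrite.
    return sort_key(incoming_meta) >= sort_key(existing_meta)
```

This keeps the equal-timestamp overwrite for identical versions while refusing to replace a row with an older version that happens to share the same timestamp.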

@mikix mikix merged commit 468dcfb into main Nov 19, 2024
3 checks passed
@mikix mikix deleted the mikix/timestamp-equal branch November 19, 2024 14:36
2 participants