Replies: 1 comment
-
Hi @bnimam, thank you for this suggestion. It could certainly be an interesting additional pattern to include in our Redshift API methods. I have transferred this to an issue. Looking forward to your PR and to collaborating on this.
-
Hello awslabs team.
I figured I'd start a discussion on this before writing an implementation.
Our team needs a more complex upsert method for Redshift, borrowing an idea from Apache Hudi called a precombine field.
Essentially, we receive files with IDs used as primary keys, and we want to update the matching records based on the files we receive. Building in file processing order guarantees is harder for us to implement and maintain than upserting the data more intelligently. The precombine field would let us specify a column whose highest value decides which row is kept when the same key exists in both the target table and the stage table (the file being processed).
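To make the intended semantics concrete, here is a minimal pure-Python sketch. The function name `precombine_upsert`, the sample rows, and the tie-breaking rule (the existing row wins when values are equal) are all assumptions for illustration, not an existing awswrangler API.

```python
def precombine_upsert(target, stage, key, precombine):
    """Merge stage rows into target, keeping the row with the
    highest precombine value for each key (hypothetical helper)."""
    merged = {row[key]: row for row in target}
    for row in stage:
        current = merged.get(row[key])
        # Assumption: on a tie, the row already in the target is kept.
        if current is None or row[precombine] > current[precombine]:
            merged[row[key]] = row
    return list(merged.values())


# Made-up sample data; ISO date strings compare correctly as text.
target = [
    {"id": 1, "name": "alice", "date": "2021-01-01"},
    {"id": 2, "name": "bob", "date": "2021-01-05"},
]
stage = [
    {"id": 1, "name": "alice-new", "date": "2021-01-03"},  # newer: replaces
    {"id": 2, "name": "bob-old", "date": "2021-01-02"},    # older: ignored
    {"id": 3, "name": "carol", "date": "2021-01-04"},      # new key: inserted
]

result = precombine_upsert(target, stage, key="id", precombine="date")
```

With this rule, the order in which files arrive stops mattering: an older file processed late cannot overwrite newer data.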
Imagine we have a table users like
And we want to use `Date` as the precombine field, which is extracted from our processed file. So if we had a candidate file like
Our resulting table would be
This could be achieved by first deleting from the stage table with a query like
Result
and
Result
Finally
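The three steps (prune the stage table, delete the superseded target rows, then insert) can be sketched roughly as below. This is only an illustration under stated assumptions: the table names `users` and `users_stage`, the columns, and the sample rows are hypothetical, SQLite stands in for Redshift so the snippet runs locally, and the exact Redshift SQL (for example `DELETE ... USING`) would likely differ.

```python
import sqlite3

# SQLite is used here only as a local stand-in for Redshift.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical target and stage tables with made-up rows.
cur.execute("CREATE TABLE users (id INTEGER, name TEXT, date TEXT)")
cur.execute("CREATE TABLE users_stage (id INTEGER, name TEXT, date TEXT)")
cur.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "alice", "2021-01-01"), (2, "bob", "2021-01-05")],
)
cur.executemany(
    "INSERT INTO users_stage VALUES (?, ?, ?)",
    [
        (1, "alice-new", "2021-01-03"),
        (2, "bob-old", "2021-01-02"),
        (3, "carol", "2021-01-04"),
    ],
)

# Step 1: drop stage rows whose precombine value is not newer than the
# matching target row (assumption: ties keep the existing target row).
cur.execute(
    """
    DELETE FROM users_stage
    WHERE EXISTS (
        SELECT 1 FROM users
        WHERE users.id = users_stage.id
          AND users.date >= users_stage.date
    )
    """
)

# Step 2: delete target rows that the remaining stage rows will replace.
cur.execute("DELETE FROM users WHERE id IN (SELECT id FROM users_stage)")

# Step 3: insert the surviving stage rows into the target.
cur.execute("INSERT INTO users SELECT id, name, date FROM users_stage")
conn.commit()

rows = cur.execute("SELECT id, name, date FROM users ORDER BY id").fetchall()
```

After the first step only the rows for ids 1 and 3 remain in the stage table, so the older record for id 2 never reaches the target.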
If this is a feature that would be an accepted addition, I will go ahead with implementing it, otherwise I will not waste my time.