Replies: 2 comments 4 replies
I can only think of one other option than the ones you've already listed: you could store a |
Dataform does a Or perhaps I am missing something else? |
Assuming a large source table with the following data:
And an incremental model:
Then the next day, the table looks like this:
If we don't do anything in particular, the incremental model will miss the row that was inserted "before" the latest row in the target table.
If we use a "lookback" value in the `FROM` clause (say `max(received_at) - interval 1 day`), we'll end up with duplicated data.

We could use the `unique_key` approach, but this isn't always applicable and is potentially prohibitively expensive on large tables / partitions.

What is the recommended approach to deal with this kind of late-arriving data?
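To make the three outcomes concrete, here is a rough Python simulation of the scenario. It is only a sketch: the rows, the `id` column, and the `incremental_load` helper are all made up for illustration (they are not the tables from this post, and this is not how any particular tool implements incremental models). Only `received_at` and the idea of a lookback window come from the question.

```python
from datetime import datetime, timedelta


def incremental_load(source, target, lookback=timedelta(0), key=None):
    """Hypothetical helper: load source rows newer than the target's watermark.

    The watermark is max(received_at) in the target, minus an optional lookback.
    With key=None rows are appended; with a key they are merged (upserted).
    """
    if target:
        watermark = max(r["received_at"] for r in target) - lookback
    else:
        watermark = datetime.min
    new_rows = [r for r in source if r["received_at"] > watermark]
    if key is None:
        return target + new_rows  # plain append, no deduplication
    merged = {r[key]: r for r in target}
    merged.update({r[key]: r for r in new_rows})  # later rows win per key
    return list(merged.values())


# Day 1: two rows exist and are loaded into the target (illustrative data).
day1 = [
    {"id": 1, "received_at": datetime(2024, 1, 1, 10, 0)},
    {"id": 2, "received_at": datetime(2024, 1, 1, 12, 0)},
]
target = incremental_load(day1, [])

# Day 2: row 3 arrives late, with a timestamp before the current watermark.
day2 = day1 + [
    {"id": 3, "received_at": datetime(2024, 1, 1, 11, 0)},
    {"id": 4, "received_at": datetime(2024, 1, 2, 9, 0)},
]

plain = incremental_load(day2, target)
with_lookback = incremental_load(day2, target, lookback=timedelta(days=1))
with_key = incremental_load(day2, target, lookback=timedelta(days=1), key="id")

plain_ids = [r["id"] for r in plain]             # late row 3 is missed
lookback_ids = [r["id"] for r in with_lookback]  # rows 1 and 2 are duplicated
keyed_ids = sorted(r["id"] for r in with_key)    # complete and deduplicated
```

In this sketch the keyed merge is the only variant that ends up both complete and free of duplicates, but it has to touch every row inside the lookback window, which is the cost concern raised above.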