Would it make sense to batch upserts? #1298
I'll be pulling in an array of records, some of which will be inserts and others updates. Often hundreds of records, sometimes thousands. Quite a few of the updates could be no-ops if I compared them to the existing records and checked whether a certain value is unchanged. However, I'm not sure that's worth the complication. It's at least simpler to run the update every time the primary key matches, whether the update is "meaningful" or not.
The simplest way I see to handle the data pull would be to set up a batch and run upserts over the entire array. However, I'm not sure upserts and batching really work together the way I'd like, and I don't know what batching really does under the hood. Would you expect batching to usefully group/optimize upserts?
I know there's not a lot of detail here, but does this seem reasonable, or is there likely a better approach? (I see there's an insert-or-replace concept that I'm not familiar with that might be a fit here, for example.)
Replies: 1 comment 1 reply
I do think that upserts make sense in a batch. In moor, a batch is essentially doing two things:
Point 1 is typically the major factor in increasing throughput. Creating a transaction for each write can be a bit slow; grouping writes in a single transaction improves performance a lot. Point 2 improves performance a bit further, but not as much as the implicit transaction used in a batch.
So yes, in general it makes sense to use a batch whenever you have a group of writes that you want to run.
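
To make the transaction point concrete, here is a minimal sketch of the underlying SQLite behavior, written with Python's built-in `sqlite3` module rather than moor. It is not moor's implementation: the `items` table, its columns, and the `upsert_batch` helper are hypothetical names chosen for illustration, and the `ON CONFLICT ... DO UPDATE` syntax assumes SQLite 3.24 or newer. It runs every upsert inside one transaction and uses a `WHERE` clause on the update so that rows whose value has not changed are skipped, which covers the no-op case from the question without a separate comparison pass.

```python
import sqlite3


def upsert_batch(conn, records):
    """Upsert (id, value) pairs inside a single transaction.

    Hypothetical helper for illustration only; this is not moor's API.
    """
    sql = (
        "INSERT INTO items (id, value) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET value = excluded.value "
        # Skip the write when the stored value already matches (a no-op update).
        "WHERE items.value IS NOT excluded.value"
    )
    # The connection context manager commits once on success and rolls
    # back on error, so all rows share one transaction instead of
    # paying for a commit per row.
    with conn:
        conn.executemany(sql, records)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")

# First pull: both rows are plain inserts.
upsert_batch(conn, [(1, "a"), (2, "b")])

# Second pull: row 1 is unchanged, row 2 is updated, row 3 is new.
upsert_batch(conn, [(1, "a"), (2, "changed"), (3, "c")])

rows = conn.execute("SELECT id, value FROM items ORDER BY id").fetchall()
```

After the second call, `rows` holds the three records with row 2 carrying the new value. In moor the same idea would be expressed through its batch API, with the upsert mode chosen on each insert.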