TransactionError::VersionAlreadyExists needs exponential backoff with commit peeking #3066
Labels: binding/rust, bug, storage/aws
Environment
Delta-rs version: irrelephant 🐘
Binding: both
Bug
What happened:
For systems with high write throughput, and therefore high version contention, conflicting writes can result in `VersionAlreadyExists` being raised. The retry logic in the `PreparedCommit` operation immediately and blindly retries the operation with `state_version + 1` rather than peeking at the next commit version from the table.
This can result in logs such as:
Basically there are two problems:
1. The retry is a `while true {}` loop without any pause/sleep in between retries, so it's very easy for the retry to hit the max retry limit (hardcoded to 15 for AWS).
2. Each retry attempts `loaded + 1`, which is almost 100% guaranteed to be wrong if there is more than one concurrent writing process. This causes additional unnecessary retries, since the aforementioned `while true` loop will move the attempted version number up by 1 from the loaded state rather than peeking at the latest commit in the table and attempting a commit with `peeked_version + 1`.
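To make the two problems concrete, here is a minimal sketch of a retry loop that sleeps with exponential backoff and re-peeks the table before each retry. It is not the delta-rs implementation: the `LogStore` trait, `CommitError` enum, `backoff_delay`, and `commit_with_retry` are all hypothetical names invented for illustration, and the base delay and cap are arbitrary assumptions.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type; only the variant named in this issue is modeled.
#[derive(Debug, PartialEq)]
pub enum CommitError {
    VersionAlreadyExists(i64),
}

/// Hypothetical abstraction over the table's commit log (not the delta-rs API).
pub trait LogStore {
    /// Latest committed version currently visible in the log.
    fn peek_latest_version(&self) -> i64;
    /// Try to write commit `version`; fails if that version already exists.
    fn try_commit(&mut self, version: i64) -> Result<(), CommitError>;
}

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `cap`.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(cap)
}

/// Commit with at most `max_retries` retries after the first attempt.
/// Returns the version that was actually committed.
pub fn commit_with_retry<S: LogStore>(
    store: &mut S,
    loaded_version: i64,
    max_retries: u32,
    base: Duration,
    cap: Duration,
) -> Result<i64, CommitError> {
    // The first attempt uses the loaded state, as the current code does.
    let mut version = loaded_version + 1;
    let mut attempt = 0;
    loop {
        match store.try_commit(version) {
            Ok(()) => return Ok(version),
            Err(CommitError::VersionAlreadyExists(_)) if attempt < max_retries => {
                // Problem 1: pause between retries instead of spinning.
                sleep(backoff_delay(attempt, base, cap));
                // Problem 2: retry at peeked_version + 1, not previous + 1.
                version = store.peek_latest_version() + 1;
                attempt += 1;
            }
            // Retries exhausted: surface the last error.
            Err(e) => return Err(e),
        }
    }
}
```

With a loaded version far behind the table, this sketch needs one retry to catch up instead of one retry per missed version. Adding random jitter to the delay is a common refinement when many writers contend, but it is left out here to keep the example dependency-free.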
What you expected to happen:
I would expect the retry to take `peeked_version + 1` at a bare minimum!
How to reproduce it:
Use more than one concurrent writer 😆
More details: