Skip to content

Conversation

@aiborodin
Copy link
Contributor

This PR is an alternative solution to #14425, which uses the same idea of checking duplicate concurrent commits using snapshot validation, similar to #14445.

The current solution extends BaseRowDelta and BaseReplacePartitions operations in Flink packages instead of adding public API methods to RowDelta and ReplacePartitions interfaces in #14445.

Fix the commit duplication issue in the DynamicIcebergSink caused by a
race condition during recovery from a failed commit to the REST catalog.

Validate that no new parent snapshot fetched from a refresh operation in
the SnapshotProducer has a committed checkpoint ID higher than or equal
to the currently staged committable. Throw an exception if a duplicate
snapshot is detected to let the Flink job retry and skip the new
request.

Add unit and integration tests to replicate the commit duplication.

Change-Id: I92eca2987e4fbe9819ea9161dfe9e8b8bddb9105
@aiborodin aiborodin closed this Nov 5, 2025
@aiborodin aiborodin reopened this Nov 5, 2025
@aiborodin aiborodin changed the title Fix commit duplication in DynamicIcebergSink (alternative) Outdated: Fix commit duplication in DynamicIcebergSink (alternative) Nov 6, 2025
@aiborodin
Copy link
Contributor Author

The new PR is raised in #14517.

@aiborodin aiborodin closed this Nov 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant