
[SPARK-47618][CORE] Use Magic Committer for all S3 buckets by default #51010

Status: Open — wants to merge 1 commit into master
Conversation

@dongjoon-hyun (Member) commented May 24, 2025

What changes were proposed in this pull request?

This PR aims to use Apache Hadoop Magic Committer for all S3 buckets by default in Apache Spark 4.1.0.
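As a rough illustration of what "use the Magic Committer by default" involves, these are the settings that the Hadoop S3A and Spark cloud-integration documentation describe for enabling the committer manually today. This is a sketch of the existing opt-in configuration, not the diff of this PR; the exact defaults the PR changes may differ:

```properties
# Opt-in Magic Committer configuration per the Spark/Hadoop cloud docs
# (illustrative; not copied from this PR's diff).
spark.hadoop.fs.s3a.committer.magic.enabled  true
spark.hadoop.fs.s3a.committer.name           magic
spark.sql.sources.commitProtocolClass        org.apache.spark.internal.io.cloud.PathOutputCommitProtocol
spark.sql.parquet.output.committer.class     org.apache.spark.internal.io.cloud.BinnedPathOutputCommitter
```

With this PR, users writing to `s3a://` paths would get this behavior without setting it themselves.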

Why are the changes needed?

The Apache Hadoop Magic Committer has offered the best write performance for S3 buckets since Amazon S3 became fully consistent on December 1st, 2020. As the Amazon S3 documentation states:

Amazon S3 provides strong read-after-write consistency for PUT and DELETE requests of objects in your Amazon S3 bucket in all AWS Regions. This behavior applies both to writes of new objects and to PUT requests that overwrite existing objects, as well as to DELETE requests. In addition, read operations on Amazon S3 Select, Amazon S3 access control lists (ACLs), Amazon S3 Object Tags, and object metadata (for example, the HEAD object) are strongly consistent.

Does this PR introduce any user-facing change?

Yes, the migration guide is updated.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun (Member, Author) commented:

cc @viirya, @yaooqinn, @peter-toth

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-47618] Use Magic Committer for all S3 buckets by default [SPARK-47618][CORE] Use Magic Committer for all S3 buckets by default May 24, 2025