-
Notifications
You must be signed in to change notification settings - Fork 334
feat(writer): Add clustered and fanout writer #1735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
crates/iceberg/src/writer/mod.rs
Outdated
/// Build the iceberg writer. | ||
async fn build(self) -> Result<Self::R>; | ||
/// Build the iceberg writer for an optional partition key. | ||
async fn build_with_partition(self, partition_key: Option<PartitionKey>) -> Result<Self::R>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a breaking change. I believe this is necessary because:
-
IcebergWriter
is supposed to generateDataFile
that always hold a partition value according to iceberg spec. -
The existing code store partition value in the builder directly, making
builder.clone()
useless:
let builder = IcebergWriterBuilder::new(partition_A);
let writer_A = builder.build();
... // write to partition A
// done with partition A and now we need to write to partition B
// this is wrong because partition value A is still stored in the builder
let writer_B = builder.clone().build()
An alternative is to add a new method clone_with_partition()
but that would also be a breaking change and it's less clean compared to build_with_partition()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this change, but I want a further change as following:
async fn build(&self, partition_key: Option<PartitionKey>) -> Result<Self::R>
If the builder could be reused for creating actual IcebergWriter
, I want to avoid cloning.
crates/iceberg/src/writer/partitioning/clustered_data_writer.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @CTTY for this pr! Just finished first round of review, and I think we are on the right track!
/// # Returns | ||
/// | ||
/// `Ok(())` on success, or an error if the write operation fails. | ||
async fn write(&mut self, partition_key: Option<PartitionKey>, input: I) -> Result<()>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async fn write(&mut self, partition_key: Option<PartitionKey>, input: I) -> Result<()>; | |
async fn write(&mut self, partition_key: PartitionKey, input: I) -> Result<()>; |
For partitioning writer, it should always be partitioned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was planning to have partitioning writer to take care of unpartitioned data as well, but I think you are right that we can have an explicit unpartitioned writer (basically a wrapper of iceberg writer) and have TaskWriter to decide which one to use.
will fix this
let partition_value = partition_key.data(); | ||
|
||
// Check if this partition has been closed already | ||
if self.closed_partitions.contains(partition_value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's odd to add the check here. It's the caller's responsibility to ensure that inputs are sorted, but if it's not, we should not throw error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring to Java's behavior when writing this.
When looking at the original java PR I don't see any explicit explanation, but I think this can force users to be aware of if their data source is sorted and help identify hidden performance issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with following java's logic.
crates/iceberg/src/writer/mod.rs
Outdated
/// Build the iceberg writer. | ||
async fn build(self) -> Result<Self::R>; | ||
/// Build the iceberg writer for an optional partition key. | ||
async fn build_with_partition(self, partition_key: Option<PartitionKey>) -> Result<Self::R>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this change, but I want a further change as following:
async fn build(&self, partition_key: Option<PartitionKey>) -> Result<Self::R>
If the builder could be reused for creating actual IcebergWriter
, I want to avoid cloning.
/// * `B` - The inner writer builder type | ||
/// * `I` - Input type (defaults to `RecordBatch`) | ||
/// * `O` - Output collection type (defaults to `Vec<DataFile>`) | ||
#[derive(Clone)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need this? Cloning a FanoutWriter
is ambiguous, since it contains states like opened writers and data files.
let partition_value = partition_key.data(); | ||
|
||
// Check if this partition has been closed already | ||
if self.closed_partitions.contains(partition_value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with following java's logic.
Which issue does this PR close?
What changes are included in this PR?
New:
partitioning
module withPartitioningWriter
traitClusteredWriter
: Optimized for pre-sorted data, requires writing in partition orderFanoutWriter
: Flexible writer that can handle data from any partition at any timeModification:
DataFileWriterBuilder
to support dynamic partition assignmentAre these changes tested?
Added unit tests