feat: support write (#10)
* add arrow_struct_to_iceberg_struct

* refine writer interface

* support fanout partition writer

* support sort_position_delete_writer

* support equality delta writer

* support precompute partition writer

* update value convert

* fix some bugs in writer

* implement Display for NamespaceIdent

* expose _serde::DataFile

* fix FieldSummary generated from Manifest

* add delete file support for transaction

* fix record_batch_partition_spliter

* fix day transform

* fix RawLiteralEnum::Record

* fix nullable field of equality delete writer

* support deleting empty row files

* fix decimal parse for parquet statistics

---------

Co-authored-by: ZENOTME <[email protected]>
ZENOTME committed Dec 30, 2024
1 parent 54ef090 commit 667c173
Showing 25 changed files with 3,026 additions and 62 deletions.
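The commit message above centers on a new write path: a fanout partition writer, a sort position delete writer, and an equality delta writer, all built on a refined writer interface. As orientation only, here is a minimal sketch of the shape such an interface tends to take; every name and signature below is an illustrative assumption rather than the crate's actual API.

use std::io;

/// Hypothetical writer interface: batches go in through `write`, and `close`
/// returns whatever file metadata the writer produced (data files, equality
/// or position delete files, and so on). Names here are assumptions.
trait RecordWriter<I> {
    type Output;

    fn write(&mut self, input: I) -> io::Result<()>;
    fn close(self) -> io::Result<Self::Output>;
}

/// Toy implementation that only counts rows, to show how the trait composes.
struct CountingWriter {
    rows: usize,
}

impl RecordWriter<usize> for CountingWriter {
    type Output = usize;

    fn write(&mut self, batch_len: usize) -> io::Result<()> {
        self.rows += batch_len;
        Ok(())
    }

    fn close(self) -> io::Result<usize> {
        Ok(self.rows)
    }
}

A fanout partition writer would sit on top of such an interface, keeping one inner writer per partition value and routing each incoming row to the matching one.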
3 changes: 3 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -43,11 +43,13 @@ apache-avro = "0.17"
array-init = "2"
arrow-arith = { version = "53" }
arrow-array = { version = "53" }
arrow-buffer = { version = "53" }
arrow-cast = { version = "53" }
arrow-ord = { version = "53" }
arrow-schema = { version = "53" }
arrow-select = { version = "53" }
arrow-string = { version = "53" }
arrow-row = { version = "53" }
async-stream = "0.3.5"
async-trait = "0.1"
async-std = "1.12"
2 changes: 2 additions & 0 deletions crates/iceberg/Cargo.toml
@@ -46,8 +46,10 @@ apache-avro = { workspace = true }
array-init = { workspace = true }
arrow-arith = { workspace = true }
arrow-array = { workspace = true }
arrow-buffer = { workspace = true }
arrow-cast = { workspace = true }
arrow-ord = { workspace = true }
arrow-row = { workspace = true }
arrow-schema = { workspace = true }
arrow-select = { workspace = true }
arrow-string = { workspace = true }
5 changes: 4 additions & 1 deletion crates/iceberg/src/arrow/mod.rs
@@ -22,5 +22,8 @@ pub use schema::*;
mod reader;
pub(crate) mod record_batch_projector;
pub(crate) mod record_batch_transformer;

mod value;
pub use reader::*;
pub use value::*;
mod record_batch_partition_spliter;
pub(crate) use record_batch_partition_spliter::*;
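
The new record_batch_partition_spliter module (the spelling follows the module name in the diff) appears to split an incoming Arrow RecordBatch into per-partition groups before the rows are handed to partitioned writers. A minimal sketch of that grouping idea on plain values, independent of the crate's actual implementation:

use std::collections::HashMap;

/// Group row indices by their partition key. The real code works on Arrow
/// RecordBatch columns and Iceberg partition values; this sketch only shows
/// the grouping step on plain strings.
fn split_by_partition(partition_keys: &[String]) -> HashMap<String, Vec<usize>> {
    let mut groups: HashMap<String, Vec<usize>> = HashMap::new();
    for (row, key) in partition_keys.iter().enumerate() {
        groups.entry(key.clone()).or_default().push(row);
    }
    groups
}

Each group of row indices can then be used to take the corresponding rows out of the batch and hand them to the writer responsible for that partition.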
24 changes: 14 additions & 10 deletions crates/iceberg/src/arrow/reader.rs
@@ -39,6 +39,7 @@ use parquet::arrow::{ParquetRecordBatchStreamBuilder, ProjectionMask, PARQUET_FI
use parquet::file::metadata::{ParquetMetaData, ParquetMetaDataReader};
use parquet::schema::types::{SchemaDescriptor, Type as ParquetType};

use super::record_batch_transformer::RecordBatchTransformer;
use crate::arrow::{arrow_schema_to_schema, get_arrow_datum};
use crate::error::Result;
use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor};
@@ -51,8 +52,6 @@ use crate::spec::{DataContentType, Datum, PrimitiveType, Schema};
use crate::utils::available_parallelism;
use crate::{Error, ErrorKind};

use super::record_batch_transformer::RecordBatchTransformer;

/// Builder to create ArrowReader
pub struct ArrowReaderBuilder {
batch_size: Option<usize>,
@@ -345,14 +344,19 @@ impl ArrowReader {
if iceberg_field.is_none() || parquet_iceberg_field.is_none() {
return;
}

if !type_promotion_is_valid(
parquet_iceberg_field
.unwrap()
.field_type
.as_primitive_type(),
iceberg_field.unwrap().field_type.as_primitive_type(),
) {
if iceberg_field
.unwrap()
.field_type
.as_primitive_type()
.is_some()
&& !type_promotion_is_valid(
parquet_iceberg_field
.unwrap()
.field_type
.as_primitive_type(),
iceberg_field.unwrap().field_type.as_primitive_type(),
)
{
return;
}

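The reworked condition in the hunk above only runs type_promotion_is_valid when the Iceberg field actually resolves to a primitive type, so a non-primitive (for example struct) field no longer fails the check outright. A self-contained sketch of that guard pattern, with placeholder types standing in for the crate's real ones:

/// Placeholder field type standing in for the crate's real type enum.
enum FieldType {
    Primitive(String),
    Struct,
}

impl FieldType {
    fn as_primitive_type(&self) -> Option<&String> {
        match self {
            FieldType::Primitive(p) => Some(p),
            FieldType::Struct => None,
        }
    }
}

/// Stand-in for the real promotion check; only the surrounding guard matters here.
fn type_promotion_is_valid(from: Option<&String>, to: Option<&String>) -> bool {
    from == to
}

/// Mirrors the `is_some()` guard added in the diff: the promotion check is
/// skipped entirely unless the Iceberg field is a primitive type.
fn fields_compatible(iceberg: &FieldType, parquet: &FieldType) -> bool {
    if iceberg.as_primitive_type().is_some()
        && !type_promotion_is_valid(parquet.as_primitive_type(), iceberg.as_primitive_type())
    {
        return false;
    }
    true
}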
