Skip to content

Commit

Permalink
Merge Builder
Browse files Browse the repository at this point in the history
commit 2c9d7e4
Author: Christian Thiel <[email protected]>
Date:   Wed Oct 2 07:51:50 2024 +0200

    New builder

commit fea1817
Author: Christian Thiel <[email protected]>
Date:   Tue Oct 1 12:56:10 2024 +0200

    Merge branch 'feat/safe-partition-spec'

commit 1f49c95
Author: Christian Thiel <[email protected]>
Date:   Tue Oct 1 12:55:02 2024 +0200

    Merge branch 'feat/schema-reassign-field-ids'

commit cda4a0c
Author: congyi wang <[email protected]>
Date:   Mon Sep 30 11:07:36 2024 +0800

    chore: fix typo in FileIO Schemes  (apache#653)

    * fix typo

    * fix typo

commit af9609d
Author: Scott Donnelly <[email protected]>
Date:   Mon Sep 30 04:06:14 2024 +0100

    fix: page index evaluator min/max args inverted (apache#648)

    * fix: page index evaluator min/max args inverted

    * style: fix clippy lint in test

commit a6a3fd7
Author: Alon Agmon <[email protected]>
Date:   Sat Sep 28 10:10:08 2024 +0300

    test (datafusion): add test for table provider creation (apache#651)

    * add test for table provider creation

    * fix formatting

    * fixing yet another formatting issue

    * testing schema using data fusion

    ---------

    Co-authored-by: Alon Agmon <[email protected]>

commit 87483b4
Author: Alon Agmon <[email protected]>
Date:   Fri Sep 27 04:40:08 2024 +0300

    making table provider pub (apache#650)

    Co-authored-by: Alon Agmon <[email protected]>

commit 984c91e
Author: ZENOTME <[email protected]>
Date:   Thu Sep 26 17:56:02 2024 +0800

    avoid to create memory schema operator every time (apache#635)

    Co-authored-by: ZENOTME <[email protected]>

commit 4171275
Author: Matheus Alcantara <[email protected]>
Date:   Wed Sep 25 08:28:42 2024 -0300

    scan: change ErrorKind when table dont have spanshots (apache#608)

commit ab51355
Author: xxchan <[email protected]>
Date:   Tue Sep 24 21:25:45 2024 +0800

    fix: compile error due to merge stale PR (apache#646)

    Signed-off-by: xxchan <[email protected]>

commit 420b4e2
Author: Scott Donnelly <[email protected]>
Date:   Tue Sep 24 08:20:23 2024 +0100

    Table Scan: Add Row Selection Filtering (apache#565)

    * feat(scan): add row selection capability via PageIndexEvaluator

    * test(row-selection): add first few row selection tests

    * feat(scan): add more tests, fix bug where min/max args swapped

    * fix: ad test and fix for logic bug in PageIndexEvaluator in-clause handler

    * feat: changes suggested from PR review

commit b3709ba
Author: Christian <[email protected]>
Date:   Tue Sep 24 04:47:04 2024 +0200

    feat: Add NamespaceIdent.parent() (apache#641)

    * Add NamespaceIdent.parent()

    * Use split_last

commit 1533c43
Author: Alon Agmon <[email protected]>
Date:   Mon Sep 23 13:39:46 2024 +0300

    feat (datafusion integration): convert datafusion expr filters to Iceberg Predicate (apache#588)

    * adding main function and tests

    * adding tests, removing integration test for now

    * fixing typos and lints

    * fixing typing issue

    * - added support in schmema to convert Date32 to correct arrow type
    - refactored scan to use new predicate converter as visitor and seperated it to a new mod
    - added support for simple predicates with column cast expressions
    - added testing, mostly around date functions

    * fixing format and lic

    * reducing number of tests (17 -> 7)

    * fix formats

    * fix naming

    * refactoring to use TreeNodeVisitor

    * fixing fmt

    * small refactor

    * adding swapped op and fixing CR comments

    ---------

    Co-authored-by: Alon Agmon <[email protected]>

commit e967deb
Author: xxchan <[email protected]>
Date:   Mon Sep 23 18:34:59 2024 +0800

    feat: expose remove_all in FileIO (apache#643)

    Signed-off-by: xxchan <[email protected]>

commit d03c4f8
Author: Scott Donnelly <[email protected]>
Date:   Mon Sep 23 08:28:52 2024 +0100

    Migrate to arrow-* v53 (apache#626)

    * chore: migrate to arrow-* v53

    * chore: update datafusion to 42

    * test: fix incorrect test assertion

    * chore: update python bindings to arrow 53

commit 88e5e4a
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Sep 23 15:26:18 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.24.5 to 1.24.6 (apache#640)

    Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.24.5 to 1.24.6.
    - [Release notes](https://github.com/crate-ci/typos/releases)
    - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
    - [Commits](crate-ci/typos@v1.24.5...v1.24.6)

    ---
    updated-dependencies:
    - dependency-name: crate-ci/typos
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit c354983
Author: xxchan <[email protected]>
Date:   Mon Sep 23 14:50:18 2024 +0800

    doc: improve FileIO doc (apache#642)

    Signed-off-by: xxchan <[email protected]>

commit 12e12e2
Author: xxchan <[email protected]>
Date:   Fri Sep 20 19:59:55 2024 +0800

    feat: expose arrow type <-> iceberg type (apache#637)

    * feat: expose arrow type <-> iceberg type

    Previously we only exposed the schema conversion.

    Signed-off-by: xxchan <[email protected]>

    * add tests

    Signed-off-by: xxchan <[email protected]>

    ---------

    Signed-off-by: xxchan <[email protected]>

commit 3b27c9e
Author: xxchan <[email protected]>
Date:   Fri Sep 20 18:32:31 2024 +0800

    feat: add Sync to TransformFunction (apache#638)

    Signed-off-by: xxchan <[email protected]>

commit 34cb81c
Author: Xuanwo <[email protected]>
Date:   Wed Sep 18 20:18:40 2024 +0800

    chore: Bump opendal to 0.50 (apache#634)

commit cde35ab
Author: FANNG <[email protected]>
Date:   Fri Sep 13 10:01:16 2024 +0800

    feat: support projection pushdown for datafusion iceberg (apache#594)

    * support projection pushdown for datafusion iceberg

    * support projection pushdown for datafusion iceberg

    * fix ci

    * fix field id

    * remove depencences

    * remove depencences

commit eae9464
Author: Xuanwo <[email protected]>
Date:   Thu Sep 12 02:06:31 2024 +0800

    refactor(python): Expose transform as a submodule for pyiceberg_core (apache#628)

commit 8a3de4e
Author: Christian <[email protected]>
Date:   Mon Sep 9 14:45:16 2024 +0200

    Feat: Normalize TableMetadata (apache#611)

    * Normalize Table Metadata

    * Improve readability & comments

commit e08c0e5
Author: Renjie Liu <[email protected]>
Date:   Mon Sep 9 11:57:22 2024 +0800

    fix: Correctly calculate highest_field_id in schema (apache#590)

commit f78c59b
Author: Jack <[email protected]>
Date:   Mon Sep 9 03:35:16 2024 +0100

    feat: add `client.region` (apache#623)

commit a5aba9a
Author: Christian <[email protected]>
Date:   Sun Sep 8 18:36:05 2024 +0200

    feat: SortOrder methods should take schema ref if possible (apache#613)

    * SortOrder methods should take schema ref if possible

    * Fix test type

    * with_order_id should not take reference

commit 5812399
Author: Christian <[email protected]>
Date:   Sun Sep 8 18:18:41 2024 +0200

    feat: partition compatibility (apache#612)

    * Partition compatability

    * Partition compatability

    * Rename compatible_with -> is_compatible_with

commit ede4720
Author: Christian <[email protected]>
Date:   Sun Sep 8 16:49:39 2024 +0200

    fix: Less Panics for Snapshot timestamps (apache#614)

commit ced661f
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Sun Sep 8 22:43:38 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.24.3 to 1.24.5 (apache#616)

    Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.24.3 to 1.24.5.
    - [Release notes](https://github.com/crate-ci/typos/releases)
    - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
    - [Commits](crate-ci/typos@v1.24.3...v1.24.5)

    ---
    updated-dependencies:
    - dependency-name: crate-ci/typos
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit cbbd086
Author: Xuanwo <[email protected]>
Date:   Sun Sep 8 10:29:31 2024 +0800

    feat: Add more fields in FileScanTask (apache#609)

    Signed-off-by: Xuanwo <[email protected]>

commit 620d58e
Author: Callum Ryan <[email protected]>
Date:   Thu Sep 5 03:44:55 2024 +0100

    feat: SQL Catalog - namespaces (apache#534)

    * feat: SQL Catalog - namespaces

    Signed-off-by: callum-ryan <[email protected]>

    * feat: use transaction for updates and creates

    Signed-off-by: callum-ryan <[email protected]>

    * fix: pull out query param builder to fn

    Signed-off-by: callum-ryan <[email protected]>

    * feat: add drop and tests

    Signed-off-by: callum-ryan <[email protected]>

    * fix: String to str, remove pub and optimise query builder

    Signed-off-by: callum-ryan <[email protected]>

    * fix: nested match, remove ok()

    Signed-off-by: callum-ryan <[email protected]>

    * fix: remove pub, add set, add comments

    Signed-off-by: callum-ryan <[email protected]>

    * fix: refactor list_namespaces slightly

    Signed-off-by: callum-ryan <[email protected]>

    * fix: add default properties to all new namespaces

    Signed-off-by: callum-ryan <[email protected]>

    * fix: remove check for nested namespace

    Signed-off-by: callum-ryan <[email protected]>

    * chore: add more comments to the CatalogConfig to explain bind styles

    Signed-off-by: callum-ryan <[email protected]>

    * fix: edit test for nested namespaces

    Signed-off-by: callum-ryan <[email protected]>

    ---------

    Signed-off-by: callum-ryan <[email protected]>

commit ae75f96
Author: Søren Dalby Larsen <[email protected]>
Date:   Tue Sep 3 13:46:48 2024 +0200

    chore: bump crate-ci/typos to 1.24.3 (apache#598)

commit 7aa8bdd
Author: Scott Donnelly <[email protected]>
Date:   Thu Aug 29 04:37:48 2024 +0100

    Table Scan: Add Row Group Skipping (apache#558)

    * feat(scan): add row group and page index row selection filtering

    * fix(row selection): off-by-one error

    * feat: remove row selection to defer to a second PR

    * feat: better min/max val conversion in RowGroupMetricsEvaluator

    * test(row_group_filtering): first three tests

    * test(row_group_filtering): next few tests

    * test: add more tests for RowGroupMetricsEvaluator

    * chore: refactor test assertions to silence clippy lints

    * refactor: consolidate parquet stat min/max parsing in one place

commit da08e8d
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Aug 28 14:35:55 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.23.6 to 1.24.1 (apache#583)

commit ecbb4c3
Author: Sung Yun <[email protected]>
Date:   Mon Aug 26 23:57:01 2024 -0400

    Expose Transforms to Python Binding (apache#556)

    * bucket transform rust binding

    * format

    * poetry x maturin

    * ignore poetry.lock in license check

    * update bindings_python_ci to use makefile

    * newline

    * python-poetry/poetry#9135

    * use hatch instead of poetry

    * refactor

    * revert licenserc change

    * adopt review feedback

    * comments

    * unused dependency

    * adopt review comment

    * newline

    * I like this approach a lot better

    * more tests

commit 905ebd2
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Aug 26 20:49:07 2024 +0800

    chore(deps): Update typed-builder requirement from 0.19 to 0.20 (apache#582)

    ---
    updated-dependencies:
    - dependency-name: typed-builder
      dependency-type: direct:production
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit f9c92b7
Author: FANNG <[email protected]>
Date:   Sun Aug 25 22:31:36 2024 +0800

    fix: Update sqlx from 0.8.0 to 0.8.1 (apache#584)

commit ba66665
Author: FANNG <[email protected]>
Date:   Sat Aug 24 12:35:36 2024 +0800

    fix: correct partition-id to field-id in UnboundPartitionField (apache#576)

    * correct partition-id to field id in PartitionSpec

    * correct partition-id to field id in PartitionSpec

    * correct partition-id to field id in PartitionSpec

    * xx
  • Loading branch information
c-thiel committed Oct 2, 2024
1 parent 12d766f commit 7b44d31
Show file tree
Hide file tree
Showing 21 changed files with 3,496 additions and 745 deletions.
4 changes: 3 additions & 1 deletion crates/catalog/glue/src/catalog.rs
Original file line number Diff line number Diff line change
Expand Up @@ -355,7 +355,9 @@ impl Catalog for GlueCatalog {
}
};

let metadata = TableMetadataBuilder::from_table_creation(creation)?.build()?;
let metadata = TableMetadataBuilder::from_table_creation(creation)?
.build()?
.metadata;
let metadata_location = create_metadata_location(&location, 0)?;

self.file_io
Expand Down
4 changes: 3 additions & 1 deletion crates/catalog/glue/src/schema.rs
Original file line number Diff line number Diff line change
Expand Up @@ -198,7 +198,9 @@ mod tests {
.location("my_location".to_string())
.schema(schema)
.build();
let metadata = TableMetadataBuilder::from_table_creation(table_creation)?.build()?;
let metadata = TableMetadataBuilder::from_table_creation(table_creation)?
.build()?
.metadata;

Ok(metadata)
}
Expand Down
4 changes: 3 additions & 1 deletion crates/catalog/glue/src/utils.rs
Original file line number Diff line number Diff line change
Expand Up @@ -299,7 +299,9 @@ mod tests {
.location("my_location".to_string())
.schema(schema)
.build();
let metadata = TableMetadataBuilder::from_table_creation(table_creation)?.build()?;
let metadata = TableMetadataBuilder::from_table_creation(table_creation)?
.build()?
.metadata;

Ok(metadata)
}
Expand Down
4 changes: 3 additions & 1 deletion crates/catalog/hms/src/catalog.rs
Original file line number Diff line number Diff line change
Expand Up @@ -346,7 +346,9 @@ impl Catalog for HmsCatalog {
}
};

let metadata = TableMetadataBuilder::from_table_creation(creation)?.build()?;
let metadata = TableMetadataBuilder::from_table_creation(creation)?
.build()?
.metadata;
let metadata_location = create_metadata_location(&location, 0)?;

self.file_io
Expand Down
8 changes: 5 additions & 3 deletions crates/catalog/memory/src/catalog.rs
Original file line number Diff line number Diff line change
Expand Up @@ -194,7 +194,9 @@ impl Catalog for MemoryCatalog {
}
};

let metadata = TableMetadataBuilder::from_table_creation(table_creation)?.build()?;
let metadata = TableMetadataBuilder::from_table_creation(table_creation)?
.build()?
.metadata;
let metadata_location = format!(
"{}/metadata/{}-{}.metadata.json",
&location,
Expand Down Expand Up @@ -355,7 +357,7 @@ mod tests {

assert_eq!(metadata.current_schema().as_ref(), expected_schema);

let expected_partition_spec = PartitionSpec::builder(expected_schema)
let expected_partition_spec = PartitionSpec::builder((*expected_schema).clone())
.with_spec_id(0)
.build()
.unwrap();
Expand All @@ -365,7 +367,7 @@ mod tests {
.partition_specs_iter()
.map(|p| p.as_ref())
.collect_vec(),
vec![&expected_partition_spec]
vec![&expected_partition_spec.into_schemaless()]
);

let expected_sorted_order = SortOrder::builder()
Expand Down
20 changes: 10 additions & 10 deletions crates/iceberg/src/arrow/schema.rs
Original file line number Diff line number Diff line change
Expand Up @@ -826,8 +826,8 @@ mod tests {

fn arrow_schema_for_arrow_schema_to_schema_test() -> ArrowSchema {
let fields = Fields::from(vec![
simple_field("key", DataType::Int32, false, "17"),
simple_field("value", DataType::Utf8, true, "18"),
simple_field("key", DataType::Int32, false, "28"),
simple_field("value", DataType::Utf8, true, "29"),
]);

let r#struct = DataType::Struct(fields);
Expand Down Expand Up @@ -1057,9 +1057,9 @@ mod tests {
"required": true,
"type": {
"type": "map",
"key-id": 17,
"key-id": 28,
"key": "int",
"value-id": 18,
"value-id": 29,
"value-required": false,
"value": "string"
}
Expand Down Expand Up @@ -1110,8 +1110,8 @@ mod tests {

fn arrow_schema_for_schema_to_arrow_schema_test() -> ArrowSchema {
let fields = Fields::from(vec![
simple_field("key", DataType::Int32, false, "17"),
simple_field("value", DataType::Utf8, true, "18"),
simple_field("key", DataType::Int32, false, "28"),
simple_field("value", DataType::Utf8, true, "29"),
]);

let r#struct = DataType::Struct(fields);
Expand Down Expand Up @@ -1200,7 +1200,7 @@ mod tests {
),
simple_field("map", map, false, "16"),
simple_field("struct", r#struct, false, "17"),
simple_field("uuid", DataType::FixedSizeBinary(16), false, "26"),
simple_field("uuid", DataType::FixedSizeBinary(16), false, "30"),
])
}

Expand Down Expand Up @@ -1344,9 +1344,9 @@ mod tests {
"required": true,
"type": {
"type": "map",
"key-id": 17,
"key-id": 28,
"key": "int",
"value-id": 18,
"value-id": 29,
"value-required": false,
"value": "string"
}
Expand Down Expand Up @@ -1380,7 +1380,7 @@ mod tests {
}
},
{
"id":26,
"id":30,
"name":"uuid",
"required":true,
"type":"uuid"
Expand Down
53 changes: 48 additions & 5 deletions crates/iceberg/src/catalog/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -445,8 +445,46 @@ impl TableUpdate {
/// Applies the update to the table metadata builder.
pub fn apply(self, builder: TableMetadataBuilder) -> Result<TableMetadataBuilder> {
match self {
TableUpdate::AssignUuid { uuid } => builder.assign_uuid(uuid),
_ => unimplemented!(),
TableUpdate::AssignUuid { uuid } => Ok(builder.assign_uuid(uuid)),
TableUpdate::AddSchema {
schema,
last_column_id,
} => {
if let Some(last_column_id) = last_column_id {
if builder.last_column_id() < last_column_id {
return Err(Error::new(
ErrorKind::DataInvalid,
format!(
"Invalid last column ID: {last_column_id} < {} (previous last column ID)",
builder.last_column_id()
),
));
}
};
Ok(builder.add_schema(schema))
}
TableUpdate::SetCurrentSchema { schema_id } => builder.set_current_schema(schema_id),
TableUpdate::AddSpec { spec } => builder.add_partition_spec(spec),
TableUpdate::SetDefaultSpec { spec_id } => builder.set_default_partition_spec(spec_id),
TableUpdate::AddSortOrder { sort_order } => builder.add_sort_order(sort_order),
TableUpdate::SetDefaultSortOrder { sort_order_id } => {
builder.set_default_sort_order(sort_order_id)
}
TableUpdate::AddSnapshot { snapshot } => builder.add_snapshot(snapshot),
TableUpdate::SetSnapshotRef {
ref_name,
reference,
} => builder.set_ref(&ref_name, reference),
TableUpdate::RemoveSnapshots { snapshot_ids } => {
Ok(builder.remove_snapshots(&snapshot_ids))
}
TableUpdate::RemoveSnapshotRef { ref_name } => Ok(builder.remove_ref(&ref_name)),
TableUpdate::SetLocation { location } => Ok(builder.set_location(location)),
TableUpdate::SetProperties { updates } => builder.set_properties(updates),
TableUpdate::RemoveProperties { removals } => Ok(builder.remove_properties(&removals)),
TableUpdate::UpgradeFormatVersion { format_version } => {
builder.upgrade_format_version(format_version)
}
}
}
}
Expand Down Expand Up @@ -1183,16 +1221,21 @@ mod tests {
let table_metadata = TableMetadataBuilder::from_table_creation(table_creation)
.unwrap()
.build()
.unwrap();
let table_metadata_builder = TableMetadataBuilder::new(table_metadata);
.unwrap()
.metadata;
let table_metadata_builder = TableMetadataBuilder::new_from_metadata(
table_metadata,
"s3://db/table/metadata/metadata1.gz.json",
);

let uuid = uuid::Uuid::new_v4();
let update = TableUpdate::AssignUuid { uuid };
let updated_metadata = update
.apply(table_metadata_builder)
.unwrap()
.build()
.unwrap();
.unwrap()
.metadata;
assert_eq!(updated_metadata.uuid(), uuid);
}
}
Loading

0 comments on commit 7b44d31

Please sign in to comment.