Squashed commit of the following:
commit cda4a0c
Author: congyi wang <[email protected]>
Date:   Mon Sep 30 11:07:36 2024 +0800

    chore: fix typo in FileIO Schemes  (apache#653)

    * fix typo

    * fix typo

commit af9609d
Author: Scott Donnelly <[email protected]>
Date:   Mon Sep 30 04:06:14 2024 +0100

    fix: page index evaluator min/max args inverted (apache#648)

    * fix: page index evaluator min/max args inverted

    * style: fix clippy lint in test

commit a6a3fd7
Author: Alon Agmon <[email protected]>
Date:   Sat Sep 28 10:10:08 2024 +0300

    test (datafusion): add test for table provider creation (apache#651)

    * add test for table provider creation

    * fix formatting

    * fixing yet another formatting issue

    * testing schema using data fusion

    ---------

    Co-authored-by: Alon Agmon <[email protected]>

commit 87483b4
Author: Alon Agmon <[email protected]>
Date:   Fri Sep 27 04:40:08 2024 +0300

    making table provider pub (apache#650)

    Co-authored-by: Alon Agmon <[email protected]>

commit 984c91e
Author: ZENOTME <[email protected]>
Date:   Thu Sep 26 17:56:02 2024 +0800

    avoid to create memory schema operator every time (apache#635)

    Co-authored-by: ZENOTME <[email protected]>

commit 4171275
Author: Matheus Alcantara <[email protected]>
Date:   Wed Sep 25 08:28:42 2024 -0300

    scan: change ErrorKind when table doesn't have snapshots (apache#608)

commit ab51355
Author: xxchan <[email protected]>
Date:   Tue Sep 24 21:25:45 2024 +0800

    fix: compile error due to merge stale PR (apache#646)

    Signed-off-by: xxchan <[email protected]>

commit 420b4e2
Author: Scott Donnelly <[email protected]>
Date:   Tue Sep 24 08:20:23 2024 +0100

    Table Scan: Add Row Selection Filtering (apache#565)

    * feat(scan): add row selection capability via PageIndexEvaluator

    * test(row-selection): add first few row selection tests

    * feat(scan): add more tests, fix bug where min/max args swapped

    * fix: add test and fix for logic bug in PageIndexEvaluator in-clause handler

    * feat: changes suggested from PR review

commit b3709ba
Author: Christian <[email protected]>
Date:   Tue Sep 24 04:47:04 2024 +0200

    feat: Add NamespaceIdent.parent() (apache#641)

    * Add NamespaceIdent.parent()

    * Use split_last

commit 1533c43
Author: Alon Agmon <[email protected]>
Date:   Mon Sep 23 13:39:46 2024 +0300

    feat (datafusion integration): convert datafusion expr filters to Iceberg Predicate (apache#588)

    * adding main function and tests

    * adding tests, removing integration test for now

    * fixing typos and lints

    * fixing typing issue

    * - added support in schema to convert Date32 to correct arrow type
    - refactored scan to use new predicate converter as visitor and separated it to a new mod
    - added support for simple predicates with column cast expressions
    - added testing, mostly around date functions

    * fixing format and lic

    * reducing number of tests (17 -> 7)

    * fix formats

    * fix naming

    * refactoring to use TreeNodeVisitor

    * fixing fmt

    * small refactor

    * adding swapped op and fixing CR comments

    ---------

    Co-authored-by: Alon Agmon <[email protected]>

commit e967deb
Author: xxchan <[email protected]>
Date:   Mon Sep 23 18:34:59 2024 +0800

    feat: expose remove_all in FileIO (apache#643)

    Signed-off-by: xxchan <[email protected]>

commit d03c4f8
Author: Scott Donnelly <[email protected]>
Date:   Mon Sep 23 08:28:52 2024 +0100

    Migrate to arrow-* v53 (apache#626)

    * chore: migrate to arrow-* v53

    * chore: update datafusion to 42

    * test: fix incorrect test assertion

    * chore: update python bindings to arrow 53

commit 88e5e4a
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Sep 23 15:26:18 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.24.5 to 1.24.6 (apache#640)

    Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.24.5 to 1.24.6.
    - [Release notes](https://github.com/crate-ci/typos/releases)
    - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
    - [Commits](crate-ci/typos@v1.24.5...v1.24.6)

    ---
    updated-dependencies:
    - dependency-name: crate-ci/typos
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit c354983
Author: xxchan <[email protected]>
Date:   Mon Sep 23 14:50:18 2024 +0800

    doc: improve FileIO doc (apache#642)

    Signed-off-by: xxchan <[email protected]>

commit 12e12e2
Author: xxchan <[email protected]>
Date:   Fri Sep 20 19:59:55 2024 +0800

    feat: expose arrow type <-> iceberg type (apache#637)

    * feat: expose arrow type <-> iceberg type

    Previously we only exposed the schema conversion.

    Signed-off-by: xxchan <[email protected]>

    * add tests

    Signed-off-by: xxchan <[email protected]>

    ---------

    Signed-off-by: xxchan <[email protected]>

commit 3b27c9e
Author: xxchan <[email protected]>
Date:   Fri Sep 20 18:32:31 2024 +0800

    feat: add Sync to TransformFunction (apache#638)

    Signed-off-by: xxchan <[email protected]>

commit 34cb81c
Author: Xuanwo <[email protected]>
Date:   Wed Sep 18 20:18:40 2024 +0800

    chore: Bump opendal to 0.50 (apache#634)

commit cde35ab
Author: FANNG <[email protected]>
Date:   Fri Sep 13 10:01:16 2024 +0800

    feat: support projection pushdown for datafusion iceberg (apache#594)

    * support projection pushdown for datafusion iceberg

    * support projection pushdown for datafusion iceberg

    * fix ci

    * fix field id

    * remove dependencies

    * remove dependencies

commit eae9464
Author: Xuanwo <[email protected]>
Date:   Thu Sep 12 02:06:31 2024 +0800

    refactor(python): Expose transform as a submodule for pyiceberg_core (apache#628)

commit 8a3de4e
Author: Christian <[email protected]>
Date:   Mon Sep 9 14:45:16 2024 +0200

    Feat: Normalize TableMetadata (apache#611)

    * Normalize Table Metadata

    * Improve readability & comments

commit e08c0e5
Author: Renjie Liu <[email protected]>
Date:   Mon Sep 9 11:57:22 2024 +0800

    fix: Correctly calculate highest_field_id in schema (apache#590)

commit f78c59b
Author: Jack <[email protected]>
Date:   Mon Sep 9 03:35:16 2024 +0100

    feat: add `client.region` (apache#623)

commit a5aba9a
Author: Christian <[email protected]>
Date:   Sun Sep 8 18:36:05 2024 +0200

    feat: SortOrder methods should take schema ref if possible (apache#613)

    * SortOrder methods should take schema ref if possible

    * Fix test type

    * with_order_id should not take reference

commit 5812399
Author: Christian <[email protected]>
Date:   Sun Sep 8 18:18:41 2024 +0200

    feat: partition compatibility (apache#612)

    * Partition compatibility

    * Partition compatibility

    * Rename compatible_with -> is_compatible_with

commit ede4720
Author: Christian <[email protected]>
Date:   Sun Sep 8 16:49:39 2024 +0200

    fix: Less Panics for Snapshot timestamps (apache#614)

commit ced661f
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Sun Sep 8 22:43:38 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.24.3 to 1.24.5 (apache#616)

    Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.24.3 to 1.24.5.
    - [Release notes](https://github.com/crate-ci/typos/releases)
    - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
    - [Commits](crate-ci/typos@v1.24.3...v1.24.5)

    ---
    updated-dependencies:
    - dependency-name: crate-ci/typos
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit cbbd086
Author: Xuanwo <[email protected]>
Date:   Sun Sep 8 10:29:31 2024 +0800

    feat: Add more fields in FileScanTask (apache#609)

    Signed-off-by: Xuanwo <[email protected]>

commit 620d58e
Author: Callum Ryan <[email protected]>
Date:   Thu Sep 5 03:44:55 2024 +0100

    feat: SQL Catalog - namespaces (apache#534)

    * feat: SQL Catalog - namespaces

    Signed-off-by: callum-ryan <[email protected]>

    * feat: use transaction for updates and creates

    Signed-off-by: callum-ryan <[email protected]>

    * fix: pull out query param builder to fn

    Signed-off-by: callum-ryan <[email protected]>

    * feat: add drop and tests

    Signed-off-by: callum-ryan <[email protected]>

    * fix: String to str, remove pub and optimise query builder

    Signed-off-by: callum-ryan <[email protected]>

    * fix: nested match, remove ok()

    Signed-off-by: callum-ryan <[email protected]>

    * fix: remove pub, add set, add comments

    Signed-off-by: callum-ryan <[email protected]>

    * fix: refactor list_namespaces slightly

    Signed-off-by: callum-ryan <[email protected]>

    * fix: add default properties to all new namespaces

    Signed-off-by: callum-ryan <[email protected]>

    * fix: remove check for nested namespace

    Signed-off-by: callum-ryan <[email protected]>

    * chore: add more comments to the CatalogConfig to explain bind styles

    Signed-off-by: callum-ryan <[email protected]>

    * fix: edit test for nested namespaces

    Signed-off-by: callum-ryan <[email protected]>

    ---------

    Signed-off-by: callum-ryan <[email protected]>

commit ae75f96
Author: Søren Dalby Larsen <[email protected]>
Date:   Tue Sep 3 13:46:48 2024 +0200

    chore: bump crate-ci/typos to 1.24.3 (apache#598)

commit 7aa8bdd
Author: Scott Donnelly <[email protected]>
Date:   Thu Aug 29 04:37:48 2024 +0100

    Table Scan: Add Row Group Skipping (apache#558)

    * feat(scan): add row group and page index row selection filtering

    * fix(row selection): off-by-one error

    * feat: remove row selection to defer to a second PR

    * feat: better min/max val conversion in RowGroupMetricsEvaluator

    * test(row_group_filtering): first three tests

    * test(row_group_filtering): next few tests

    * test: add more tests for RowGroupMetricsEvaluator

    * chore: refactor test assertions to silence clippy lints

    * refactor: consolidate parquet stat min/max parsing in one place

commit da08e8d
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Aug 28 14:35:55 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.23.6 to 1.24.1 (apache#583)

commit ecbb4c3
Author: Sung Yun <[email protected]>
Date:   Mon Aug 26 23:57:01 2024 -0400

    Expose Transforms to Python Binding (apache#556)

    * bucket transform rust binding

    * format

    * poetry x maturin

    * ignore poetry.lock in license check

    * update bindings_python_ci to use makefile

    * newline

    * python-poetry/poetry#9135

    * use hatch instead of poetry

    * refactor

    * revert licenserc change

    * adopt review feedback

    * comments

    * unused dependency

    * adopt review comment

    * newline

    * I like this approach a lot better

    * more tests

commit 905ebd2
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Aug 26 20:49:07 2024 +0800

    chore(deps): Update typed-builder requirement from 0.19 to 0.20 (apache#582)

    ---
    updated-dependencies:
    - dependency-name: typed-builder
      dependency-type: direct:production
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit f9c92b7
Author: FANNG <[email protected]>
Date:   Sun Aug 25 22:31:36 2024 +0800

    fix: Update sqlx from 0.8.0 to 0.8.1 (apache#584)

commit ba66665
Author: FANNG <[email protected]>
Date:   Sat Aug 24 12:35:36 2024 +0800

    fix: correct partition-id to field-id in UnboundPartitionField (apache#576)

    * correct partition-id to field id in PartitionSpec

    * correct partition-id to field id in PartitionSpec

    * correct partition-id to field id in PartitionSpec

    * xx
c-thiel committed Oct 2, 2024
1 parent 72d797c commit 12d766f
Showing 49 changed files with 7,211 additions and 618 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/bindings_python_ci.yml
@@ -80,4 +80,4 @@ jobs:
 set -e
 pip install hatch==1.12.0
 hatch run dev:pip install dist/pyiceberg_core-*.whl --force-reinstall
-hatch run dev:test
+hatch run dev:test
2 changes: 1 addition & 1 deletion .github/workflows/ci_typos.yml
@@ -42,4 +42,4 @@ jobs:
 steps:
 - uses: actions/checkout@v4
 - name: Check typos
-  uses: crate-ci/typos@v1.23.6
+  uses: crate-ci/typos@v1.24.6
2 changes: 2 additions & 0 deletions .gitignore
@@ -24,3 +24,5 @@ dist/*
 **/venv
 *.so
 *.pyc
+*.whl
+*.tar.gz
19 changes: 11 additions & 8 deletions Cargo.toml
@@ -39,12 +39,12 @@ rust-version = "1.77.1"
 anyhow = "1.0.72"
 apache-avro = "0.17"
 array-init = "2"
-arrow-arith = { version = "52" }
-arrow-array = { version = "52" }
-arrow-ord = { version = "52" }
-arrow-schema = { version = "52" }
-arrow-select = { version = "52" }
-arrow-string = { version = "52" }
+arrow-arith = { version = "53" }
+arrow-array = { version = "53" }
+arrow-ord = { version = "53" }
+arrow-schema = { version = "53" }
+arrow-select = { version = "53" }
+arrow-string = { version = "53" }
 async-stream = "0.3.5"
 async-trait = "0.1"
 async-std = "1.12"
@@ -64,17 +64,20 @@ iceberg = { version = "0.3.0", path = "./crates/iceberg" }
iceberg-catalog-rest = { version = "0.3.0", path = "./crates/catalog/rest" }
iceberg-catalog-hms = { version = "0.3.0", path = "./crates/catalog/hms" }
iceberg-catalog-memory = { version = "0.3.0", path = "./crates/catalog/memory" }
iceberg-datafusion = { version = "0.3.0", path = "./crates/integrations/datafusion" }
itertools = "0.13"
log = "0.4"
mockito = "1"
murmur3 = "0.5.2"
once_cell = "1"
opendal = { git = "https://github.com/twuebi/opendal.git", rev = "a9e3d88e97" }
ordered-float = "4"
parquet = "52"
parquet = "53"
paste = "1"
pilota = "0.11.2"
pretty_assertions = "1.4"
port_scanner = "0.1.5"
rand = "0.8"
regex = "1.10.5"
reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls"] }
rust_decimal = "1.31"
@@ -87,7 +90,7 @@ serde_with = "3.4"
 strum = "0.26.3"
 tempfile = "3.8"
 tokio = { version = "1", default-features = false }
-typed-builder = "0.19"
+typed-builder = "0.20"
 url = "2"
 urlencoding = "2"
 uuid = { version = "1.6.1", features = ["v7"] }
3 changes: 2 additions & 1 deletion bindings/python/Cargo.toml
@@ -32,4 +32,5 @@ crate-type = ["cdylib"]

 [dependencies]
 iceberg = { path = "../../crates/iceberg" }
-pyo3 = { version = "0.22", features = ["extension-module"] }
+pyo3 = { version = "0.22.3", features = ["extension-module"] }
+arrow = { version = "53", features = ["pyarrow"] }
1 change: 1 addition & 0 deletions bindings/python/pyproject.toml
@@ -43,6 +43,7 @@ ignore = ["F403", "F405"]
 dependencies = [
     "maturin>=1.0,<2.0",
     "pytest>=8.3.2",
+    "pyarrow>=17.0.0",
 ]

 [tool.hatch.envs.dev.scripts]
24 changes: 24 additions & 0 deletions bindings/python/src/error.rs
@@ -0,0 +1,24 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

use pyo3::exceptions::PyValueError;
use pyo3::PyErr;

/// Convert an iceberg error to a python error
pub fn to_py_err(err: iceberg::Error) -> PyErr {
    PyValueError::new_err(err.to_string())
}
12 changes: 4 additions & 8 deletions bindings/python/src/lib.rs
@@ -15,17 +15,13 @@
 // specific language governing permissions and limitations
 // under the License.

-use iceberg::io::FileIOBuilder;
 use pyo3::prelude::*;

-#[pyfunction]
-fn hello_world() -> PyResult<String> {
-    let _ = FileIOBuilder::new_fs_io().build().unwrap();
-    Ok("Hello, world!".to_string())
-}
+mod error;
+mod transform;

 #[pymodule]
-fn pyiceberg_core_rust(m: &Bound<'_, PyModule>) -> PyResult<()> {
-    m.add_function(wrap_pyfunction!(hello_world, m)?)?;
+fn pyiceberg_core_rust(py: Python<'_>, m: &Bound<'_, PyModule>) -> PyResult<()> {
+    transform::register_module(py, m)?;
     Ok(())
 }
93 changes: 93 additions & 0 deletions bindings/python/src/transform.rs
@@ -0,0 +1,93 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

use arrow::array::{make_array, Array, ArrayData};
use arrow::pyarrow::{FromPyArrow, ToPyArrow};
use iceberg::spec::Transform;
use iceberg::transform::create_transform_function;
use pyo3::prelude::*;

use crate::error::to_py_err;

#[pyfunction]
pub fn identity(py: Python, array: PyObject) -> PyResult<PyObject> {
    apply(py, array, Transform::Identity)
}

#[pyfunction]
pub fn void(py: Python, array: PyObject) -> PyResult<PyObject> {
    apply(py, array, Transform::Void)
}

#[pyfunction]
pub fn year(py: Python, array: PyObject) -> PyResult<PyObject> {
    apply(py, array, Transform::Year)
}

#[pyfunction]
pub fn month(py: Python, array: PyObject) -> PyResult<PyObject> {
    apply(py, array, Transform::Month)
}

#[pyfunction]
pub fn day(py: Python, array: PyObject) -> PyResult<PyObject> {
    apply(py, array, Transform::Day)
}

#[pyfunction]
pub fn hour(py: Python, array: PyObject) -> PyResult<PyObject> {
    apply(py, array, Transform::Hour)
}

#[pyfunction]
pub fn bucket(py: Python, array: PyObject, num_buckets: u32) -> PyResult<PyObject> {
    apply(py, array, Transform::Bucket(num_buckets))
}

#[pyfunction]
pub fn truncate(py: Python, array: PyObject, width: u32) -> PyResult<PyObject> {
    apply(py, array, Transform::Truncate(width))
}

fn apply(py: Python, array: PyObject, transform: Transform) -> PyResult<PyObject> {
    // import
    let array = ArrayData::from_pyarrow_bound(array.bind(py))?;
    let array = make_array(array);
    let transform_function = create_transform_function(&transform).map_err(to_py_err)?;
    let array = transform_function.transform(array).map_err(to_py_err)?;
    // export
    let array = array.into_data();
    array.to_pyarrow(py)
}

pub fn register_module(py: Python<'_>, m: &Bound<'_, PyModule>) -> PyResult<()> {
    let this = PyModule::new_bound(py, "transform")?;

    this.add_function(wrap_pyfunction!(identity, &this)?)?;
    this.add_function(wrap_pyfunction!(void, &this)?)?;
    this.add_function(wrap_pyfunction!(year, &this)?)?;
    this.add_function(wrap_pyfunction!(month, &this)?)?;
    this.add_function(wrap_pyfunction!(day, &this)?)?;
    this.add_function(wrap_pyfunction!(hour, &this)?)?;
    this.add_function(wrap_pyfunction!(bucket, &this)?)?;
    this.add_function(wrap_pyfunction!(truncate, &this)?)?;

    m.add_submodule(&this)?;
    py.import_bound("sys")?
        .getattr("modules")?
        .set_item("pyiceberg_core.transform", this)
}
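
Review note: the last three lines of `register_module` insert the submodule into `sys.modules` under the dotted name `pyiceberg_core.transform`. That registration appears to be what lets `from pyiceberg_core.transform import identity` resolve, as exercised by `test_identity_transform_with_direct_import`. Below is a minimal pure-Python sketch of the same mechanism. The names `demo_core` and the stand-in `identity` are hypothetical and chosen only for illustration; the sketch does not load the real extension module.

```python
import sys
import types

# Hypothetical parent package standing in for the compiled extension.
parent = types.ModuleType("demo_core")
# Hypothetical submodule, analogous to PyModule::new_bound(py, "transform").
sub = types.ModuleType("transform")


def identity(values):
    """Stand-in for the real transform: returns its input unchanged."""
    return values


sub.identity = identity

# Analogous to m.add_submodule(&this): makes `demo_core.transform` an attribute.
parent.transform = sub
sys.modules["demo_core"] = parent
# Analogous to the sys.modules set_item call: makes the dotted name importable.
sys.modules["demo_core.transform"] = sub

from demo_core.transform import identity as imported_identity

print(imported_identity([1, 2]))
```

The import statement succeeds because the import system consults `sys.modules` for the full dotted name before it tries any finder.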
22 changes: 0 additions & 22 deletions bindings/python/tests/test_basic.py

This file was deleted.

99 changes: 99 additions & 0 deletions bindings/python/tests/test_transform.py
@@ -0,0 +1,99 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

from datetime import date, datetime

import pyarrow as pa
import pytest
from pyiceberg_core import transform


def test_identity_transform():
    arr = pa.array([1, 2])
    result = transform.identity(arr)
    assert result == arr


def test_bucket_transform():
    arr = pa.array([1, 2])
    result = transform.bucket(arr, 10)
    expected = pa.array([6, 2], type=pa.int32())
    assert result == expected


def test_bucket_transform_fails_for_list_type_input():
    arr = pa.array([[1, 2], [3, 4]])
    with pytest.raises(
        ValueError,
        match=r"FeatureUnsupported => Unsupported data type for bucket transform",
    ):
        transform.bucket(arr, 10)


def test_bucket_chunked_array():
    chunked = pa.chunked_array([pa.array([1, 2]), pa.array([3, 4])])
    result_chunks = []
    for arr in chunked.iterchunks():
        result_chunks.append(transform.bucket(arr, 10))

    expected = pa.chunked_array(
        [pa.array([6, 2], type=pa.int32()), pa.array([5, 0], type=pa.int32())]
    )
    assert pa.chunked_array(result_chunks).equals(expected)


def test_year_transform():
    arr = pa.array([date(1970, 1, 1), date(2000, 1, 1)])
    result = transform.year(arr)
    expected = pa.array([0, 30], type=pa.int32())
    assert result == expected


def test_month_transform():
    arr = pa.array([date(1970, 1, 1), date(2000, 4, 1)])
    result = transform.month(arr)
    expected = pa.array([0, 30 * 12 + 3], type=pa.int32())
    assert result == expected


def test_day_transform():
    arr = pa.array([date(1970, 1, 1), date(2000, 4, 1)])
    result = transform.day(arr)
    expected = pa.array([0, 11048], type=pa.int32())
    assert result == expected


def test_hour_transform():
    arr = pa.array([datetime(1970, 1, 1, 19, 1, 23), datetime(2000, 3, 1, 12, 1, 23)])
    result = transform.hour(arr)
    expected = pa.array([19, 264420], type=pa.int32())
    assert result == expected


def test_truncate_transform():
    arr = pa.array(["this is a long string", "hi my name is sung"])
    result = transform.truncate(arr, 5)
    expected = pa.array(["this ", "hi my"])
    assert result == expected


def test_identity_transform_with_direct_import():
    from pyiceberg_core.transform import identity

    arr = pa.array([1, 2])
    result = identity(arr)
    assert result == arr
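
Review note: the expected values in the year, month, day and hour tests are offsets from the Unix epoch (1970-01-01). The sketch below reproduces those numbers using only the standard library, as a sanity check on the arithmetic. The helper names are hypothetical, the inputs are assumed to be naive dates and datetimes on or after the epoch, and this is not how the Rust transforms are implemented.

```python
from datetime import date, datetime, timedelta

EPOCH_DATE = date(1970, 1, 1)
EPOCH_DATETIME = datetime(1970, 1, 1)


def years_since_epoch(d: date) -> int:
    """Whole calendar years between the epoch and d."""
    return d.year - EPOCH_DATE.year


def months_since_epoch(d: date) -> int:
    """Whole calendar months between the epoch and d."""
    return (d.year - EPOCH_DATE.year) * 12 + (d.month - EPOCH_DATE.month)


def days_since_epoch(d: date) -> int:
    """Days between the epoch and d."""
    return (d - EPOCH_DATE).days


def hours_since_epoch(dt: datetime) -> int:
    """Whole hours between the epoch and dt (floor division)."""
    return (dt - EPOCH_DATETIME) // timedelta(hours=1)


print(years_since_epoch(date(2000, 1, 1)))                  # 30
print(months_since_epoch(date(2000, 4, 1)))                 # 363, i.e. 30 * 12 + 3
print(days_since_epoch(date(2000, 4, 1)))                   # 11048
print(hours_since_epoch(datetime(1970, 1, 1, 19, 1, 23)))   # 19
print(hours_since_epoch(datetime(2000, 3, 1, 12, 1, 23)))   # 264420
```

The day count works out as 30 * 365 + 7 leap days = 10957 days up to 2000-01-01, plus 31 + 29 + 31 = 91 days for January through March 2000.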
2 changes: 1 addition & 1 deletion crates/catalog/memory/src/catalog.rs
@@ -371,7 +371,7 @@ mod tests {
         let expected_sorted_order = SortOrder::builder()
             .with_order_id(0)
             .with_fields(vec![])
-            .build(expected_schema.clone())
+            .build(expected_schema)
             .unwrap();

         assert_eq!(
8 changes: 2 additions & 6 deletions crates/catalog/rest/tests/rest_catalog_test.rs
@@ -293,12 +293,8 @@ async fn test_create_table() {
     assert_eq!(table.metadata().format_version(), FormatVersion::V2);
     assert!(table.metadata().current_snapshot().is_none());
     assert!(table.metadata().history().is_empty());
-    assert!(table.metadata().default_sort_order().unwrap().is_unsorted());
-    assert!(table
-        .metadata()
-        .default_partition_spec()
-        .unwrap()
-        .is_unpartitioned());
+    assert!(table.metadata().default_sort_order().is_unsorted());
+    assert!(table.metadata().default_partition_spec().is_unpartitioned());
 }

 #[tokio::test]
2 changes: 1 addition & 1 deletion crates/catalog/sql/Cargo.toml
@@ -31,7 +31,7 @@ keywords = ["iceberg", "sql", "catalog"]
 [dependencies]
 async-trait = { workspace = true }
 iceberg = { workspace = true }
-sqlx = { version = "0.8.0", features = ["any"], default-features = false }
+sqlx = { version = "0.8.1", features = ["any"], default-features = false }
 typed-builder = { workspace = true }

 [dev-dependencies]