actually check arrow. depend on >=53 (#413)
This fixes three issues:

1. The integration test was not enabling enough features to actually
exercise arrow. It now enables them.
2. The above caught that we actually _don't_ support arrow 52.x because
the various `min/max_opt` functions don't exist there. See
https://docs.rs/parquet/53.1.0/parquet/file/statistics/struct.ValueStatistics.html#method.max
and note the "deprecated since 53" notice on the old accessors.
3. Parquet depends on object_store. So we _also_ need to be flexible on
`object_store`. Right now I'm locking it to the 0.11 series (the
latest).
nicklan authored Oct 22, 2024
1 parent cd53bc1 commit 6e8bd3e
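
The new requirement `>=53, <54` is a half-open range: 53.0.0 is the lowest accepted version and 54.0.0 is excluded. As a rough illustration only, the sketch below models that check with the standard library. It is not Cargo's resolver, it ignores pre-release and build metadata, and the names `Version`, `parse_version`, and `in_range` are hypothetical helpers, not part of this repository.

```rust
/// A plain `major.minor.patch` version (assumption: no pre-release or build metadata).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parse strings such as "53", "53.1" or "53.1.0"; missing parts default to 0.
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.trim().split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = match parts.next() {
        Some(p) => p.parse().ok()?,
        None => 0,
    };
    let patch: u64 = match parts.next() {
        Some(p) => p.parse().ok()?,
        None => 0,
    };
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(Version { major, minor, patch })
}

/// True when `min <= version < max`, mirroring a requirement like ">=53, <54".
fn in_range(version: &str, min: &str, max: &str) -> bool {
    match (parse_version(version), parse_version(min), parse_version(max)) {
        // The derived ordering compares major, then minor, then patch.
        (Some(v), Some(lo), Some(hi)) => lo <= v && v < hi,
        _ => false,
    }
}

fn main() {
    for candidate in ["52.2.0", "53.0.0", "53.1.0", "54.0.0"] {
        println!("{candidate}: {}", in_range(candidate, "53", "54"));
    }
}
```

Under this simplified model, 52.2.0 and 54.0.0 fall outside the range while 53.0.0 and 53.1.0 fall inside it, which matches the bounds the test script uses for `MIN_ARROW_VER` and `MAX_ARROW_VER`.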
Showing 4 changed files with 49 additions and 41 deletions.
24 changes: 12 additions & 12 deletions Cargo.toml
@@ -20,18 +20,18 @@ readme = "README.md"
 version = "0.3.1"

 [workspace.dependencies]
-arrow = { version = ">=52, <54" }
-arrow-arith = { version = ">=52, <54" }
-arrow-array = { version = ">=52, <54" }
-arrow-buffer = { version = ">=52, <54" }
-arrow-cast = { version = ">=52, <54" }
-arrow-data = { version = ">=52, <54" }
-arrow-ord = { version = ">=52, <54" }
-arrow-json = { version = ">=52, <54" }
-arrow-select = { version = ">=52, <54" }
-arrow-schema = { version = ">=52, <54" }
-parquet = { version = ">=52, <54", features = ["object_store"] }
-object_store = "0.11.0"
+arrow = { version = ">=53, <54" }
+arrow-arith = { version = ">=53, <54" }
+arrow-array = { version = ">=53, <54" }
+arrow-buffer = { version = ">=53, <54" }
+arrow-cast = { version = ">=53, <54" }
+arrow-data = { version = ">=53, <54" }
+arrow-ord = { version = ">=53, <54" }
+arrow-json = { version = ">=53, <54" }
+arrow-select = { version = ">=53, <54" }
+arrow-schema = { version = ">=53, <54" }
+parquet = { version = ">=53, <54", features = ["object_store"] }
+object_store = { version = ">=0.11, <0.12" }
hdfs-native-object-store = "0.12.0"
hdfs-native = "0.10.0"
walkdir = "2.5.0"
37 changes: 22 additions & 15 deletions README.md
@@ -76,28 +76,35 @@ This means you can force kernel to rely on the specific arrow version that your
 as long as it falls in that range. You can see the range in the `Cargo.toml` in the same folder as
 this `README.md`.

-For example, although arrow 53.x has been released, you can force kernel to compile on 52.2.0 by
+For example, although arrow 53.1.0 has been released, you can force kernel to compile on 53.0 by
 putting the following in your project's `Cargo.toml`:

 ```toml
 [patch.crates-io]
-arrow = "52.2"
-arrow-arith = "52.2"
-arrow-array = "52.2"
-arrow-buffer = "52.2"
-arrow-cast = "52.2"
-arrow-data = "52.2"
-arrow-ord = "52.2"
-arrow-json = "52.2"
-arrow-select = "52.2"
-arrow-schema = "52.2"
-parquet = "52.2"
+arrow = "53.0"
+arrow-arith = "53.0"
+arrow-array = "53.0"
+arrow-buffer = "53.0"
+arrow-cast = "53.0"
+arrow-data = "53.0"
+arrow-ord = "53.0"
+arrow-json = "53.0"
+arrow-select = "53.0"
+arrow-schema = "53.0"
+parquet = "53.0"
 ```

 Note that unfortunately patching in `cargo` requires that _exactly one_ version matches your
-specification. If only arrow "52.2.0" has been released the above will work, but if "52.2.1" is
-released, the specification will break and you will need to provide a more restrictive
-specification.
+specification. If only arrow "53.0.0" had been released the above will work, but if "53.0.1" were
+to be released, the specification will break and you will need to provide a more restrictive
+specification like `"=53.0.0"`.
+
+### Object Store
+You may also need to patch the `object_store` version used if the version of `parquet` you depend on
+depends on a different version of `object_store`. This can be done by including `object_store` in
+the patch list with the required version. You can find this out by checking the `parquet` [docs.rs
+page](https://docs.rs/parquet/52.2.0/parquet/index.html), switching to the version you want to use,
+and then checking what version of `object_store` it depends on.

 ## Documentation

27 changes: 14 additions & 13 deletions integration-tests/Cargo.toml
@@ -6,18 +6,19 @@ edition = "2021"
 [workspace]

 [dependencies]
-arrow = "=52.1.0"
-delta_kernel = { path = "../kernel", features = ["arrow-conversion"] }
+arrow = "=53.0.0"
+delta_kernel = { path = "../kernel", features = ["arrow-conversion", "arrow-expression", "default-engine", "sync-engine"] }

 [patch.'file:///../kernel']
-arrow = "=52.1.0"
-arrow-arith = "=52.1.0"
-arrow-array = "=52.1.0"
-arrow-buffer = "=52.1.0"
-arrow-cast = "=52.1.0"
-arrow-data = "=52.1.0"
-arrow-ord = "=52.1.0"
-arrow-json = "=52.1.0"
-arrow-select = "=52.1.0"
-arrow-schema = "=52.1.0"
-parquet = "=52.1.0"
+arrow = "=53.0.0"
+arrow-arith = "=53.0.0"
+arrow-array = "=53.0.0"
+arrow-buffer = "=53.0.0"
+arrow-cast = "=53.0.0"
+arrow-data = "=53.0.0"
+arrow-ord = "=53.0.0"
+arrow-json = "=53.0.0"
+arrow-select = "=53.0.0"
+arrow-schema = "=53.0.0"
+parquet = "=53.0.0"
+object_store = "=0.11.1"
2 changes: 1 addition & 1 deletion integration-tests/test-all-arrow-versions.sh
@@ -27,7 +27,7 @@ test_arrow_version() {
 cargo run
 }

-MIN_ARROW_VER="52.0.0"
+MIN_ARROW_VER="53.0.0"
 MAX_ARROW_VER="54.0.0"

 for ARROW_VERSION in $(curl -s https://crates.io/api/v1/crates/arrow | jq -r '.versions[].num' | tr -d '\r')