chore: upgrade to DataFusion 46.0.0 #3261
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
crates/sql/src/planner.rs
Outdated
@@ -44,6 +44,7 @@ impl<'a, S: ContextProvider> DeltaSqlToRel<'a, S> {
     enable_ident_normalization: self.options.enable_ident_normalization,
     support_varchar_with_length: false,
     enable_options_value_normalization: false,
+    collect_spans: false,
What are spans?
The source code locations for expressions:
https://docs.rs/sqlparser/latest/sqlparser/tokenizer/struct.Span.html
We are starting to gather this information and plumb it into DataFusion. This particular setting means the SQL planner won't try to pass the span information along. The only thing the spans are used for right now is debug / help messages.
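To make the idea concrete, here is a minimal, std-only sketch of how a span can point a help message at the offending part of a query. The `Location` and `Span` types and the `render_diagnostic` function below are hypothetical stand-ins written for this illustration; they are not the sqlparser or DataFusion definitions, and the sketch assumes single-line, ASCII-only spans with an inclusive end column.

```rust
/// Hypothetical stand-in for a source location (1-based line and column).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Location {
    line: usize,
    column: usize,
}

/// Hypothetical stand-in for a span: where an expression starts and ends
/// in the original SQL text. The end column is treated as inclusive here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: Location,
    end: Location,
}

/// Render a diagnostic that underlines the spanned text with carets.
/// Assumes the span starts and ends on the same line and the SQL is ASCII.
fn render_diagnostic(sql: &str, span: Span, message: &str) -> String {
    // Pick the line the span starts on (lines are 1-based).
    let line_text = sql
        .lines()
        .nth(span.start.line.saturating_sub(1))
        .unwrap_or("");
    // Pad up to the start column, then draw one caret per spanned column.
    let padding = " ".repeat(span.start.column.saturating_sub(1));
    let width = span.end.column.saturating_sub(span.start.column) + 1;
    let carets = "^".repeat(width);
    format!("{line_text}\n{padding}{carets} {message}")
}
```

With `collect_spans: false`, the planner simply does not carry this kind of location data along, so messages fall back to not pointing at a position in the query.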
Ah ok, thanks for explaining :) I learned something new about DataFusion!
FWIW this is a new feature that started to be added in DataFusion 46.
Looks like from CI we also need to update Rust to 1.82: https://github.com/delta-io/delta-rs/actions/runs/13506223447/job/37736308124?pr=3261
I have a few cleanup PRs I plan to make as I work on this. Here is the first one:
I also started breaking this PR up into some smaller ones:
 #[derive(Default)]
-struct ParquetPredicateVisitor {
+struct ParquetVisitor {
I had to change the way the parquet scan information is found in the two visitors. I combined them at the same time, since they were mostly boilerplate copy/paste.
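As a rough illustration of what merging two near-identical visitors can look like, here is a std-only sketch. The `Plan` enum, the `CombinedScanVisitor` type, and its fields are hypothetical and invented for this example; they do not reflect the actual `ParquetVisitor` in this PR or any DataFusion plan type. The point is only that a single traversal can collect both pieces of scan information.

```rust
/// Toy plan tree; a hypothetical stand-in for an execution plan.
enum Plan {
    ParquetScan { predicate: Option<String>, files: usize },
    Filter { input: Box<Plan> },
    Union { inputs: Vec<Plan> },
}

/// One visitor that gathers what two separate visitors might otherwise
/// have collected: pushed-down predicates and the number of files scanned.
#[derive(Default, Debug, PartialEq)]
struct CombinedScanVisitor {
    predicates: Vec<String>,
    files_scanned: usize,
}

impl CombinedScanVisitor {
    /// Walk the tree once, recording information at every parquet scan.
    fn visit(&mut self, plan: &Plan) {
        match plan {
            Plan::ParquetScan { predicate, files } => {
                if let Some(p) = predicate {
                    self.predicates.push(p.clone());
                }
                self.files_scanned += *files;
            }
            // Non-scan nodes only forward the traversal to their children.
            Plan::Filter { input } => self.visit(input),
            Plan::Union { inputs } => {
                for child in inputs {
                    self.visit(child);
                }
            }
        }
    }
}
```

Sharing one traversal keeps the tree-walking code in a single place, so a change in how scans are located only has to be made once.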
I filed a ticket in DataFusion explaining the current test failures.
enable_options_value_normalization: false,
},
);
let parser_options = ParserOptions::new()
This uses the nice new API from @kosiew in
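The diff above is cut off right after `ParserOptions::new()`, so the exact calls are not visible here. As a hedged illustration of the builder style in general, the sketch below uses a stand-in type, `SketchParserOptions`, whose field names are taken from the struct literal in the earlier diff. The `with_*` method names and the all-`false` defaults are assumptions made for this example, not DataFusion's confirmed API.

```rust
/// Hypothetical stand-in for a parser options type; not DataFusion's definition.
#[derive(Debug, Clone, PartialEq)]
struct SketchParserOptions {
    enable_ident_normalization: bool,
    support_varchar_with_length: bool,
    enable_options_value_normalization: bool,
    collect_spans: bool,
}

impl SketchParserOptions {
    /// Start from defaults (all `false` here, chosen only for illustration).
    fn new() -> Self {
        Self {
            enable_ident_normalization: false,
            support_varchar_with_length: false,
            enable_options_value_normalization: false,
            collect_spans: false,
        }
    }

    /// Each setter consumes `self` and returns it so that calls can be chained.
    fn with_enable_ident_normalization(mut self, value: bool) -> Self {
        self.enable_ident_normalization = value;
        self
    }

    fn with_support_varchar_with_length(mut self, value: bool) -> Self {
        self.support_varchar_with_length = value;
        self
    }

    fn with_enable_options_value_normalization(mut self, value: bool) -> Self {
        self.enable_options_value_normalization = value;
        self
    }

    fn with_collect_spans(mut self, value: bool) -> Self {
        self.collect_spans = value;
        self
    }
}
```

The practical benefit of this style over a struct literal is that adding a new field upstream (as happened with `collect_spans`) does not break callers that only set the options they care about.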
@@ -20,7 +20,7 @@ jobs:
         uses: actions-rs/toolchain@v1
         with:
           profile: default
-          toolchain: '1.81'
+          toolchain: '1.82'
DataFusion 46 requires Rust 1.82
You can see CI fail without these changes
https://github.com/delta-io/delta-rs/actions/runs/13591787541/job/38000195037?pr=3261
I don't know what the MSRV policy in delta-rs is, so we probably can't merge this PR until it is OK to increase the MSRV there.
I think this PR will pass. Maybe someone could trigger the CI so I can show a clean run?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3261   +/-   ##
=======================================
  Coverage   72.16%   72.16%
=======================================
  Files         144      144
  Lines       45771    45784   +13
  Branches    45771    45784   +13
=======================================
+ Hits        33030    33042   +12
+ Misses      10651    10650    -1
- Partials     2090     2092    +2
🤔 There are some Python failures here. Seeing if I can figure out what is going on: https://github.com/delta-io/delta-rs/actions/runs/13592071811/job/38008047004?pr=3261
Could someone tell me how to run these tests locally or give me a Rust stack trace? I don't really understand what is failing here.
cd python
make develop
RUST_BACKTRACE=1 uv run pytest tests/test_cdf.py -s -k "test_read_cdf_partitioned_projection"
Update: @blaginin has a fix:
OK, updated with the latest release candidate. Here's hoping for a clean CI run 🙏
Success! Thank you so much @blaginin for making this happen.
Thanks @ion-elgreco -- I'll update this to use the released version of DataFusion now
Signed-off-by: Andrew Lamb <[email protected]>
🎉
Description
chore: upgrade to DataFusion 46.0.0
Related Issue(s)
46.0.0
apache/datafusion#14123