Read transforms via expressions. Part 1: Just compute the expression and return it. #607
}
}

/// Given an iterator of (engine_data, bool) tuples and a predicate, returns an iterator of
/// `(engine_data, selection_vec)`. Each row that is selected in the returned `engine_data` _must_
/// be processed to complete the scan. Non-selected rows _must_ be ignored. The boolean flag
/// indicates whether the record batch is a log or checkpoint batch.
- pub fn scan_action_iter(
+ pub(crate) fn scan_action_iter(
Note this is a significant change, as we no longer expose this function. In discussion so far we've agreed that it basically should never have been `pub`, and I just made a mistake when doing so. An engine should call `scan_data`, which mostly just proxies to this but doesn't expose internal details to the engine.
Open to discussion though.
`pub(crate)` SGTM!
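For readers following the thread, here is a minimal sketch of the pattern being discussed: a crate-private worker that pairs each batch with a selection vector, and a public entry point that simply proxies to it. All names here (`Batch`, `select_rows_internal`, `scan`, `selected_values`) are hypothetical stand-ins for illustration, not the kernel's actual types or signatures, and the predicate is a placeholder.

```rust
/// Hypothetical stand-in for a batch of engine data: just a column of values.
pub struct Batch {
    pub rows: Vec<i64>,
}

/// Crate-private worker (assumed shape, similar in spirit to the function in the diff).
/// Takes `(batch, is_log_batch)` tuples and a predicate, and yields each batch
/// together with a selection vector marking the rows that must be processed.
pub(crate) fn select_rows_internal(
    batches: impl Iterator<Item = (Batch, bool)>,
    keep: impl Fn(i64) -> bool,
) -> impl Iterator<Item = (Batch, Vec<bool>)> {
    batches.map(move |(batch, _is_log_batch)| {
        let selection = batch.rows.iter().map(|&r| keep(r)).collect();
        (batch, selection)
    })
}

/// Public entry point: proxies to the internal function without exposing its
/// parameters (here, the predicate) to the caller.
pub fn scan(batches: Vec<(Batch, bool)>) -> impl Iterator<Item = (Batch, Vec<bool>)> {
    // Placeholder predicate for the sketch: keep non-negative values.
    select_rows_internal(batches.into_iter(), |r| r >= 0)
}

/// How a consumer honors the selection vector: non-selected rows are ignored.
pub fn selected_values(batch: &Batch, selection: &[bool]) -> Vec<i64> {
    batch
        .rows
        .iter()
        .zip(selection)
        .filter(|(_, keep)| **keep)
        .map(|(v, _)| *v)
        .collect()
}
```

With this split, the internal signature can change in follow-up work without breaking callers that only use the public wrapper.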
What changes are proposed in this pull request?
This is the initial part of moving to using expressions to express transformations when reading data. What this PR does is:
- […] `Add` file
- […] `Add` file, if there are needed fix-ups (just partition columns today), the correct expression is created and inserted into a row-indexed map

Follow-up PRs will:
- […] `visit_scan_files`
- […] `transform_to_logical` entirely and clean up associated code

Each of those is more invasive and ends up touching significant code, so I'm staging this as much as possible to make reviews easier.
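As a rough illustration of the "row-indexed map" idea described above, the sketch below builds one transform expression per file row, and only for rows that need a fix-up (partition columns). The `Expr` and `FieldSpec` types and both function names are hypothetical, invented for this example; the kernel's real expression types and scan state will differ.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a kernel expression; not the real kernel type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Pass a physical column through unchanged.
    Column(String),
    /// Inject a constant, e.g. a partition value recorded on the file action.
    Literal(Option<String>),
    /// Assemble the output row from the listed field expressions.
    Struct(Vec<Expr>),
}

/// Hypothetical description of one logical field of the read schema.
enum FieldSpec {
    /// The field is read directly from the data file.
    Physical(String),
    /// The field is a partition column and must be filled in per file.
    Partition(String),
}

/// Build the transform for a single file, or `None` when no fix-up is needed.
fn transform_for_file(
    schema: &[FieldSpec],
    partition_values: &HashMap<String, String>,
) -> Option<Expr> {
    if !schema.iter().any(|f| matches!(f, FieldSpec::Partition(_))) {
        return None;
    }
    let fields = schema
        .iter()
        .map(|f| match f {
            FieldSpec::Physical(name) => Expr::Column(name.clone()),
            // A missing partition value is modeled as a null literal.
            FieldSpec::Partition(name) => Expr::Literal(partition_values.get(name).cloned()),
        })
        .collect();
    Some(Expr::Struct(fields))
}

/// Build the row-indexed map: row number of the file action -> its transform.
fn build_transforms(
    schema: &[FieldSpec],
    files: &[HashMap<String, String>],
) -> HashMap<usize, Expr> {
    files
        .iter()
        .enumerate()
        .filter_map(|(row, pv)| transform_for_file(schema, pv).map(|e| (row, e)))
        .collect()
}
```

Keying the map by row index means rows without fix-ups carry no entry at all, so a consumer can treat a missing entry as "pass the data through unchanged".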
How was this change tested?
Unit tests, and inspection of resultant expressions when run on tables