[TASK][EASY] MaxScanStrategy supports DSv2 #6315
Comments
pan3793 pushed a commit that referenced this issue Apr 17, 2024
# 🔍 Description
## Issue References 🔗

Now, MaxScanStrategy can be adopted to limit max scan file size in some datasources, such as Hive. Hopefully we can enhance MaxScanStrategy to include support for the datasourcev2.

## Describe Your Solution 🔧

get the statistics about files scanned through datasourcev2 API

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested

**Be nice. Be informative.**

Closes #5852 from zhaohehuhu/dev-1213.
Closes #6315

3c5b0c2 [hezhao2] reformat
fb113d6 [hezhao2] disable the rule that checks the maxPartitions for dsv2
acc3587 [hezhao2] disable the rule that checks the maxPartitions for dsv2
c8399a0 [hezhao2] fix header
70c845b [hezhao2] add UTs
3a07396 [hezhao2] add ut
4d26ce1 [hezhao2] reformat
f87cb07 [hezhao2] reformat
b307022 [hezhao2] move code to Spark 3.5
73258c2 [hezhao2] fix unused import
cf893a0 [hezhao2] drop reflection for loading iceberg class
dc128bc [hezhao2] refactor code
661834c [hezhao2] revert code
6061f42 [hezhao2] delete IcebergSparkPlanHelper
5f1c3c0 [hezhao2] fix
b15652f [hezhao2] remove iceberg dependency
fe620ca [hezhao2] enable MaxScanStrategy when accessing iceberg datasource

Authored-by: hezhao2 <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit 8edcb00)
Signed-off-by: Cheng Pan <[email protected]>
What's the level of this task?
EASY
Code of Conduct
Search before creating
Mentor
Skill requirements
Spark, DSv2
Background and Goals
Currently, MaxScanStrategy can limit the maximum scanned file size for some data sources, such as Hive. The goal of this task is to extend MaxScanStrategy so that it also supports DataSource V2 (DSv2) sources.
Implementation steps
#5852
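To make the idea in #5852 concrete ("get the statistics about files scanned through datasourcev2 API"), here is a minimal, dependency-free sketch of the size check. It is not the code from the pull request. `ScanStats`, `MaxFileSizeExceeded`, `MaxScanCheck`, and `checkMaxFileSize` are hypothetical names introduced only for illustration. `ScanStats` stands in for the statistics a DSv2 scan can report; Spark's connector `Statistics` interface exposes `sizeInBytes()` as a `java.util.OptionalLong`, which is why that type is used here.

```scala
import java.util.OptionalLong

// Hypothetical stand-in for the statistics reported by a DSv2 scan.
// It mirrors only the sizeInBytes part of Spark's connector Statistics interface.
final case class ScanStats(sizeInBytes: OptionalLong)

// Hypothetical exception type; the name is illustrative, not Kyuubi's.
final class MaxFileSizeExceeded(message: String) extends RuntimeException(message)

object MaxScanCheck {
  // Fails the scan only when all three conditions hold:
  //   1. a limit is configured (maxFileSize is defined),
  //   2. the source actually reports a size,
  //   3. the reported size is larger than the limit.
  // A source that reports no size is let through, since there is nothing to compare.
  def checkMaxFileSize(table: String, stats: ScanStats, maxFileSize: Option[Long]): Unit = {
    maxFileSize.foreach { limit =>
      if (stats.sizeInBytes.isPresent && stats.sizeInBytes.getAsLong > limit) {
        val size = stats.sizeInBytes.getAsLong
        throw new MaxFileSizeExceeded(
          s"Scan of table $table would read $size bytes, above the limit of $limit bytes")
      }
    }
  }
}
```

In a real strategy the statistics would come from the matched DSv2 scan node in the logical plan rather than from a hand-built `ScanStats`. The sketch checks file size only: the commit list for this change mentions disabling the maxPartitions rule for DSv2, so no partition counting is modeled here.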
Additional context
Introduction of 2024H1 Kyuubi Code Contribution Program