[KYUUBI #6315] Spark 3.5: MaxScanStrategy supports DSv2 #5852
numPartitions does not seem to be the number of scanned table partitions. In the Iceberg implementation, it is the size of taskGroups.
I think it is the number of tasks in the RDD/stage, not the table's partition count. Does taskGroups in Iceberg mean the same thing?
It's the number of input RDD partitions for the Iceberg data source. Its value may happen to equal the table's partition count, but they're not the same thing. It seems a bit hard to get the number of scanned table partitions.
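To make the distinction concrete, here is a toy sketch in plain Python. It is not Spark or Iceberg code: DataFile, plan_task_groups, and the greedy size-based packing are hypothetical stand-ins, assumed only to illustrate that the number of task groups depends on the split target while the number of table partitions does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    table_partition: str  # hypothetical partition value, e.g. "dt=2024-01-01"
    size_bytes: int


def plan_task_groups(files, target_bytes):
    """Greedily pack files into groups of at most target_bytes (toy planner)."""
    groups, current, current_size = [], [], 0
    for f in files:
        # Start a new group once adding this file would exceed the target.
        if current and current_size + f.size_bytes > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(f)
        current_size += f.size_bytes
    if current:
        groups.append(current)
    return groups


files = [
    DataFile("dt=2024-01-01", 40),
    DataFile("dt=2024-01-01", 40),
    DataFile("dt=2024-01-02", 40),
    DataFile("dt=2024-01-03", 40),
    DataFile("dt=2024-01-03", 40),
    DataFile("dt=2024-01-03", 40),
]

# Distinct table partitions touched by the scan: fixed by the data.
num_table_partitions = len({f.table_partition for f in files})

# Input partitions (task groups): vary with the split target.
for target in (40, 100, 128):
    num_input_partitions = len(plan_task_groups(files, target))
    print(target, num_input_partitions, num_table_partitions)
```

In this toy data the scan always touches 3 table partitions, while the task-group count is 6, 3, or 2 depending on the target size, so the two numbers coincide only by accident.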
Do we need to add a new data source? Would it be better to use the Iceberg data source directly? @pan3793 WDYT?
I'd prefer to use a dummy DS, like Spark does.