Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iceberg split changes for execution with Velox #20738

Merged
merged 2 commits into from
Sep 10, 2023

Conversation

agrawalreetika
Copy link
Member

Co-authors: @agrawalreetika and @imjalpreet

Description

This PR introduces the below addition to Iceberg Connector Split in an ongoing effort to make it compatible with execution on Velox:

  • Passing more partition details like partition name along with the value and id in IcebergSplit
  • Adding DeleteFile information as part of Split to support Delete filter in queries

This PR is a continuation of existing PR - #20501

Motivation and Context

To support Iceberg Table Query Execution with Velox on Presto

Impact

No impact on the existing Iceberg connector

Test Plan

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@agrawalreetika agrawalreetika requested a review from a team as a code owner September 1, 2023 07:45
@agrawalreetika agrawalreetika self-assigned this Sep 1, 2023
import static com.google.common.base.MoreObjects.toStringHelper;
import static java.util.Objects.requireNonNull;

public final class IcebergPartition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agrawalreetika Why is IcebergPartition class needed? I see it adds a name field but that field is not used. HIveSplit has List<HivePartitionKey> partitionKeys;, and the HivePartitionKey is very similar to this IcebergPartition. So I assume you want to introduce a similar class to HivePartitionKey so we can serialize the split, is this right? If this is the case, 1) it should not be named IcebergPartition but IcebergPartitionKey. 2) the content is the same as HivePartitionKey. I think we can use use HivePartitionKey in IcebergSplit directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yingsu00 I have made the changes as discussed.

@yingsu00 yingsu00 added the iceberg Apache Iceberg related label Sep 6, 2023
@yingsu00 yingsu00 merged commit 4dd461f into prestodb:master Sep 10, 2023
@agrawalreetika agrawalreetika deleted the icebergSplitChanges branch September 11, 2023 04:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
iceberg Apache Iceberg related
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants