Refactor Java Hive and Iceberg Connector for Prestissimo Iceberg Connector #21662

Merged
yingsu00 merged 5 commits into prestodb:master from icebergPrestissimoJavaChanges on Jan 18, 2024

Conversation

imjalpreet
Member

Description

Related to #21584

Motivation and Context

facebookincubator/velox#5977

Impact

N.A.

Test Plan

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== NO RELEASE NOTE ==

@imjalpreet imjalpreet self-assigned this Jan 9, 2024
@imjalpreet imjalpreet requested a review from a team as a code owner January 9, 2024 20:24
@imjalpreet imjalpreet requested a review from presto-oss January 9, 2024 20:24
@imjalpreet imjalpreet added the iceberg Apache Iceberg related label Jan 9, 2024
@imjalpreet imjalpreet requested a review from yingsu00 January 9, 2024 21:16
@imjalpreet imjalpreet force-pushed the icebergPrestissimoJavaChanges branch from 66cffe5 to 9e2428b Compare January 10, 2024 06:44
@@ -3666,6 +3667,7 @@ public TableLayoutFilterCoverage getTableLayoutFilterCoverage(ConnectorTableLayo
{
HiveTableLayoutHandle tableHandle = (HiveTableLayoutHandle) connectorTableLayoutHandle;
Set<String> relevantColumns = tableHandle.getPartitionColumns().stream()
+ .map(HiveColumnHandle.class::cast)
Contributor

You don't need to cast to HiveColumnHandle; getName() can be called on BaseHiveColumnHandle directly:

Set<String> relevantColumns = tableHandle.getPartitionColumns().stream()
                .map(BaseHiveColumnHandle::getName)
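
A minimal sketch of the point made in this comment. The classes `BaseColumn` and `HiveColumn` are hypothetical stand-ins for `BaseHiveColumnHandle` and `HiveColumnHandle`, not the real Presto classes, and the only assumption is that the name getter is declared on the base type. Under that assumption the stream can map to names without casting each element first:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-ins for BaseHiveColumnHandle and HiveColumnHandle;
// the real Presto classes carry more state than shown here.
class PartitionColumnNamesSketch
{
    static class BaseColumn
    {
        private final String name;

        BaseColumn(String name)
        {
            this.name = name;
        }

        String getName()
        {
            return name;
        }
    }

    static class HiveColumn
            extends BaseColumn
    {
        HiveColumn(String name)
        {
            super(name);
        }
    }

    // getName() is declared on the base type, so no cast to the subtype is needed.
    static Set<String> columnNames(List<? extends BaseColumn> partitionColumns)
    {
        return partitionColumns.stream()
                .map(BaseColumn::getName)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args)
    {
        List<BaseColumn> columns = List.of(new HiveColumn("ds"), new HiveColumn("region"));
        System.out.println(columnNames(columns));
    }
}
```

A cast would only be needed if the mapping step used a member that exists solely on the subtype.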

@@ -2565,6 +2565,7 @@ public boolean supportsMetadataDelete(ConnectorSession session, ConnectorTableHa
.collect(toImmutableSet());

Set<String> partitionColumnNames = layoutHandle.getPartitionColumns().stream()
+ .map(HiveColumnHandle.class::cast)
Contributor

You don't need to cast to HiveColumnHandle here either:

Set<String> partitionColumnNames = layoutHandle.getPartitionColumns().stream()
                .map(BaseHiveColumnHandle::getName)

@@ -129,11 +129,16 @@ protected HiveTableLayoutHandle(
boolean footerStatsUnreliable,
Optional<HiveTableHandle> hiveTableHandle)
{
- super(domainPredicate, remainingPredicate, pushdownFilterEnabled, partitionColumnPredicate, partitions);
+ super(
+ partitionColumns.stream().map(BaseHiveColumnHandle.class::cast).collect(toList()),
Contributor

We don't usually do upcasts in the child classes. I think it's better to change the parameter partitionColumns to List<BaseHiveColumnHandle>.
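
The suggestion can be sketched roughly as follows. All class names here (`BaseColumn`, `HiveColumn`, `BaseLayout`, `HiveLayout`) are hypothetical stand-ins, not the actual Presto types. The sketch assumes the parent constructor takes a list of the base column type, so a child constructor that declares the same element type can pass the list straight through instead of streaming and upcasting each element:

```java
import java.util.List;

// Hypothetical stand-ins; these are not the real Presto classes.
class LayoutHandleSketch
{
    static class BaseColumn
    {
        private final String name;

        BaseColumn(String name)
        {
            this.name = name;
        }

        String getName()
        {
            return name;
        }
    }

    static class HiveColumn
            extends BaseColumn
    {
        HiveColumn(String name)
        {
            super(name);
        }
    }

    static class BaseLayout
    {
        private final List<BaseColumn> partitionColumns;

        BaseLayout(List<BaseColumn> partitionColumns)
        {
            // Defensive, unmodifiable copy of the caller's list.
            this.partitionColumns = List.copyOf(partitionColumns);
        }

        List<BaseColumn> getPartitionColumns()
        {
            return partitionColumns;
        }
    }

    static class HiveLayout
            extends BaseLayout
    {
        // Declaring the parameter with the base element type means no
        // per-element upcast is needed before calling super(...).
        HiveLayout(List<BaseColumn> partitionColumns)
        {
            super(partitionColumns);
        }
    }

    public static void main(String[] args)
    {
        List<BaseColumn> columns = List.of(new HiveColumn("ds"));
        HiveLayout layout = new HiveLayout(columns);
        System.out.println(layout.getPartitionColumns().get(0).getName()); // prints "ds"
    }
}
```

The trade-off is that callers holding a `List<HiveColumn>` must widen it themselves, because Java generics are invariant.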

@@ -333,7 +332,7 @@ public Builder builder()
return new Builder()
.setSchemaTableName(getSchemaTableName())
.setTablePath(getTablePath())
- .setPartitionColumns(getPartitionColumns())
+ .setPartitionColumns(getPartitionColumns().stream().map(HiveColumnHandle.class::cast).collect(toList()))
Contributor

Change
public Builder setPartitionColumns(List<HiveColumnHandle> partitionColumns)
to
public Builder setPartitionColumns(List<BaseHiveColumnHandle> partitionColumns)
and do the downcast inside the setter.
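
One way to read this suggestion, as a sketch under stated assumptions: the `Builder`, `BaseColumn`, and `HiveColumn` types below are hypothetical stand-ins, and `getHiveColumnIndex()` is an illustrative subtype-only member rather than a claim about the real class. The setter accepts the base element type and performs the downcast in one place, so call sites no longer need their own stream-and-cast:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-ins; these are not the real Presto classes.
class LayoutBuilderSketch
{
    static class BaseColumn
    {
        private final String name;

        BaseColumn(String name)
        {
            this.name = name;
        }

        String getName()
        {
            return name;
        }
    }

    static class HiveColumn
            extends BaseColumn
    {
        private final int hiveColumnIndex;

        HiveColumn(String name, int hiveColumnIndex)
        {
            super(name);
            this.hiveColumnIndex = hiveColumnIndex;
        }

        int getHiveColumnIndex()
        {
            return hiveColumnIndex;
        }
    }

    static class Builder
    {
        private List<HiveColumn> partitionColumns = List.of();

        // Accepts the base element type; the downcast happens here, once.
        // Class.cast throws ClassCastException for any non-HiveColumn element.
        Builder setPartitionColumns(List<BaseColumn> columns)
        {
            this.partitionColumns = columns.stream()
                    .map(HiveColumn.class::cast)
                    .collect(Collectors.toList());
            return this;
        }

        List<HiveColumn> getPartitionColumns()
        {
            return partitionColumns;
        }
    }

    public static void main(String[] args)
    {
        List<BaseColumn> columns = List.of(new HiveColumn("ds", 3));
        Builder builder = new Builder().setPartitionColumns(columns);
        System.out.println(builder.getPartitionColumns().get(0).getHiveColumnIndex()); // prints 3
    }
}
```

Centralizing the cast keeps the failure mode (a `ClassCastException` for a foreign column type) in a single, predictable location.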

@@ -78,21 +79,20 @@ protected IcebergTableLayoutHandle(
Optional<List<HivePartition>> partitions,
IcebergTableHandle table)
{
- super(domainPredicate, remainingPredicate, pushdownFilterEnabled, partitionColumnPredicate, partitions);
+ super(
+ partitionColumns.stream().map(BaseHiveColumnHandle.class::cast).collect(toList()),
Contributor

same

@imjalpreet imjalpreet force-pushed the icebergPrestissimoJavaChanges branch from 9e2428b to 9c7bd6b Compare January 11, 2024 10:29
@imjalpreet imjalpreet marked this pull request as draft January 11, 2024 11:05
@imjalpreet imjalpreet force-pushed the icebergPrestissimoJavaChanges branch from 9c7bd6b to c67f699 Compare January 11, 2024 12:26
@imjalpreet imjalpreet marked this pull request as ready for review January 11, 2024 13:17
@imjalpreet imjalpreet requested a review from yingsu00 January 11, 2024 16:21
@@ -80,12 +74,12 @@ public Optional<String> getTableSchemaJson()

public SchemaTableName getSchemaTableName()
{
- return new SchemaTableName(schemaName, tableName.getTableName());
+ return new SchemaTableName(getSchemaName(), getTableName());
Contributor

getSchemaTableName() is already in BaseHiveTableHandle. Do we still need it here?

@@ -2633,7 +2633,7 @@ public List<ConnectorTableLayoutResult> getTableLayouts(ConnectorSession session
new HiveTableLayoutHandle.Builder()
.setSchemaTableName(handle.getSchemaTableName())
.setTablePath(table.getStorage().getLocation())
- .setPartitionColumns(hivePartitionResult.getPartitionColumns())
+ .setPartitionColumns(ImmutableList.copyOf(hivePartitionResult.getPartitionColumns()))
Contributor

We can make HivePartitionResult.partitionColumns a List<HiveColumnHandle>, make hivePartitionResult.getPartitionColumns() return List<BaseHiveColumnHandle>, and also make the constructor of HivePartitionResult take a List<BaseHiveColumnHandle> parameter.
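
For context on why the diff above wraps the list in a copy: Java generics are invariant, so a `List<HiveColumnHandle>` cannot be passed where a `List<BaseHiveColumnHandle>` is expected. The sketch below illustrates this with the JDK's `List.copyOf` (Java 10+) instead of Guava's `ImmutableList.copyOf` used in the PR, and with hypothetical stand-in classes `BaseColumn` and `HiveColumn`:

```java
import java.util.List;

// Hypothetical stand-ins; these are not the real Presto classes.
class PartitionResultSketch
{
    static class BaseColumn
    {
        private final String name;

        BaseColumn(String name)
        {
            this.name = name;
        }

        String getName()
        {
            return name;
        }
    }

    static class HiveColumn
            extends BaseColumn
    {
        HiveColumn(String name)
        {
            super(name);
        }
    }

    // "List<BaseColumn> base = hiveColumns;" would not compile because
    // generics are invariant. List.copyOf takes Collection<? extends E>,
    // so it can widen the element type while producing an unmodifiable copy.
    static List<BaseColumn> asBaseColumns(List<HiveColumn> hiveColumns)
    {
        return List.copyOf(hiveColumns);
    }

    public static void main(String[] args)
    {
        List<HiveColumn> hiveColumns = List.of(new HiveColumn("ds"));
        List<BaseColumn> baseColumns = asBaseColumns(hiveColumns);
        System.out.println(baseColumns.get(0).getName()); // prints "ds"
    }
}
```

Changing the declared types, as the comment proposes, would make the copy at the call site unnecessary.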

@@ -89,7 +89,7 @@ public HiveTableLayoutHandle(
this(
schemaTableName,
tablePath,
- partitionColumns,
+ partitionColumns.stream().map(BaseHiveColumnHandle.class::cast).collect(toList()),
Contributor

Change the constructor parameter partitionColumns to List<BaseHiveColumnHandle> instead of upcasting here.

Member Author

As discussed offline, we won't be able to make this change in the JSON constructor since we don't want the information to get lost while serializing.

@imjalpreet imjalpreet force-pushed the icebergPrestissimoJavaChanges branch from c67f699 to 2cef922 Compare January 18, 2024 10:03
@imjalpreet imjalpreet force-pushed the icebergPrestissimoJavaChanges branch from 2cef922 to 9be76e6 Compare January 18, 2024 12:20
@imjalpreet imjalpreet force-pushed the icebergPrestissimoJavaChanges branch from 9be76e6 to a3ebbb4 Compare January 18, 2024 12:25
@yingsu00 yingsu00 merged commit f75f9db into prestodb:master Jan 18, 2024
56 checks passed
Labels
iceberg Apache Iceberg related
Projects
Status: Done
2 participants