Support buckets containing underscore #263

Merged (3 commits) on Sep 9, 2024

@eric-maynard (Contributor) commented Sep 4, 2024

Description

Currently, we use URI.getHost to extract the bucket from paths. It returns null when the bucket name contains an underscore, because an underscore is not a valid hostname character and java.net.URI then does not parse the authority as a host. This seems to be a known issue.

This PR introduces a thin new utility for extracting the bucket from a path and adds a test for it.
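
To illustrate the behavior described above, here is a minimal sketch using only java.net.URI. The class name BucketFromUri, the helper getBucket, and the example bucket names are hypothetical and are not taken from this PR's diff; reading the authority component is one plausible approach, assumed here for illustration.

```java
import java.net.URI;

class BucketFromUri {
    // Hypothetical helper (an assumption, not the PR's actual utility).
    // getAuthority() returns the raw authority component even when it is
    // not a valid hostname, so an underscore does not turn it into null.
    static String getBucket(String location) {
        return URI.create(location).getAuthority();
    }

    public static void main(String[] args) {
        URI uri = URI.create("gs://my_bucket/path/to/file");

        // '_' is not legal in a hostname, so no host is parsed.
        System.out.println(uri.getHost());      // prints "null"

        // The authority is still available as a plain string.
        System.out.println(uri.getAuthority()); // prints "my_bucket"

        // Buckets without underscores behave the same way through the helper.
        System.out.println(getBucket("gs://my-bucket/path/to/file")); // prints "my-bucket"
    }
}
```

Note that the authority also includes any user-info or port present in the URI, so this sketch only suits locations where the authority is exactly the bucket name.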

Fixes #262

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation update
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Added a new suite, StorageUtilTest.

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings

@@ -73,7 +73,7 @@ PolarisStorageIntegration<T> getStorageIntegrationForConfig(
                   HttpTransportFactory.class, NetHttpTransport::new));
         } catch (IOException e) {
           throw new RuntimeException(
-              "Error initializing default google credentials" + e.getMessage());
+              "Error initializing default google credentials. " + e.getMessage());
Contributor

nit: not related to this fix.

@nk1506 (Contributor) left a comment

LGTM.

}

@ParameterizedTest
@ValueSource(strings = {"s3", "gcs", "azure"})
Contributor

nit: do you think the values here should be in line with the schemes Iceberg supports? Something like
s3, s3a, s3n, gs, abfs, abfss

Contributor Author

For the purposes of this PR, these need to match the schemes listed here

@RussellSpitzer enabled auto-merge (squash) September 9, 2024 21:56

@RussellSpitzer (Member)

Thanks @eric-maynard and thanks @nk1506 for review

@RussellSpitzer merged commit bfe3f2f into apache:main Sep 9, 2024
3 checks passed
Successfully merging this pull request may close these issues.

[BUG] GCS buckets with underscores are resolved as null