-
Notifications
You must be signed in to change notification settings - Fork 9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ABFS][FNSOverBlob] Making AbfsClient Abstract for supporting both DFS and Blob Endpoint #6846
[ABFS][FNSOverBlob] Making AbfsClient Abstract for supporting both DFS and Blob Endpoint #6846
Conversation
:::: AGGREGATED TEST RESULT :::: ============================================================
|
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great code; some comments. Thanks!
DFS, | ||
BLOB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets have field as endpointDnsSuffix, which would have something like .dfs.core.windows.net or .blob.core.windows.net.
Reason being, some account can have word 'blob' or 'dfs' as part of the accountName.
Also, can it happen that fsUri have some other suffix also (https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/filesystem/create?view=rest-storageservices-datalakestoragegen2-2019-12-12 doesnt mention any particular suffix) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. Will take this up.
I think we have decided to use blob endpoint only if user has configured .blob.core.windows.net
For all other endpoints, including custom DNS, we will default to DFS endpoint, unless otherwise stated by the new config fs.azure.fns.account.service.type
Will make this change as well.
private AbfsClientHandler clientHandler; | ||
private AbfsServiceType defaultServiceType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be made final.
/** | ||
* Abfs Exception to be thrown when operation is not supported. | ||
*/ | ||
public class UnsupportedAbfsOperationException extends AzureBlobFileSystemException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if PathIOException own this behavior. What you say?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used an existing UnsupportedOperationException instead of defining a new.
} | ||
|
||
// Constructor to be used for interacting with AbfsBlobClient | ||
@SuppressWarnings("checkstyle:ParameterNumber") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can add this rule in checkstyle-suppressions.xml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh is it.
Thanks for letting me know this.
...ols/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientHandler.java
Outdated
Show resolved
Hide resolved
...ols/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientHandler.java
Show resolved
Hide resolved
Configuration rawConfig = new Configuration(); | ||
rawConfig.addResource(TEST_CONFIGURATION_FILE_NAME); | ||
|
||
String fileSystemName = TEST_CONTAINER_PREFIX + UUID.randomUUID().toString(); | ||
String accountName = rawConfig.get(FS_AZURE_ACCOUNT_NAME, ""); | ||
if (accountName.isEmpty()) { | ||
// check if accountName is set using different config key | ||
accountName = rawConfig.get(FS_AZURE_ABFS_ACCOUNT_NAME, ""); | ||
} | ||
Assume.assumeFalse("Skipping test as account name is not provided", accountName.isEmpty()); | ||
|
||
Assume.assumeFalse("Blob Endpoint Works only with FNS Accounts", | ||
rawConfig.getBoolean(FS_AZURE_TEST_NAMESPACE_ENABLED_ACCOUNT, true)); | ||
accountName = setBlobEndpoint(accountName); | ||
|
||
AbfsConfiguration abfsConfig = new AbfsConfiguration(rawConfig, accountName); | ||
AuthType authType = abfsConfig.getEnum(FS_AZURE_ACCOUNT_AUTH_TYPE_PROPERTY_NAME, AuthType.SharedKey); | ||
String abfsScheme = authType == AuthType.SharedKey ? FileSystemUriSchemes.ABFS_SCHEME | ||
: FileSystemUriSchemes.ABFS_SECURE_SCHEME; | ||
final String abfsUrl = fileSystemName + "@" + accountName; | ||
URI defaultUri = null; | ||
|
||
try { | ||
defaultUri = new URI(abfsScheme, abfsUrl, null, null, null); | ||
} catch (Exception ex) { | ||
throw new AssertionError(ex); | ||
} | ||
|
||
String testUrl = defaultUri.toString(); | ||
abfsConfig.set(FS_DEFAULT_NAME_KEY, defaultUri.toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can inherit AbstractAbfsIntegrationTest
, which would give us all the boilerplate for running the test. What we can do, is get the configuration form the getFileSystem().getConf(); then, make changes to the config, and then return back filesystem created with FileSystem.newInstance().. What you feel?
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking all the comments. Have one request change.
This review is done on the difference patch from last review. Will do one more review for full pr.
if (uri.toString().contains(FileSystemUriSchemes.ABFS_BLOB_DOMAIN_NAME)) { | ||
return AbfsServiceType.BLOB; | ||
} | ||
// Incase of DFS Domain name or any other custom endpoint, the service | ||
// type is to be identified as default DFS. | ||
return AbfsServiceType.DFS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can have get(uri) in AbfsServiceType, which returns back the relevant enum. Also, we can remove FileSystemUriSchemes#ABFS_BLOB_DOMAIN_NAME, and ABFS_DFS_DOMAIN_NAME
💔 -1 overall
This message was automatically generated. |
throws AzureBlobFileSystemException { | ||
// Todo: [FnsOverBlob] - Remove this check, Failing FS Init with Blob Endpoint Until FNS over Blob is ready. | ||
if (getConfiguredServiceType() == AbfsServiceType.BLOB) { | ||
throw new InvalidConfigurationValueException(FS_DEFAULT_NAME_KEY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should also inform what is wrong with the config.
|
||
public AbfsBlobClient getBlobClient() { | ||
Preconditions.checkNotNull(blobAbfsClient, "Blob client is not initialized"); | ||
return (AbfsBlobClient) blobAbfsClient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can directly have fields of AbfsBlobClient and AbfsDfsClient, this would remove need of runtime casting.
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
|
||
final URL url = createRequestUrl(path, abfsUriQueryBuilder.toString()); | ||
final AbfsRestOperation op = getAbfsRestOperation( | ||
AbfsRestOperationType.GetPathStatus, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be GetBlobProperties
💔 -1 overall
This message was automatically generated. |
I'm losing track of this work. Why was it closed? |
This PR had grown too much in size and we thought of breaking it down into 2 smaller PRs covering the changes. To simplify commit history I created two smaller PRs targeting the same Work Item. Would be great if you could review the second PR. Thanks |
Description of PR
Scope of this task is to refactor the ABfsClient so that ABFSStore can choose to interact with the client it wants based on the endpoint configured by user.
The blob endpoint support will remain "Unsupported" until the whole code is checked-in and well tested.
The patch shows higher number of files changed because yetus was somehow failing with javadoc error for files not changed in this patch.
How was this patch tested?
This is a first patch to support Blob Endpoint for FNS Accounts. Full E2E tests will be added once the whole code is ready for this. Till then, user won't be able to use ABFS Driver with Blob Endpoint.
To verify the DFS Endpoint keeps working seamlessly, the whole test suite was run and validated. Few Additional tests were added to verify the Blob Endpoints APIs implemented but they won't run until the Support for Blob Endpoint is enabled.
Metric related tests are fixed in the the #6847
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?