-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for out of band access tokens in the GCS Client #11181
base: main
Are you sure you want to change the base?
Add support for out of band access tokens in the GCS Client #11181
Conversation
ArnavBalyan
commented
Oct 7, 2024
- Currently the GCS client supports only keyfiles for credential based authentication. (However this is not always safe and many usecases leverage access tokens for storage access).
- This change adds support for access token based authentication when interacting with GCS.
- Access tokens can be fetched using an out of band fetch mechanism such as token-broker service or a custom token provider.
- This change adds the contract and ability to use access tokens, the user must bundle their custom token provider which can interact with internal systems and provide an access token on demand.
Hi @ArnavBalyan! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at [email protected]. Thanks! |
✅ Deploy Preview for meta-velox canceled.
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
@ArnavBalyan I added another change to GCS config here #11235. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ArnavBalyan The PR looks incomplete. The CMakeLists.txt changes are missing. Can you complete this and ensure the build is passing?
Are you able to test this locally?
@@ -28,6 +28,8 @@ | |||
#include <stdexcept> | |||
|
|||
#include <google/cloud/storage/client.h> | |||
#include "TokenProvider.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
paths must be relative to root.
Example: velox/connector/velox/connectors/hive/storage_adapters/gcs/TokenProvider.h
Please fix this at other places as well.
@@ -175,6 +180,7 @@ class GCSReadFile final : public ReadFile { | |||
} | |||
|
|||
std::shared_ptr<gcs::Client> client_; | |||
std::function<void(std::shared_ptr<gcs::Client>&, const std::optional<std::string>&, const std::optional<std::vector<std::string>>&)> refreshTokenFn_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment here about the expected inputs are for the refereshTokenFn_
@@ -244,6 +252,7 @@ class GCSWriteFile final : public WriteFile { | |||
|
|||
gcs::ObjectWriteStream stream_; | |||
std::shared_ptr<gcs::Client> client_; | |||
std::function<void(std::shared_ptr<gcs::Client>&, const std::optional<std::string>&, const std::optional<std::vector<std::string>>&)> refreshTokenFn_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment here as well.
@@ -267,8 +276,12 @@ class GCSFileSystem::Impl { | |||
// Use the input Config parameters and initialize the GCSClient. | |||
void initializeClient() { | |||
auto options = gc::Options{}; | |||
|
|||
std::string accessTokenEnabled = | |||
hiveConfig_->config()->get<std::string>("gcs.access_token_enabled", "false"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add new configs in HiveConfig.cpp
/HiveConfig.h
if (accessTokenEnabled == "true") { | ||
std::string providerClassName = | ||
hiveConfig_->config()->get<std::string>("gcs.access_token_provider", ""); | ||
throw std::runtime_error("AccessTokenProvider not configured. Please implement an out of band access token provider."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add some checks here?
I would LOG an error here and let the GCS library handle the invalid config
client_ = std::make_shared<gcs::Client>(options); | ||
} | ||
|
||
std::shared_ptr<gcs::Client> getClient() const { | ||
return client_; | ||
} | ||
|
||
std::shared_ptr<OutOfBandOAuth2Credentials> credentials_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should go to the private: section below.
#ifndef OUT_OF_BAND_OAUTH2_CREDENTIALS_H | ||
#define OUT_OF_BAND_OAUTH2_CREDENTIALS_H |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who needs this? gcs library?
Hi @majetideepak, thanks for the review, I have to update the CMakeLists. I'll address the comments and let you know |