Specify S3 credentials directly #148
Comments
For GCS and S3 we have avoided storing credentials in the spec or context spec, as that seems more prone to leaking than using a credentials file. The credentials file seems like the best approach, because you can then store the credentials in the same way for the aws cli and tensorstore. You could use one credentials file specific to each user, then use the filename/profile of the aws_credentials object as part of the spec. https://google.github.io/tensorstore/kvstore/s3/index.html#json-Context.aws_credentials
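As a rough sketch of that suggestion, a per-user spec could point at a credentials file and a profile inside it. The helper name `s3_kvstore_spec`, the bucket, the file path, and the profile name are all hypothetical; the `filename` and `profile` keys of `aws_credentials` follow the linked documentation and may differ between tensorstore versions.

```python
from pathlib import Path


def s3_kvstore_spec(bucket: str, path: str, credentials_file: Path, profile: str) -> dict:
    """Build an S3 kvstore spec that names a credentials file and profile.

    Hypothetical helper: it only builds the JSON-like dict and never contacts S3.
    """
    return {
        "driver": "s3",
        "bucket": bucket,
        "path": path,
        # Assumed layout of the aws_credentials resource (filename + profile).
        "aws_credentials": {
            "filename": str(credentials_file),
            "profile": profile,
        },
    }


# Placeholder values; one credentials file per user, as suggested above.
spec = s3_kvstore_spec("example-bucket", "arrays/a.zarr/", Path("/srv/creds/alice"), "alice")
```

The resulting dict would then be passed as the `kvstore` member of a tensorstore spec, so the secret itself never appears in the spec, only the location of the file that holds it.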
Thanks @laramiel. For other people who might be interested in this, I wrote a helper that creates a temporary profile file to be used with tensorstore:

from functools import lru_cache
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Self

import tensorstore


class AWSCredentialManager:
    # Maps hash((access_key_id, secret_access_key)) to the key pair.
    entries: dict[int, tuple[str, str]]
    temp_dir: TemporaryDirectory[str]
    credentials_file_path: Path

    @classmethod
    @lru_cache
    def singleton(cls) -> "Self":
        # Cached per class, so every caller shares one manager and one file.
        return cls()

    def __init__(self) -> None:
        self.entries = {}
        self.temp_dir = TemporaryDirectory()
        self.credentials_file_path = Path(self.temp_dir.name) / "aws_credentials"
        self.credentials_file_path.touch()

    def _dump_credentials(self) -> None:
        # Rewrite the whole file with one profile section per known key pair.
        self.credentials_file_path.write_text(
            "\n".join(
                [
                    f"[profile-{key_hash}]\naws_access_key_id = {access_key_id}\naws_secret_access_key = {secret_access_key}\n"
                    for key_hash, (
                        access_key_id,
                        secret_access_key,
                    ) in self.entries.items()
                ]
            )
        )

    def add(self, access_key_id: str, secret_access_key: str) -> dict[str, str]:
        # Register the key pair and return an aws_credentials object for the spec.
        key_tuple = (access_key_id, secret_access_key)
        key_hash = hash(key_tuple)
        self.entries[key_hash] = key_tuple
        self._dump_credentials()
        return {
            "profile": f"profile-{key_hash}",
            "filename": str(self.credentials_file_path),
            "metadata_endpoint": "",
        }


aws_credential_manager = AWSCredentialManager.singleton()

spec = {
    "driver": "s3",
    "bucket": "...",
    "path": "...",
    "endpoint": "https://s3.eu-central-1.amazonaws.com",
    "aws_credentials": aws_credential_manager.add("AKIA...", "..."),
}

array = tensorstore.open({"driver": "zarr", "kvstore": spec}).result()
data = array[:, :, :].read().result()
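One possible variation on the file-writing step, not the helper above itself: a minimal sketch that uses the standard library's `configparser` to add or replace a single profile, and creates the file with owner-only permissions to reduce the chance of leaking it. The function name `write_profile` and the profile names are hypothetical, and the `0o600` mode is only meaningful on POSIX systems.

```python
import configparser
import os
import tempfile
from pathlib import Path


def write_profile(credentials_file: Path, profile: str,
                  access_key_id: str, secret_access_key: str) -> None:
    """Add or replace one profile section in an INI-style credentials file."""
    # interpolation=None keeps values containing '%' from being rejected.
    config = configparser.ConfigParser(interpolation=None)
    config.read(credentials_file)  # a missing file is silently skipped
    config[profile] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    # If the file is created here, it gets mode 0o600 (owner read/write only).
    fd = os.open(credentials_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        config.write(f)


# Usage with placeholder keys: two users end up as two sections in one file.
with tempfile.TemporaryDirectory() as tmp:
    creds = Path(tmp) / "aws_credentials"
    write_profile(creds, "user-a", "AKIAEXAMPLEA", "secret-a")
    write_profile(creds, "user-b", "AKIAEXAMPLEB", "secret-b")
```

Reading the existing file before writing means profiles added by earlier calls are preserved, at the cost of a read-modify-write that would need a lock if several threads add credentials at once.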
Just a note about your spec: you should be able to use ...
I am happy to see that S3 support has arrived in tensorstore. I was wondering if it would be possible to add support to set the AWS credentials directly in the kvstore json?
Our use case is a server application that handles arrays from multiple users and stores the credentials. Setting env vars is inconvenient because there can be multiple requests in parallel. Writing out a credentials file also seems leaky.