Allow disabling virtual-hosted-style addressing #208
Comments
Hello, @DamienMatias!
@DamienMatias Hi - lakeFS maintainer here 👋🏽
I'm also interested in disabling virtual-hosted-style addressing. I would like to use the S3 Connector for PyTorch with an S3-compatible Ceph storage that cannot be configured for virtual-hosted-style addressing because of the DNS implications. Only path-style addressing can be used in our context.
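To make the DNS point concrete, here is a minimal sketch of how the two addressing styles shape the request URL. The helper `object_url` and the endpoint `ceph.example.internal` are made up for illustration; they are not part of s3torchconnector or any S3 client.

```py
from urllib.parse import quote, urlsplit


def object_url(endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Build the URL a client would request for one object (illustrative helper)."""
    parts = urlsplit(endpoint)
    encoded_key = quote(key, safe="/")
    if path_style:
        # Path-style: the bucket is the first path segment, the host is unchanged.
        return f"{parts.scheme}://{parts.netloc}/{bucket}/{encoded_key}"
    # Virtual-hosted-style: the bucket becomes a subdomain, so DNS has to
    # resolve <bucket>.<host> for every bucket.
    return f"{parts.scheme}://{bucket}.{parts.netloc}/{encoded_key}"


ENDPOINT = "https://ceph.example.internal:7480"  # hypothetical endpoint
print(object_url(ENDPOINT, "train", "img/0001.jpg", path_style=True))
# https://ceph.example.internal:7480/train/img/0001.jpg
print(object_url(ENDPOINT, "train", "img/0001.jpg", path_style=False))
# https://train.ceph.example.internal:7480/img/0001.jpg
```

With path-style requests only the endpoint host name needs to resolve, which is why storage without per-bucket DNS entries depends on it.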
The S3 client we use supports disabling virtual-hosted-style addressing here, so I think this would just be a matter of plumbing through a new flag from the various constructors (
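As a rough illustration of what "plumbing through a new flag" could look like, here is a sketch using stand-in classes. `ClientConfig`, `FakeClient`, and `Dataset` are hypothetical names invented for this example and do not reflect the connector's actual internals.

```py
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical config object carrying client-level options."""
    force_path_style: bool = False


class FakeClient:
    """Hypothetical low-level client; the only place that acts on the flag."""

    def __init__(self, region: str, force_path_style: bool = False):
        self.region = region
        self.force_path_style = force_path_style


class Dataset:
    """Hypothetical high-level class that simply forwards the config."""

    def __init__(self, uri: str, region: str, config: Optional[ClientConfig] = None):
        self.uri = uri
        self._config = config or ClientConfig()
        # The flag is passed down untouched; defaults stay backward compatible.
        self._client = FakeClient(region, force_path_style=self._config.force_path_style)

    @classmethod
    def from_prefix(cls, uri: str, *, region: str,
                    config: Optional[ClientConfig] = None) -> "Dataset":
        return cls(uri, region, config)


dataset = Dataset.from_prefix(
    "s3://bucket/prefix", region="us-east-1", config=ClientConfig(force_path_style=True)
)
print(dataset._client.force_path_style)
```

Grouping the option in a config object rather than adding a keyword to each constructor keeps signatures stable when more client options are added later.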
This PR extends support to other S3-compatible object stores, such as MinIO, that use path-style addressing to access buckets and objects. For example:

```py
from s3torchconnector import S3MapDataset, S3IterableDataset

DATASET_URI = "s3://<BUCKET>/<PREFIX>"
REGION = "us-east-1"

# Stream objects under the prefix using path-style requests.
iterable_dataset = S3IterableDataset.from_prefix(DATASET_URI, region=REGION, path_style=True)
for item in iterable_dataset:
    print(item.key)

# Random access to objects under the same prefix.
map_dataset = S3MapDataset.from_prefix(DATASET_URI, region=REGION, path_style=True)
item = map_dataset[0]
bucket = item.bucket
key = item.key
content = item.read()
len(content)
```

And

```py
from s3torchconnector import S3Checkpoint
import torchvision
import torch

CHECKPOINT_URI = "s3://<BUCKET>/<KEY>/"
REGION = "us-east-1"
checkpoint = S3Checkpoint(region=REGION, path_style=True)

model = torchvision.models.resnet18()
# Save the checkpoint to S3.
with checkpoint.writer(CHECKPOINT_URI + "epoch0.ckpt") as writer:
    torch.save(model.state_dict(), writer)

# Load the checkpoint back from S3.
with checkpoint.reader(CHECKPOINT_URI + "epoch0.ckpt") as reader:
    state_dict = torch.load(reader)
model.load_state_dict(state_dict)
```

Fixes awslabs#208

Signed-off-by: Bala.FA <[email protected]>
This PR extends support to other S3-compatible object stores, such as MinIO, that use path-style addressing to access buckets and objects. For example:

```py
from s3torchconnector import S3ClientConfig, S3MapDataset, S3IterableDataset

DATASET_URI = "s3://<BUCKET>/<PREFIX>"
REGION = "us-east-1"
s3client_config = S3ClientConfig(path_style=True)

# Stream objects under the prefix using path-style requests.
iterable_dataset = S3IterableDataset.from_prefix(DATASET_URI, region=REGION, s3client_config=s3client_config)
for item in iterable_dataset:
    print(item.key)

# Random access to objects under the same prefix.
map_dataset = S3MapDataset.from_prefix(DATASET_URI, region=REGION, s3client_config=s3client_config)
item = map_dataset[0]
bucket = item.bucket
key = item.key
content = item.read()
len(content)
```

And

```py
from s3torchconnector import S3Checkpoint, S3ClientConfig
import torchvision
import torch

CHECKPOINT_URI = "s3://<BUCKET>/<KEY>/"
REGION = "us-east-1"
s3client_config = S3ClientConfig(path_style=True)
checkpoint = S3Checkpoint(region=REGION, s3client_config=s3client_config)

model = torchvision.models.resnet18()
# Save the checkpoint to S3.
with checkpoint.writer(CHECKPOINT_URI + "epoch0.ckpt") as writer:
    torch.save(model.state_dict(), writer)

# Load the checkpoint back from S3.
with checkpoint.reader(CHECKPOINT_URI + "epoch0.ckpt") as reader:
    state_dict = torch.load(reader)
model.load_state_dict(state_dict)
```

Fixes awslabs#208

Signed-off-by: Bala.FA <[email protected]>
This PR extends support to other S3-compatible object stores, such as MinIO, that use path-style addressing to access buckets and objects. For example:

```py
from s3torchconnector import S3ClientConfig, S3MapDataset, S3IterableDataset

DATASET_URI = "s3://<BUCKET>/<PREFIX>"
REGION = "us-east-1"
s3client_config = S3ClientConfig(force_path_style=True)

# Stream objects under the prefix using path-style requests.
iterable_dataset = S3IterableDataset.from_prefix(DATASET_URI, region=REGION, s3client_config=s3client_config)
for item in iterable_dataset:
    print(item.key)

# Random access to objects under the same prefix.
map_dataset = S3MapDataset.from_prefix(DATASET_URI, region=REGION, s3client_config=s3client_config)
item = map_dataset[0]
bucket = item.bucket
key = item.key
content = item.read()
len(content)
```

And

```py
from s3torchconnector import S3Checkpoint, S3ClientConfig
import torchvision
import torch

CHECKPOINT_URI = "s3://<BUCKET>/<KEY>/"
REGION = "us-east-1"
s3client_config = S3ClientConfig(force_path_style=True)
checkpoint = S3Checkpoint(region=REGION, s3client_config=s3client_config)

model = torchvision.models.resnet18()
# Save the checkpoint to S3.
with checkpoint.writer(CHECKPOINT_URI + "epoch0.ckpt") as writer:
    torch.save(model.state_dict(), writer)

# Load the checkpoint back from S3.
with checkpoint.reader(CHECKPOINT_URI + "epoch0.ckpt") as reader:
    state_dict = torch.load(reader)
model.load_state_dict(state_dict)
```

Fixes awslabs#208

Signed-off-by: Bala.FA <[email protected]>
I'll reopen this: while it's merged, there's no new release yet. (Looks like I mistakenly linked closing the PR to closing this issue.) This should be supported in the next published version!
This feature was released in v1.2.5, so closing the request.
Tell us more about this new feature.
Hello,
I was wondering if there is a way to force the use of path-style requests instead of the default virtual-hosted-style addressing?
This is actually possible with mountpoint-s3 as you can see here in their documentation.
This would allow us to use the LakeFS S3 Gateway, and potentially other tools that are still built around path-style addressing.
Thank you 🙏