Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsigned Access to Pubilc S3 Data #6421

Merged
merged 11 commits into from
Aug 29, 2022
Merged

Unsigned Access to Pubilc S3 Data #6421

merged 11 commits into from
Aug 29, 2022

Conversation

fm3
Copy link
Member

@fm3 fm3 commented Aug 24, 2022

I had to make two changes in the s3fs library. To do that, I copied it into webKnossos (and added the original MIT license in the package). I’ll mark the relevant changes I made to the library below to make the review easier.

URL of deployed dev instance (used for testing):

Steps to test:

  • In add remote zarr dataset view, paste s3 uri with no credentials, e.g. s3://power-analysis-ready-datastore.s3.amazonaws.com/power_901_monthly_meteorology_utc.zarr/EVLAND, click Add Layer, should yield a datasource
  • Also try abbreviated s3 syntax, e.g. s3://power-analysis-ready-datastore/power_901_monthly_meteorology_utc.zarr/EVLAND (no s3.amazonaws.com), should also work
  • Also make sure that s3 access with credentials still works

Issues:

Note that this also includes the changes of #6422 because I needed them for testing. I’ll take care of merge conflicts if they come up.


@fm3 fm3 self-assigned this Aug 24, 2022
@fm3 fm3 changed the title [WIP] Unsigned Access to Pubilc S3 Data Unsigned Access to Pubilc S3 Data Aug 25, 2022
@fm3 fm3 marked this pull request as ready for review August 25, 2022 07:45
@fm3 fm3 requested a review from jstriebel August 25, 2022 07:45
@jstriebel
Copy link
Contributor

I’ll mark the relevant changes I made to the library below to make the review easier.

@fm3: Did you mark those already? I wasn't quite sure which are the new parts, when skimming through the changes.

@fm3
Copy link
Member Author

fm3 commented Aug 25, 2022

Sorry, apparently I forgot hitting submit 🙈

@jstriebel
Copy link
Contributor

I’ll mark the relevant changes I made to the library below to make the review easier.

@fm3: Did you mark those already? I wasn't quite sure which are the new parts, when skimming through the changes.

@fm3
Copy link
Member Author

fm3 commented Aug 25, 2022

Now I’m confused, did you intend to send the same message again?

@jstriebel
Copy link
Contributor

I’ll mark the relevant changes I made to the library below to make the review easier.

@fm3: Did you mark those already? I wasn't quite sure which are the new parts, when skimming through the changes.

@jstriebel
Copy link
Contributor

Now I’m confused, did you intend to send the same message again?

No, sorry. Gh was hiccuping, I guess that's why it was sent twice. Thanks for the comments 👍

Copy link
Contributor

@jstriebel jstriebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, code LGTM! During testing I found one issue, but feel free to schedule this separately:

During testing I found a case where this logic seemed to have failed:
Importing s3://fmi-opendata-silam-surface-zarr/global/20220725/silam_glob_v5_7_1_20220725_SO2_d0.zarr/SO2/ does not work, the logs show

Error when reading s3://fmi-opendata-silam-surface-zarr.s3.amazonaws.com/global/20220725/silam_glob_v5_7_1_20220725_SO2_d0.zarr/SO2/ as multiscales group: Failed to read OME NGFF header at s3://fmi-opendata-silam-surface-zarr.s3.amazonaws.com/global/20220725/silam_glob_v5_7_1_20220725_SO2_d0.zarr/SO2/.zattrs <~ Failed to read remote file <~ The specified bucket does not exist (Service: Amazon S3; Status Code: 404; Error Code: NoSuchBucket; Request ID: 5EYWMNERW872C2ZG; S3 Extended Request ID: ApjZOT6sx7zk7IdBFb2OLNjUqLrgagDetMDYG0Ra2Z2XU+tiuN4GAxJqiq6CPuJ9aJKSo7jBwRM=; Proxy: null)

However, using the https version worked (not showing data yet, I might need to wait longer, but import was successful):
https://fmi-opendata-silam-surface-zarr.s3.amazonaws.com/global/20220725/silam_glob_v5_7_1_20220725_SO2_d0.zarr/SO2/

@normanrz
Copy link
Member

Would it be possible to open a PR on the original s3fs library with the changes?

@fm3
Copy link
Member Author

fm3 commented Aug 29, 2022

During testing I found a case where this logic seemed to have failed

Huh, ok, I’ll have another look and try to see which uri versions work. If I need to replace the host, the logic will become more involved, though, so I might move this to a follow-up issue.

Would it be possible to open a PR on the original s3fs library with the changes?

Possibly, though it would need some more work. My current changes break some functionality of the original client. If no user/password was passed, it would then look in the provider chain, meaning on disk, env vars etc. And fail if none of those places had something. I would have to create a way to parameterize the code to tell it not to do that. I would like doing that, but it should also not block this PR. /Edit: follow-up issue #6430

@fm3
Copy link
Member Author

fm3 commented Aug 29, 2022

During testing I found a case where this logic seemed to have failed

Neither expanded versions seem to work ( s3://fmi-opendata-silam-surface-zarr.s3.amazonaws.com/global/20220725/silam_glob_v5_7_1_20220725_SO2_d0.zarr/SO2/, s3://s3.amazonaws.com/fmi-opendata-silam-surface-zarr/global/20220725/silam_glob_v5_7_1_20220725_SO2_d0.zarr/SO2/ ) so I guess it would then have to be tried also as https directly. This is not easy to integrate in the fs abstraction :( I’d say this is out of scope for this PR, and I am unsure how to tackle this later 🤔

@fm3 fm3 merged commit c40f99a into master Aug 29, 2022
@fm3 fm3 deleted the s3-unsigned branch August 29, 2022 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow loading data from public S3 buckets without credentials
3 participants