enable browsing s3 via jupyter-fs #13
this is beautiful haha |
Hmm, so this just doesn't recognize 'directories' unless a specific empty object exists? So it probably won't work for readonly data buckets that don't do that? |
My exact workaround won't, though you could do a read-only version of my workaround where it returns a fake Info model as if it were a real one instead of creating the object and trying again. The advantage of my version is it will only take the fallback path one time for any given missing directory, and will do the 'right' thing forever after. |
This one appears to work for read-only:

import fs.errors
from fs.info import Info, ResourceType
from fs_s3fs import S3FS


class EnsureDirS3FS(S3FS):
    def getinfo(self, path, namespaces=None):
        try:
            return super().getinfo(path, namespaces)
        except fs.errors.ResourceNotFound as e:
            # workaround https://github.com/PyFilesystem/s3fs/issues/70
            # check if it's a directory with no corresponding Object (not created by S3FS)
            # scandir/getinfo don't work on missing directories, but listdir does
            # if it's really a directory, return stub Info instead of failing
            try:
                self.listdir(path)
            except fs.errors.ResourceNotFound:
                raise e from None
            else:
                # return fake Info
                # based on S3FS.getinfo handling of root (`/`)
                name = path.rstrip("/").rsplit("/", 1)[-1]
                return Info(
                    {
                        "basic": {
                            "name": name,
                            "is_dir": True,
                        },
                        "details": {"type": int(ResourceType.directory)},
                    }
                )
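To make the difference between the two fallbacks in this thread concrete, here is a toy model that runs without S3 or PyFilesystem. Everything in it is a hypothetical stand-in: `ToyBucket`, `StrictFS`, `CreateMarkerFS`, and `StubInfoFS` are illustration names, and the assumption that a directory is marked by an empty key ending in `/` is a simplification of the behaviour described above, not the actual S3FS implementation.

```python
class ToyBucket:
    """Hypothetical stand-in for an S3 bucket: just a flat set of keys."""

    def __init__(self, keys):
        self.keys = set(keys)

    def has_children(self, path):
        # True if any key lives "under" the path, even without a marker key.
        prefix = path.strip("/") + "/"
        return any(k.startswith(prefix) and k != prefix for k in self.keys)


class StrictFS:
    """Only recognizes a directory if an empty marker key 'dir/' exists."""

    def __init__(self, bucket):
        self.bucket = bucket

    def getinfo(self, path):
        key = path.strip("/")
        name = key.rsplit("/", 1)[-1]
        if key in self.bucket.keys:
            return {"name": name, "is_dir": False}
        if key + "/" in self.bucket.keys:
            return {"name": name, "is_dir": True}
        raise FileNotFoundError(path)


class CreateMarkerFS(StrictFS):
    """Write fallback: create the missing marker once, then retry."""

    def getinfo(self, path):
        try:
            return super().getinfo(path)
        except FileNotFoundError:
            if not self.bucket.has_children(path):
                raise
            # Needs write access; later calls hit the normal path.
            self.bucket.keys.add(path.strip("/") + "/")
            return super().getinfo(path)


class StubInfoFS(StrictFS):
    """Read-only fallback: never writes, builds a stub on every call."""

    def getinfo(self, path):
        try:
            return super().getinfo(path)
        except FileNotFoundError:
            if not self.bucket.has_children(path):
                raise
            name = path.rstrip("/").rsplit("/", 1)[-1]
            return {"name": name, "is_dir": True}


bucket = ToyBucket({"data/2024/a.nc"})
print(StubInfoFS(bucket).getinfo("data"))           # bucket is left untouched
print(CreateMarkerFS(bucket).getinfo("data/2024"))  # adds the marker key
print(sorted(bucket.keys))
```

The trade-off is the one described in the comments: the write version pays for the fallback once per missing directory but needs write access, while the stub version works on read-only buckets and repeats the extra listing on every lookup.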
@minrk this is great! How do you control the list of buckets that show up here? |
We're only working with one bucket. I think you need to explicitly list each bucket you want to mount in the resources config. Ours is here. The bucket name (or arbitrary subdir) is in the mount. |
Interesting! Was https://github.com/destination-earth/DestinE_ESA_GFTS/pull/13/files#diff-96599d676c72313e9986285fd7ab9d14b18d8bec6167a33056b85ad4d2529435R101 needed as well, even if you only want the sidebar to show up? |
Yes, the listing requests use the contents API at special paths. I'm not 100% sure why this is implemented by overriding ContentsManager rather than replicating the API on a different endpoint. |
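For readers following along, a minimal sketch of what such a configuration might look like, based on the settings documented in the jupyter-fs README rather than on the config linked in this thread. The bucket name and label are hypothetical, and the exact URL form accepted for S3 depends on the installed opener, so treat it as an assumption.

```python
# Fragment for jupyter_server_config.py; `c` is the config object Jupyter
# provides when it loads this file.

# jupyter-fs answers the sidebar's listing requests through the contents API,
# so its MetaManager replaces the default ContentsManager.
c.ServerApp.contents_manager_class = "jupyterfs.metamanager.MetaManager"

# One entry per bucket that should appear in the sidebar.
c.JupyterFs.resources = [
    {
        "name": "example-bucket",      # hypothetical label shown in the sidebar
        "url": "s3://example-bucket",  # hypothetical bucket (or bucket/subdir)
    },
]
```

Buckets that are not listed in `resources` do not show up, which matches the "explicitly list each bucket" remark above.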
@tinaok this adds an S3 browser to the JupyterLab sidebar.
@yuvipanda this is what I mentioned to you yesterday. It seems to work fine with JupyterLab 4, but I had to do some shenanigans to work around PyFilesystem/s3fs#70 because our files are not created with S3FS (they are created with s3fs), and S3FS makes some hard assumptions that it has created everything it might read (namely that an empty Object exists representing each directory level, which is not true in general). I did the definitely-totally-fine thing of catching the error that is raised when a directory lacks a corresponding Object and creating those empty Objects if they are missing.