UPath created with specific filesystem does not use that filesystem #376

Closed
alex-at-reliant opened this issue May 16, 2025 · 3 comments
Labels
question ❓ Further information is requested

Comments

@alex-at-reliant

I am setting up

import fsspec
from upath import UPath

fs = fsspec.filesystem(
    "filecache",
    target_protocol="gs",
    cache_storage="/tmp/some/cache/folder/",
)

to do

path = UPath("gs://bucket/some/path/to/some/file.json", fs=fs)

Then path.read_text() returns the right file content, but no file is cached in /tmp/some/cache/folder/. Moreover, path.fs returns the vanilla gcsfs.core.GCSFileSystem, not the expected WholeFileCacheFileSystem.

Am I using the API wrong? How does one force a filesystem to be used with upath?
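
For comparison, here is a minimal sketch of the same filecache setup exercised through fsspec alone, without UPath, to confirm that the caching layer itself writes into the cache folder. The helper names `read_through_filecache` and `demo` are hypothetical, and the default `target_protocol="file"` with temporary directories is an assumption made so the sketch runs without cloud credentials; the setup in this issue uses `"gs"`.

```python
import os
import tempfile


def read_through_filecache(source_path, cache_dir, target_protocol="file"):
    """Read a file through fsspec's filecache layer and return its bytes.

    target_protocol="file" is an assumption for a local demo; the issue uses "gs".
    """
    import fsspec  # imported lazily so the helper can be defined without fsspec

    fs = fsspec.filesystem(
        "filecache",
        target_protocol=target_protocol,
        cache_storage=cache_dir,
    )
    # Opening through the caching filesystem copies the whole file into
    # cache_dir first, then reads from that local copy.
    with fs.open(source_path, "rb") as f:
        return f.read()


def demo():
    """Create a local file, read it via filecache, and list the cache folder."""
    with tempfile.TemporaryDirectory() as src_dir, \
            tempfile.TemporaryDirectory() as cache_dir:
        source_path = os.path.join(src_dir, "file.json")
        with open(source_path, "w") as f:
            f.write('{"hello": "world"}')
        data = read_through_filecache(source_path, cache_dir)
        # Return the cache folder listing so the caller can inspect it.
        return data, os.listdir(cache_dir)
```

Calling `demo()` returns the file content together with the entries fsspec created in the cache folder, which is the behaviour that was expected from `path.read_text()` above.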

alex-at-reliant changed the title from "UPath created with specific filesystem does not use that filesystem?" to "UPath created with specific filesystem does not use that filesystem" on May 16, 2025
@ap--
Collaborator

ap-- commented May 16, 2025

Hi Alex,

there are two issues at hand here.

  1. UPath.__init__ does not support a fs keyword argument, which means you can't provide an instantiated filesystem to UPath.

  2. Chained filesystems, such as the filecache setup you're trying to use, are currently not supported. Once #346 ("wip: start implementing chain functionality") is completed and merged, this functionality will be available and supported.
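
To see the first point in practice, the sketch below inspects which protocol, filesystem class, and storage options a UPath actually ended up with. It assumes `UPath` exposes the `protocol`, `fs`, and `storage_options` attributes, as recent universal_pathlib releases do to my knowledge. `describe` and `demo` are hypothetical helper names, and the in-memory URL is an assumption that stands in for the `gs://` path so no credentials are needed.

```python
def describe(path):
    """Summarise how a UPath-like object resolved its filesystem."""
    return {
        "protocol": path.protocol,
        "fs_class": type(path.fs).__name__,
        "storage_options": dict(path.storage_options),
    }


def demo():
    """Build a UPath on fsspec's in-memory filesystem and describe it."""
    from upath import UPath  # imported lazily; requires universal_pathlib

    # "memory://" is a stand-in for "gs://" (assumption, avoids credentials).
    path = UPath("memory:///some/path/to/some/file.json")
    return describe(path)
```

Running `describe(path)` on the path from the original report is a quick way to check whether the filesystem in use is the caching one or the plain target filesystem.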

That being said, you could try the following right now, though it is untested and might not work:

from upath import UPath

# Even if this works, I would not recommend using it.
# Better to wait for the next release, when chained urlpath support is available.
pth = UPath(
    "bucket/some/path/to/some/file.json",
    protocol="filecache",
    target_protocol="gs",
    cache_storage="/tmp/some/cache/folder/",
)

Also, if you don't mind me asking: was there some text in the documentation that made you assume you can pass fs=filesystem_instance to UPath as a keyword argument?

@ap--
Collaborator

ap-- commented May 16, 2025

Partial duplicate of #207

ap-- marked this as a duplicate of #207 on May 16, 2025
ap-- closed this as completed on May 16, 2025
ap-- added the "question ❓ Further information is requested" label on May 16, 2025
@alex-at-reliant
Author

Thanks a lot for the quick feedback. Unfortunately, I don't remember where I came across this usage in the flurry of searches I did on how to cache things with fsspec and upath. I also remember not being able to confirm whether filesystem or fs is a viable kwarg for upath.UPath just by looking at the code, because of the indirection in how kwargs are passed along.
