Cannot access file with name returned by ls via fsspec interface #198

Open
mxmlnkn opened this issue Oct 12, 2024 · 0 comments

Consider this setup, which serves the same directory over plain HTTP (port 8000) and over WebDAV (port 8047):

pip install wsgidav cheroot
mkdir -p /tmp/served
echo foo > /tmp/served/'#not-a-good-name!'
ruby -run -e httpd /tmp/served/ --port 8000 --bind-address=127.0.0.1 &
wsgidav --host=127.0.0.1 --port=8047 --root="/tmp/served" --auth=anonymous &

This works in fsspec.implementations.http:

import pprint
from fsspec.implementations.http import HTTPFileSystem as HFS
url = "http://127.0.0.1:8000"
fs = HFS(url)
# What I would have expected to work:
# result = fs.ls("/")
result = fs.ls(url)
pprint.pprint(result)
# The file entry is at index 2 in the listing shown below.
pprint.pprint(fs.stat(result[2]['name']))

Output:

[{'name': 'http://127.0.0.1:8000/?N=D', 'size': None, 'type': 'file'},
 {'name': 'http://127.0.0.1:8000/?S=D', 'size': None, 'type': 'file'},
 {'name': 'http://127.0.0.1:8000/%23not-a-good-name!',
  'size': None,
  'type': 'file'},
 {'name': 'http://127.0.0.1:8000/?M=D', 'size': None, 'type': 'file'}]
{'ETag': '4013af-4-670a7539',
 'mimetype': 'application/octet-stream',
 'name': 'http://127.0.0.1:8000/%23not-a-good-name!',
 'size': 4,
 'type': 'file',
 'url': 'http://127.0.0.1:8000/%23not-a-good-name!'}

However, with the webdav4 fsspec implementation, the name returned by ls cannot be passed back to stat unchanged:

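import pprint
import urllib.parse
from webdav4.fsspec import WebdavFileSystem as WFS
fs = WFS("http://127.0.0.1:8047")
result = fs.ls("/")
pprint.pprint(result)
# Works: the name is percent-encoded before the lookup.
pprint.pprint(fs.stat("/" + urllib.parse.quote(result[0]['name'])))
# Fails: the name is passed exactly as ls returned it.
pprint.pprint(fs.stat("/" + result[0]['name']))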

Output:

[{'content_language': None,
  'content_type': 'application/octet-stream',
  'created': datetime.datetime(2024, 10, 12, 13, 10, 17, tzinfo=tzutc()),
  'display_name': '#not-a-good-name!',
  'etag': '4199343-1728738617-4',
  'href': '/%23not-a-good-name!',
  'modified': datetime.datetime(2024, 10, 12, 13, 10, 17, tzinfo=datetime.timezone.utc),
  'name': '#not-a-good-name!',
  'size': 4,
  'type': 'file'}]

{'content_language': None,
 'content_type': 'application/octet-stream',
 'created': datetime.datetime(2024, 10, 12, 13, 10, 17, tzinfo=tzutc()),
 'display_name': '#not-a-good-name!',
 'etag': '4199343-1728738617-4',
 'href': '/%23not-a-good-name!',
 'modified': datetime.datetime(2024, 10, 12, 13, 10, 17, tzinfo=datetime.timezone.utc),
 'name': '#not-a-good-name!',
 'size': 4,
 'type': 'file'}

Traceback (most recent call last):
  File "/media/d/Myself/projects/ratarmount/worktrees/1/test-webdav.py", line 8, in <module>
    pprint.pprint(fs.stat("/" + result[0]['name']))
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/fsspec/spec.py", line 1605, in stat
    return self.info(path, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/webdav4/fsspec.py", line 126, in info
    return translate_info(self.client.info(path))
                          ^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/webdav4/client.py", line 519, in info
    result = self.propfind(path, headers={"Depth": "1"})
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/webdav4/client.py", line 309, in propfind
    http_resp = self.with_retry(call)
                ^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/webdav4/func_utils.py", line 47, in wrapped_function
    return func()
           ^^^^^^
  File "~/.local/lib/python3.12/site-packages/webdav4/func_utils.py", line 70, in wrapped
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/webdav4/client.py", line 354, in _request
    url = self.join_url(path, add_trailing_slash=add_trailing_slash)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/webdav4/client.py", line 291, in join_url
    return join_url(self.base_url, path, add_trailing_slash=add_trailing_slash)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/webdav4/urls.py", line 25, in join_url
    return base_url.copy_with(path=path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/httpx/_urls.py", line 356, in copy_with
    return URL(self, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/httpx/_urls.py", line 119, in __init__
    self._uri_reference = url._uri_reference.copy_with(**kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/httpx/_urlparse.py", line 137, in copy_with
    return urlparse("", **defaults)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/httpx/_urlparse.py", line 225, in urlparse
    raise InvalidURL(f"Invalid URL component '{key}'")
httpx.InvalidURL: Invalid URL component 'path'
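The error is consistent with how '#' is treated in URLs: it starts the fragment, so it cannot appear unescaped inside a path. The following sketch uses only the standard library to show the difference between the raw name and the percent-encoded href from the listing above. That httpx rejects the path for this reason is my reading of the traceback, not something I have verified in the httpx source.

```python
from urllib.parse import urlsplit

base = "http://127.0.0.1:8047"

# Raw name as returned by ls: everything after '#' is parsed as the fragment.
raw = urlsplit(base + "/" + "#not-a-good-name!")
print(raw.path, raw.fragment)

# Percent-encoded href from the same ls entry: the whole name stays in the path.
encoded = urlsplit(base + "/%23not-a-good-name!")
print(encoded.path, encoded.fragment)
```

With the raw name, the path seen by a URL parser is just "/" and the file name ends up in the fragment, which is never sent to the server.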

Having to call urllib.parse.quote myself was unexpected, especially since WebdavFileSystem, in contrast to HTTPFileSystem, does not even require the full URL. It is also inconsistent that the returned name is a relative path instead of an absolute one, although I am not sure what it should be. I think fsspec does not specify this sufficiently.
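For reference, a minimal sketch of the workaround, assuming the caller has to percent-encode names itself. The helper name to_webdav_path is hypothetical and not part of webdav4 or fsspec, and the entry dict is copied from the ls output above rather than fetched from a server.

```python
from urllib.parse import quote, unquote


def to_webdav_path(name: str) -> str:
    """Hypothetical helper: percent-encode a name from ls() and make it absolute."""
    return "/" + quote(name.lstrip("/"))


# Fields copied from the ls output above.
entry = {"name": "#not-a-good-name!", "href": "/%23not-a-good-name!"}

path = to_webdav_path(entry["name"])
print(path)

# quote() also encodes '!', so the string differs from href,
# but both decode to the same path.
assert unquote(path) == unquote(entry["href"])
```

The result could then be passed as fs.stat(to_webdav_path(result[0]['name'])), which corresponds to the quoted call that succeeds above.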
