Is leveraging custom protocols to specify different data sources a bad idea? #1529
-
I'm working on a file storage library for work, and I'm trying to abstract out the underlying file storage layer a bit.
I got a proof of concept working with anonymous classes, which works with the protocol registry.
Replies: 2 comments 2 replies
-
I have absolutely no problem with you using the registry in this way. You might also be interested in the ReferenceFileSystem ("reference://"), which provides a similar redirection: every path maps to some other file, part of a file, or concrete data. We also have the prefixFS ("dir://"), which lets you add a prefix to every URL for an arbitrary backend. It sounds like you want something in between these two choices.
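To make the "in between" idea concrete, here is a minimal sketch, assuming a recent fsspec in which the "dir://" implementation is `DirFileSystem` (in `fsspec.implementations.dirfs`) and accepts `path` and `fs` keyword arguments. The `register_source` helper, the `reports` protocol name, and the `/warehouse/reports` root are hypothetical names chosen for illustration; the in-memory backend only stands in for real storage.

```python
import fsspec
from fsspec.implementations.dirfs import DirFileSystem
from fsspec.registry import register_implementation

# Stand-in backend: an in-memory filesystem holding one file.
backend = fsspec.filesystem("memory")
backend.pipe_file("/warehouse/reports/q1.csv", b"region,total\n")

# Plain prefixing: every path is resolved relative to /warehouse/reports.
prefixed = DirFileSystem(path="/warehouse/reports", fs=backend)
plain_result = prefixed.cat_file("q1.csv")


def register_source(protocol, root, fs):
    """Hypothetical helper: expose `root` on `fs` under its own protocol name."""

    class _Source(DirFileSystem):
        def __init__(self, **kwargs):
            # The root and backend are fixed when the class is created.
            super().__init__(path=root, fs=fs, **kwargs)

    _Source.protocol = protocol
    # clobber=True lets the sketch be re-run without a registration conflict.
    register_implementation(protocol, _Source, clobber=True)
    return _Source


register_source("reports", "/warehouse/reports", backend)

# The custom protocol now selects the data source; callers pass relative paths.
aliased_result = fsspec.filesystem("reports").cat_file("q1.csv")
```

With this arrangement the protocol name alone picks the backend and root, so calling code never needs to know where the files actually live. Whether a thin subclass like this is enough depends on how much path rewriting the library needs.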
-
I'm trying to do something similar. My solution works, but I was wondering how you achieved it?

    import fsspec


    class MadFileSystem(fsspec.AbstractFileSystem):
        protocol = "mad"

        def __init__(self, basepath: str, storage_options: dict | None = None, **kwargs):
            options = storage_options or kwargs
            # Resolve the real backend filesystem and the base path within it
            fs, fs_url = fsspec.core.url_to_fs(basepath, **options)
            self._fs: fsspec.AbstractFileSystem = fs
            self._fs_url: str = fs_url
            super().__init__(**options)

        # Each method below delegates to the wrapped filesystem after fixing the path

        def glob(self, path: str, **kwargs):
            return self._fs.glob(self.fix_path(path), **kwargs)

        def info(self, path: str, **kwargs):
            return self._fs.info(self.fix_path(path), **kwargs)

        def _open(self, path: str, **kwargs):
            return self._fs._open(self.fix_path(path), **kwargs)

        def rm(self, path: str, **kwargs):
            return self._fs.rm(self.fix_path(path), **kwargs)

        def mv(self, path1: str, path2: str, **kwargs):
            return self._fs.mv(self.fix_path(path1), self.fix_path(path2), **kwargs)

        def fix_path(self, path: str):
            # Remove protocol from the path if it exists
            if path.startswith(f"{self.protocol}://"):
                path = path[len(f"{self.protocol}://") :]
            # If path is incomplete, prefix the path with the fs_url
            if not path.startswith(self._fs_url):
                path = f"{self._fs_url.rstrip('/')}/{path}"
            return path
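One detail that may be worth checking in a wrapper like this is how `fix_path` decides that a path is already complete. A plain `startswith` test also matches sibling directories that merely share a prefix (for example `/data2/...` when the base is `/data`). Below is a standalone sketch of a stricter check using only string operations; `resolve_path` and its parameters are hypothetical names, not part of fsspec.

```python
def resolve_path(path: str, base: str, protocol: str = "mad") -> str:
    """Strip the custom protocol and make sure the result lives under `base`."""
    prefix = f"{protocol}://"
    if path.startswith(prefix):
        path = path[len(prefix):]

    base = base.rstrip("/")
    # Only treat the path as complete if it is the base itself or sits below it.
    if path == base or path.startswith(base + "/"):
        return path
    # Otherwise treat it as relative and anchor it under the base.
    return f"{base}/{path.lstrip('/')}"


relative = resolve_path("mad://a/b.txt", "/data")
already_complete = resolve_path("/data/a.txt", "/data")
sibling = resolve_path("/data2/a.txt", "/data")
```

Here the sibling path is re-anchored under the base instead of being passed through unchanged. Whether that is the desired behavior, or whether such paths should be rejected, is a design choice for the wrapper.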