-
Notifications
You must be signed in to change notification settings - Fork 187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/1730 extend filesystem sftp #1769
Conversation
✅ Deploy Preview for dlt-hub-docs ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
03393ce
to
5fd34be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
high quality PR! code and sftp specific tests LGTM. what we need to do is to integrate those tests with other filesystem tests. I think we won't need a separate github ci workflow. we'll also reuse a lot of existing general purpose tests that ie. test the glob funtionality
please try the following:
- in tests/.dlt/config.toml add bucket url to the test sftp server
- add sftp to ALL_FILESYSTEM_DRIVERS and WITH_GDRIVE_BUCKETS (tests/load/utils)
now all tests that test buckets directly will see sftp.
ie.
@pytest.mark.parametrize("write_disposition", ("replace", "append", "merge"))
@pytest.mark.parametrize("layout", TEST_FILE_LAYOUTS)
def test_successful_load(write_disposition: str, layout: str, with_gdrive_buckets_env: str) -> None:
there are plenty of tests that run filesystem as destination. those are also enabled by the above.
you can skip the whole module (I'm referring to test_filesystem_sftp). see how we skip inactive destinations here:
def skip_if_not_active(destination: str) -> None:
and do the same for filesystems.
when this is done you can merge your test_destination_sftp CI workflow with test_local_destinations.
just enable sftp via
ALL_FILESYSTEM_DRIVERS: "[\"memory\", \"file\"]"
start the sftp server and stuff should run
@rudolfix thanks for the review! setting up the sftp server was fun. I've addressed all the requested changes - could you take another look and let me know if I missed anything? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! thanks for the tests and docs. I added the extra for sftp
to follow our conventions (and also fsspec)
Description
Extend filesystem source and destination to work with sftp.
As a user, I want to be able to load data from and to sftp server. Probably it can be done already with fsspec
A few implementation hints. Look in
fsspec_filesystem
:MTIME_DISPATCH
to define a mapping (I hope sftp fsspec has it - upload time should be available)FilesystemConfiguration
on how we add credentials for particular filesystems (based on protocol)tests:
glob_files
is tested to startRelated Issues