From 4ba7fdc666a03f0d84d8010e514174991a3e2f62 Mon Sep 17 00:00:00 2001
From: DenChenn
Date: Mon, 21 Oct 2024 23:21:48 +0800
Subject: [PATCH] [Docs] add streaming support example for file and directory

Signed-off-by: DenChenn
---
 .../data_types_and_io/flytedirectory.md        | 17 +++++++++++++++++
 docs/user_guide/data_types_and_io/flytefile.md | 16 ++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/docs/user_guide/data_types_and_io/flytedirectory.md b/docs/user_guide/data_types_and_io/flytedirectory.md
index 121a7d9b677..de67080409e 100644
--- a/docs/user_guide/data_types_and_io/flytedirectory.md
+++ b/docs/user_guide/data_types_and_io/flytedirectory.md
@@ -86,4 +86,21 @@ You can run the workflow locally as follows:
 :lines: 94-114
 ```
 
+
+## Streaming support
+
+Flyte `1.5` introduced support for streaming `FlyteDirectory` types via the `fsspec` library.
+The `FlyteDirectory` streaming feature enables efficient streaming and handling of entire directories, simplifying operations involving multiple files.
+
+:::{note}
+This feature is marked as experimental. We'd love feedback on the API!
+:::
+
+Here is a simple example: you can accept a `FlyteDirectory` as an input, walk through it, and copy the files to another `FlyteDirectory` one by one.
+
+```{rli} https://raw.githubusercontent.com/DenChenn/flytesnacks/8dd7bf9708ff56d1fbee37f31763cba3277c102b/examples/data_types_and_io/data_types_and_io/file_streaming.py
+:caption: data_types_and_io/file_streaming.py
+:lines: 23-33
+```
+
 [flytesnacks]: https://github.com/flyteorg/flytesnacks/tree/master/examples/data_types_and_io/
diff --git a/docs/user_guide/data_types_and_io/flytefile.md b/docs/user_guide/data_types_and_io/flytefile.md
index e9c02e2132b..3b829edde41 100644
--- a/docs/user_guide/data_types_and_io/flytefile.md
+++ b/docs/user_guide/data_types_and_io/flytefile.md
@@ -90,4 +90,20 @@ You can enable type validation if you have the [python-magic](https://pypi.org/p
 Currently, type validation is only supported on the `Mac OS` and `Linux` platforms.
 :::
 
+## Streaming support
+
+Flyte `1.5` introduced support for streaming `FlyteFile` types via the `fsspec` library.
+This integration enables efficient, on-demand access to remote files, eliminating the need for fully downloading them to local storage.
+
+:::{note}
+This feature is marked as experimental. We'd love feedback on the API!
+:::
+
+Here is a simple example of removing some columns from a CSV file and writing the result to a new file:
+
+```{rli} https://raw.githubusercontent.com/DenChenn/flytesnacks/8dd7bf9708ff56d1fbee37f31763cba3277c102b/examples/data_types_and_io/data_types_and_io/file_streaming.py
+:caption: data_types_and_io/file_streaming.py
+:lines: 8-20
+```
+
 [flytesnacks]: https://github.com/flyteorg/flytesnacks/tree/master/examples/data_types_and_io/
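Note for readers of this patch: the `{rli}` directives pull the example code from an external file, so the patch itself does not show what the streaming pattern looks like. The sketch below illustrates the general idea behind the `FlyteFile` example (dropping CSV columns while reading and writing through open file handles, row by row). It is not the code in `file_streaming.py`. It uses only the Python standard library, and `drop_columns` and the sample data are hypothetical names chosen for illustration. In a Flyte task, the handles would presumably come from opening the input and output `FlyteFile` objects rather than from `io.StringIO`; that mapping is an assumption here.

```python
import csv
import io
from typing import Iterable, TextIO


def drop_columns(src: TextIO, dst: TextIO, columns: Iterable[str]) -> None:
    """Copy CSV rows from src to dst, omitting the named columns.

    Rows are processed one at a time, so the whole file never has to be
    held in memory or written to local disk first.
    """
    drop = set(columns)
    reader = csv.DictReader(src)
    # Keep the original column order, minus the dropped columns.
    kept = [name for name in (reader.fieldnames or []) if name not in drop]
    writer = csv.DictWriter(
        dst, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # keys not in `kept` are ignored


# Hypothetical sample data; in-memory streams stand in for remote file handles.
source = io.StringIO("name,city,age\nAda,London,36\nLin,Taipei,29\n")
target = io.StringIO()
drop_columns(source, target, columns=["city"])
result = target.getvalue()
```

Because the function only relies on the file-like interface, the same logic should work with any readable and writable text stream, which is the property the streaming feature depends on.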