Publish documentation #24

kaaveland · 2024-05-05T22:07:39Z

We should publish some bare-bones documentation on Read the Docs (rtd) that explains the basic philosophy and design:

Minimal amount of code that works
Speed advantage from catering directly to the pyarrow dataset API
"Bring your own DataLakeServiceClient": if you can create it, we can use it, which means we support every authentication method that azure-storage-file-datalake supports

Then we could add examples for common tasks, such as reading hive-partitioned datasets, and for "arrow advantages" such as the self_destruct and strings_to_categorical options of to_pandas.
