Mismatch in key names between fsspec and object_store breaks pyarrow usage with AWS S3 #21076
Open

Labels: bug, needs triage, python
Reproducible example
Log output
Issue description
When using `use_pyarrow=True` in `read_parquet`, reading from AWS S3 breaks because of a mismatch between the key names expected by object_store and those expected by s3fs. To be very specific, the mismatch is in the `s3fs.S3FileSystem` kwargs, which is what Polars expects to fill with the provided `storage_options` (1, 2). The primary offending keys are the `aws_*`-prefixed ones, which `s3fs` does not understand.
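For concreteness, a call shaped roughly like the following hits the mismatch (the bucket, object path, and credential values here are placeholders, not from my actual setup):

```python
import polars as pl

# object_store-style keys, as used by Polars' native S3 reader.
storage_options = {
    "aws_access_key_id": "<access-key-id>",
    "aws_secret_access_key": "<secret-access-key>",
    "aws_region": "us-east-1",
}

# With use_pyarrow=True these options are handed to s3fs.S3FileSystem,
# which does not recognise the aws_* names, and the read fails.
df = pl.read_parquet(
    "s3://my-bucket/data.parquet",
    use_pyarrow=True,
    storage_options=storage_options,
)
```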
A temporary workaround is to wrap the intended `storage_options` with `client_kwargs` like so:
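A minimal sketch of the workaround (bucket and credential values are placeholders; note the region is passed as `region_name` inside `client_kwargs`):

```python
import polars as pl

# Nesting the credentials under "client_kwargs" makes s3fs forward them to
# the aiobotocore client constructor, which accepts them, instead of passing
# them straight into aiobotocore.session.AioSession.
storage_options = {
    "client_kwargs": {
        "aws_access_key_id": "<access-key-id>",
        "aws_secret_access_key": "<secret-access-key>",
        "region_name": "us-east-1",
    }
}

df = pl.read_parquet(
    "s3://my-bucket/data.parquet",
    use_pyarrow=True,
    storage_options=storage_options,
)
```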
This will keep the intended `storage_options` (the `aws_*` keys) from being loaded as kwargs into `aiobotocore.session.AioSession` by `s3fs`.

A more permanent solution to regain the expected behavior could be pursued at (1), (2), like so:
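A rough sketch of what that translation could look like (the function and mapping names are illustrative, not actual Polars internals):

```python
# Map object_store-style keys onto the kwargs s3fs.S3FileSystem understands.
_OBJECT_STORE_TO_S3FS = {
    "aws_access_key_id": "key",
    "aws_secret_access_key": "secret",
    "aws_session_token": "token",
}


def _translate_s3_storage_options(storage_options: dict) -> dict:
    translated: dict = {}
    client_kwargs: dict = dict(storage_options.get("client_kwargs", {}))
    for name, value in storage_options.items():
        if name == "client_kwargs":
            continue
        if name in _OBJECT_STORE_TO_S3FS:
            translated[_OBJECT_STORE_TO_S3FS[name]] = value
        elif name == "aws_region":
            # s3fs takes the region via client_kwargs["region_name"].
            client_kwargs["region_name"] = value
        else:
            translated[name] = value
    if client_kwargs:
        translated["client_kwargs"] = client_kwargs
    return translated
```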
I don't know enough about fsspec to understand the broader impacts here beyond AWS S3. My gut tells me this could break other cloud backends, so there may need to be an AWS-specific check that instead does something like:
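For example (again only a sketch, reusing the hypothetical `_translate_s3_storage_options` helper from above):

```python
def _prepare_fsspec_storage_options(source: str, storage_options: dict) -> dict:
    # Only remap for S3 targets so fsspec backends for other clouds
    # (gcs://, abfs://, ...) keep receiving their options untouched.
    if source.startswith("s3://"):
        return _translate_s3_storage_options(storage_options)
    return storage_options
```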
Expected behavior
The reproducible example runs without issue and loads the parquet from AWS S3.
Installed versions