pl.scan_parquet('gs://...') breaks when GOOGLE_APPLICATION_CREDENTIALS env variable is set #12195

Closed
2 tasks done
Vincenthays opened this issue Nov 2, 2023 · 2 comments
Labels
bug Something isn't working python Related to Python Polars

Comments

Vincenthays commented Nov 2, 2023

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
pl.scan_parquet('gs://<bucket>/<file>.parquet', storage_options={'google_service_account': '/some/google-cloud-creds.json'}) # working

import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/some/google-cloud-creds.json'
pl.scan_parquet('gs://<bucket>/<file>.parquet') # not working (error log below)

pl.scan_parquet('gs://<bucket>/<file>.parquet', storage_options={'google_service_account': '/some/google-cloud-creds.json'}) # not working (same error as below)

Log output

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/polars/io/parquet/functions.py", line 268, in scan_parquet
    return pl.LazyFrame._scan_parquet(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/polars/lazyframe/frame.py", line 453, in _scan_parquet
    self._ldf = PyLazyFrame.new_from_parquet(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ComputeError: Generic GCS error: Unsupported ApplicationCredentials type: service_account
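
The error names the credentials type that was rejected. As a quick sanity check, here is a minimal sketch that reports which type the environment variable points at. It assumes the file is a standard Google credentials JSON with a top-level "type" field (for example "service_account" or "authorized_user"); the helper name credentials_type is hypothetical and not part of Polars.

```python
import json
import os


def credentials_type(path=None):
    """Return the "type" field of a Google credentials JSON file.

    Falls back to GOOGLE_APPLICATION_CREDENTIALS when no path is given,
    and returns None if no path can be determined.
    """
    path = path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return None
    with open(path, encoding="utf-8") as fh:
        # .get() returns None when the file has no "type" field.
        return json.load(fh).get("type")
```

Calling credentials_type() with the variable set as in the example above should report the same type that appears in the error message.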

Issue description

Since 0.19.6, pl.scan_parquet fails whenever the GOOGLE_APPLICATION_CREDENTIALS environment variable is set, even when credentials are also passed explicitly via storage_options.
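
A possible workaround on affected versions, sketched under an assumption: since the first call in the example succeeds when the variable is absent, temporarily removing it while passing credentials through storage_options may avoid the error. This has not been verified against every affected release. The names env_var_unset and read_gcs_parquet are hypothetical helpers, not Polars API.

```python
import contextlib
import os


@contextlib.contextmanager
def env_var_unset(name):
    """Temporarily remove an environment variable, restoring it on exit."""
    saved = os.environ.pop(name, None)
    try:
        yield
    finally:
        # Only restore the variable if it was set in the first place.
        if saved is not None:
            os.environ[name] = saved


def read_gcs_parquet(uri, creds_path):
    """Hypothetical helper: read a gs:// Parquet file using explicit credentials only."""
    import polars as pl  # imported lazily so env_var_unset is usable without Polars

    with env_var_unset("GOOGLE_APPLICATION_CREDENTIALS"):
        # Collect inside the block (assumption: the variable may be read
        # again when the lazy query actually executes).
        return pl.scan_parquet(
            uri, storage_options={"google_service_account": creds_path}
        ).collect()
```

Other libraries running in the same process that rely on the variable will not see it while the block is active, so keep the block as short as possible.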

Expected behavior

pl.scan_parquet should work when the GOOGLE_APPLICATION_CREDENTIALS environment variable is set.

Installed versions

--------Version info---------
Polars:              0.19.12
Index type:          UInt32
Platform:            Linux-5.4.0-104-generic-x86_64-with-glibc2.36
Python:              3.11.6 (main, Nov  1 2023, 13:45:43) [GCC 12.2.0]

----Optional dependencies----
adbc_driver_sqlite:  <not installed>
cloudpickle:         <not installed>
connectorx:          <not installed>
deltalake:           <not installed>
fsspec:              2023.10.0
gevent:              <not installed>
matplotlib:          <not installed>
numpy:               1.26.1
openpyxl:            <not installed>
pandas:              2.1.2
pyarrow:             14.0.0
pydantic:            <not installed>
pyiceberg:           <not installed>
pyxlsb:              <not installed>
sqlalchemy:          2.0.22
xlsx2csv:            <not installed>
xlsxwriter:          <not installed>
@Vincenthays Vincenthays added bug Something isn't working python Related to Python Polars labels Nov 2, 2023
tustvold commented Nov 7, 2023

This should have been fixed by the latest object_store release, in particular apache/arrow-rs#4926

Vincenthays commented Nov 8, 2023

I checked on main and it works fine. I think you are right: the new version of object_store fixes it.
