
Delta-rs not using latest object_store which fails the table optimize #2988

Closed
gprashmi opened this issue Nov 12, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@gprashmi

Environment

Delta-rs version: 0.21.0
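For anyone reproducing this, the installed Python package version can be confirmed with the standard library. This is only a minimal sketch: the helper name `installed_version` is made up for illustration, and it reports the `deltalake` wheel version only. The `object_store` crate is compiled into the wheel, so its version is most likely not visible this way and would have to be looked up in the corresponding delta-rs release.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a Python package, or None if it is absent.

    Hypothetical helper for illustration; not part of delta-rs.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the deltalake version if the package is installed, otherwise None.
print("deltalake:", installed_version("deltalake"))
```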


Bug

What happened: We write data to a Delta table using delta-rs with the PyArrow engine, with DayHour as the partition column. Our Day

deltalake.write_deltalake(
    table_or_uri=delta_table_path,
    data=df,
    partition_by=[dayhour_partition_column],
    schema_mode="overwrite",
    mode="append",
    storage_options={"AWS_S3_ALLOW_UNSAFE_RENAME": "true"},
)

I then ran the optimize command on the Delta table using the Spark SQL query below:

optimize_query = f"""
OPTIMIZE delta.`{s3_table_path}`
ZORDER BY (col1, col2)
"""
spark.sql(optimize_query)

After the optimize run, the partition URLs are not encoded properly. As shown in the image below, the newly created partition URLs (for the .zstd.parquet files) contain spaces.

image
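To make the encoding problem concrete, here is a minimal sketch of the difference between a raw and a percent-encoded partition directory name, using only Python's standard library. The `DayHour` value shown is hypothetical (the real value format is not given in this report), `partition_dir` is an illustrative helper, and this is not the exact encoding routine used by delta-rs, object_store, or Spark.

```python
from urllib.parse import quote, unquote


def partition_dir(column: str, value: str) -> str:
    """Build a Hive-style `column=value` directory name with the value percent-encoded.

    Illustrative helper only; not the delta-rs or Spark implementation.
    """
    # safe="" makes quote() encode every reserved character, including "/" and " ".
    return f"{column}={quote(value, safe='')}"


raw_value = "2024-11-12 05"  # hypothetical DayHour value containing a space
encoded = partition_dir("DayHour", raw_value)
unencoded = f"DayHour={raw_value}"

print(encoded)    # DayHour=2024-11-12%2005
print(unencoded)  # DayHour=2024-11-12 05

# Decoding the encoded value recovers the original partition value.
assert unquote(encoded.split("=", 1)[1]) == raw_value
```

If the writer and the optimizer disagree on which of these two forms to use, the same logical partition ends up under two different paths, which matches the symptom described above.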

Issue #2978 suggests that delta-rs needs to be updated to the recent object_store version 0.10.2 for this to be fixed.

@ion-elgreco @rtyler Could you please let me know if this is going to be fixed anytime soon?

gprashmi added the bug label on Nov 12, 2024
@gprashmi
Author

@thomasfrederikhoeck

@ion-elgreco
Collaborator

This is not a PR though

@gprashmi
Author

@ion-elgreco Would I need to create a pull request to make this change myself? Is this not planned as part of the next release?

@thomasfrederikhoeck
Contributor

I have created PR #2994

@thomasfrederikhoeck
Contributor

@gprashmi should be closed in #2994
