In working with Polars I've observed that the performance of certain functions is greatly affected by how you construct queries. For instance, `sink_parquet` works well if the preceding operations only involve joins, filters, and selects, but it hits memory errors if you create new variables, and it simply won't allow window functions. By contrast, `collect(streaming=True)` works and is more performant than the streaming sink when creating variables.
Another question is how to write Parquet datasets with Polars and perform incremental file builds that write in the correct format with a proper schema.