docs: Correct concat rechunk in user guide (#18080)
deanm0000 authored Aug 8, 2024
1 parent 2d8c661 commit 64c81ff
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/user-guide/transformations/concatenation.md
@@ -56,5 +56,5 @@ When the dataframe shapes do not match and we have an overlapping semantic key t

## Rechunking

-Before a concatenation we have two dataframes `df1` and `df2`. Each column in `df1` and `df2` is in one or more chunks in memory. By default, during concatenation the chunks in each column are copied to a single new chunk - this is known as **rechunking**. Rechunking is an expensive operation, but is often worth it because future operations will be faster.
-If you do not want Polars to rechunk the concatenated `DataFrame` you specify `rechunk = False` when doing the concatenation.
+Before a concatenation we have two dataframes `df1` and `df2`. Each column in `df1` and `df2` is in one or more chunks in memory. By default, during concatenation the chunks in each column are not made contiguous. This makes the concat operation faster and consume less memory but it may slow down future operations that would benefit from having the data be in contiguous memory. The process of copying the fragmented chunks into a single new chunk is known as **rechunking**. Rechunking is an expensive operation. Prior to version 0.20.26, the default was to perform a rechunk but in new versions, the default is not to.
+If you do want Polars to rechunk the concatenated `DataFrame` you specify `rechunk = True` when doing the concatenation.
