From 308a5b1f5b9aa8b42968a9c99817be418a1e41e4 Mon Sep 17 00:00:00 2001
From: Ville Puuska <40150442+VillePuuska@users.noreply.github.com>
Date: Sat, 21 Sep 2024 10:01:57 +0000
Subject: [PATCH] update docs usage/managing-tables section on optimizing tables

---
 docs/usage/managing-tables.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/docs/usage/managing-tables.md b/docs/usage/managing-tables.md
index 8a2f20580a..b3ab3540e1 100644
--- a/docs/usage/managing-tables.md
+++ b/docs/usage/managing-tables.md
@@ -26,4 +26,11 @@ Use `DeltaTable.vacuum` to perform the vacuum operation. Note that to prevent ac
 
 ## Optimizing tables
 
-Optimizing tables is not currently supported.
\ No newline at end of file
+Optimizing a table compacts small files into larger files to avoid the small file problem. This is especially important for tables that get small amounts of data appended to with high frequency. In addition to compacting small files, you can colocate similar data in the same files with Z Ordering, which allows for better file skipping and faster queries.
+
+A table `dt = DeltaTable(...)` has two methods for optimizing it:
+
+- `dt.optimize.compact()` for compacting small files,
+- `dt.optimize.z_order()` to compact and apply Z Ordering.
+
+See the section [Small file compaction](./optimize/small-file-compaction-with-optimize.md) for more information and a detailed example on `compact`, and the section [Z Order](./optimize/delta-lake-z-order.md) for more information on `z_order`.