diff --git a/README.md b/README.md
index 9aa98823c6..6121110fb0 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@
A native Rust library for Delta Lake, with bindings to Python
- Python docs
+ Python docs
·
Rust docs
·
@@ -48,7 +48,7 @@ API that lets you query, inspect, and operate your Delta Lake with ease.
[pypi]: https://pypi.org/project/deltalake/
[pypi-dl]: https://img.shields.io/pypi/dm/deltalake?style=flat-square&color=00ADD4
-[py-docs]: https://delta-io.github.io/delta-rs/python/
+[py-docs]: https://delta-io.github.io/delta-rs/
[rs-docs]: https://docs.rs/deltalake/latest/deltalake/
[crates]: https://crates.io/crates/deltalake
[crates-dl]: https://img.shields.io/crates/d/deltalake?color=F75101
diff --git a/docs/usage/appending-overwriting-delta-lake-table.md b/docs/usage/appending-overwriting-delta-lake-table.md
new file mode 100644
index 0000000000..0930d8da1e
--- /dev/null
+++ b/docs/usage/appending-overwriting-delta-lake-table.md
@@ -0,0 +1,78 @@
+# Appending to and overwriting a Delta Lake table
+
+This section explains how to append to an existing Delta table and how to overwrite a Delta table.
+
+## Delta Lake append transactions
+
+Suppose you have a Delta table with the following contents:
+
+```
++-------+----------+
+|   num | letter   |
+|-------+----------|
+|     1 | a        |
+|     2 | b        |
+|     3 | c        |
++-------+----------+
+```
+
+Append two additional rows of data to the table:
+
+```python
+import pandas as pd
+from deltalake import DeltaTable, write_deltalake
+
+df = pd.DataFrame({"num": [8, 9], "letter": ["dd", "ee"]})
+write_deltalake("tmp/some-table", df, mode="append")
+```
+
+Here are the updated contents of the Delta table:
+
+```
++-------+----------+
+|   num | letter   |
+|-------+----------|
+|     1 | a        |
+|     2 | b        |
+|     3 | c        |
+|     8 | dd       |
+|     9 | ee       |
++-------+----------+
+```
+
+Now let's see how to perform an overwrite transaction.
+
+## Delta Lake overwrite transactions
+
+Overwrite the existing Delta table with two new rows:
+
+```python
+df = pd.DataFrame({"num": [11, 22], "letter": ["aa", "bb"]})
+write_deltalake("tmp/some-table", df, mode="overwrite")
+```
+
+Here are the contents of the Delta table after the overwrite operation:
+
+```
++-------+----------+
+|   num | letter   |
+|-------+----------|
+|    11 | aa       |
+|    22 | bb       |
++-------+----------+
+```
+
+Overwriting performs only a logical delete: the previous data is not physically removed from storage. Time travel back to the previous version to confirm that the old version of the table is still accessible.
+
+```
+dt = DeltaTable("tmp/some-table", version=1)  # load the table as of version 1
+
++-------+----------+
+|   num | letter   |
+|-------+----------|
+|     1 | a        |
+|     2 | b        |
+|     3 | c        |
+|     8 | dd       |
+|     9 | ee       |
++-------+----------+
+```
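
Editorial note on the logical delete described in the overwrite section above: the following is a minimal sketch, assuming a simplified model of a transaction log in which each commit records the data files it adds and the data files it logically removes. `Commit`, `active_files`, and the file names are hypothetical and are not part of the `deltalake` API; the real log carries considerably more information. The sketch only illustrates why an overwrite leaves earlier versions readable.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """One entry in a toy transaction log (hypothetical, not the deltalake API)."""

    add: list = field(default_factory=list)     # data files added by this commit
    remove: list = field(default_factory=list)  # data files logically removed


def active_files(log, version):
    """Replay commits 0..version and return the files visible at that version."""
    files = set()
    for commit in log[: version + 1]:
        files -= set(commit.remove)
        files |= set(commit.add)
    return files


# Hypothetical log mirroring the steps in this page: create, append, overwrite.
log = [
    Commit(add=["part-0.parquet"]),  # version 0: create the table
    Commit(add=["part-1.parquet"]),  # version 1: append two rows
    Commit(  # version 2: overwrite marks the old files as removed
        add=["part-2.parquet"],
        remove=["part-0.parquet", "part-1.parquet"],
    ),
]

print(sorted(active_files(log, version=2)))  # only the overwritten data
print(sorted(active_files(log, version=1)))  # the pre-overwrite files are still listed
```

In this model nothing is deleted from the log or from storage when version 2 is committed, so replaying the log up to version 1 still resolves to the original files.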
diff --git a/docs/usage/create-delta-lake-table.md b/docs/usage/create-delta-lake-table.md
new file mode 100644
index 0000000000..3a2f023a47
--- /dev/null
+++ b/docs/usage/create-delta-lake-table.md
@@ -0,0 +1,25 @@
+# Creating a Delta Lake Table
+
+This section explains how to create a Delta Lake table.
+
+You can write a pandas DataFrame to a Delta table with `write_deltalake`:
+
+```python
+from deltalake import write_deltalake
+import pandas as pd
+
+df = pd.DataFrame({"num": [1, 2, 3], "letter": ["a", "b", "c"]})
+write_deltalake("tmp/some-table", df)
+```
+
+Here are the contents of the Delta table in storage:
+
+```
++-------+----------+
+|   num | letter   |
+|-------+----------|
+|     1 | a        |
+|     2 | b        |
+|     3 | c        |
++-------+----------+
+```
diff --git a/docs/usage/deleting-rows-from-delta-lake-table.md b/docs/usage/deleting-rows-from-delta-lake-table.md
new file mode 100644
index 0000000000..e1833c84b9
--- /dev/null
+++ b/docs/usage/deleting-rows-from-delta-lake-table.md
@@ -0,0 +1,34 @@
+# Deleting rows from a Delta Lake table
+
+This section explains how to delete rows from a Delta Lake table.
+
+Suppose you have the following Delta table with four rows:
+
+```
++-------+----------+
+|   num | letter   |
+|-------+----------|
+|     1 | a        |
+|     2 | b        |
+|     3 | c        |
+|     4 | d        |
++-------+----------+
+```
+
+Here's how to delete all the rows where `num` is greater than 2:
+
+```python
+from deltalake import DeltaTable
+
+dt = DeltaTable("tmp/my-table")
+dt.delete("num > 2")
+```
+
+Here are the contents of the Delta table after the delete operation:
+
+```
++-------+----------+
+|   num | letter   |
+|-------+----------|
+|     1 | a        |
+|     2 | b        |
++-------+----------+
+```
diff --git a/docs/usage/optimize/delta-lake-z-order.md b/docs/usage/optimize/delta-lake-z-order.md
new file mode 100644
index 0000000000..54be212c47
--- /dev/null
+++ b/docs/usage/optimize/delta-lake-z-order.md
@@ -0,0 +1,16 @@
+# Delta Lake Z Order
+
+This section explains how to Z Order a Delta table.
+
+Z Ordering colocates similar data in the same files, which allows for better file skipping and faster queries.
+
+Suppose you have a table with `first_name`, `age`, and `country` columns.
+
+If you Z Order the data by the `country` column, then individuals from the same country will be stored in the same files. When you subsequently query the data for individuals from a given country, the query will execute faster because files that contain no matching rows can be skipped.
+
+Here's how to Z Order a Delta table:
+
+```python
+from deltalake import DeltaTable
+
+dt = DeltaTable("tmp")
+dt.optimize.z_order(["country"])  # column names are passed as strings
+```
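
Editorial note on the file skipping claim in the Z Order page above: the following is a minimal sketch, assuming that a reader decides whether to open a file by comparing the filter value against per-file minimum and maximum values of the column. With a single column, the effect of Z Ordering is modeled here as plain sorting, which is a simplification. The helper names (`chunk`, `file_stats`, `files_to_read`) and the sample data are hypothetical and are not part of the `deltalake` API.

```python
def chunk(values, size):
    """Split a list of rows into consecutive 'files' of `size` rows each."""
    return [values[i : i + size] for i in range(0, len(values), size)]


def file_stats(files):
    """Return the (min, max) country value for each file."""
    return [(min(rows), max(rows)) for rows in files]


def files_to_read(stats, country):
    """Count files whose min/max range could contain the requested country."""
    return sum(1 for low, high in stats if low <= country <= high)


# Hypothetical country values for 12 rows, in arrival order.
countries = ["US", "BR", "IN", "BR", "US", "IN", "IN", "US", "BR", "US", "IN", "BR"]

unordered = chunk(countries, 4)          # files written in arrival order
clustered = chunk(sorted(countries), 4)  # files written after ordering by country

print(files_to_read(file_stats(unordered), "IN"))  # 3: every file may contain "IN"
print(files_to_read(file_stats(clustered), "IN"))  # 1: two of three files are skipped
```

In the unordered layout every file spans the range from "BR" to "US", so no file can be ruled out. After clustering, each file covers a single country and a query for one country touches one file.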
diff --git a/docs/usage/small-file-compaction-with-optimize.md b/docs/usage/optimize/small-file-compaction-with-optimize.md
similarity index 100%
rename from docs/usage/small-file-compaction-with-optimize.md
rename to docs/usage/optimize/small-file-compaction-with-optimize.md
diff --git a/mkdocs.yml b/mkdocs.yml
index 41f0ee309c..514872e5c8 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -19,12 +19,17 @@ nav:
- Usage:
- Installation: usage/installation.md
- Overview: usage/index.md
- - Loading a Delta Table: usage/loading-table.md
- - Examining a Delta Table: usage/examining-table.md
- - Querying a Delta Table: usage/querying-delta-tables.md
- - Managing a Delta Table: usage/managing-tables.md
- - Writing Delta Tables: usage/writing-delta-tables.md
- - Small file compaction: usage/small-file-compaction-with-optimize.md
+ - Creating a table: usage/create-delta-lake-table.md
+ - Loading a table: usage/loading-table.md
+ - Append/overwrite tables: usage/appending-overwriting-delta-lake-table.md
+ - Examining a table: usage/examining-table.md
+ - Querying a table: usage/querying-delta-tables.md
+ - Managing a table: usage/managing-tables.md
+ - Writing a table: usage/writing-delta-tables.md
+ - Deleting rows from a table: usage/deleting-rows-from-delta-lake-table.md
+ - Optimize:
+ - Small file compaction: usage/optimize/small-file-compaction-with-optimize.md
+ - Z Order: usage/optimize/delta-lake-z-order.md
- API Reference:
- api/delta_table.md
- api/schema.md