From 625c36ae1a75d280522a84cedae1824a873d4655 Mon Sep 17 00:00:00 2001
From: Enes Yesil <90360687+enesyesil@users.noreply.github.com>
Date: Sun, 2 Nov 2025 15:38:06 -0500
Subject: [PATCH 1/6] Docs: add missing pages (about, benchmarks, fileio, security) to site nav

---
 site/nav.yml | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/site/nav.yml b/site/nav.yml
index 008b094e755f..a470060192ea 100644
--- a/site/nav.yml
+++ b/site/nav.yml
@@ -64,6 +64,12 @@ nav:
 - Blogs: blogs.md
 - Talks: talks.md
 - Vendors: vendors.md
+
+ - About: about.md
+ - Benchmarks: benchmarks.md
+ - File I/O: fileio.md
+ - Security: security.md
+
 - Specification:
 - Terms: terms.md
 - REST Catalog Spec: rest-catalog-spec.md

From 1bcc65af1b2cf813759764fcb8f360319e38145d Mon Sep 17 00:00:00 2001
From: Enes Yesil
Date: Tue, 4 Nov 2025 16:51:48 -0500
Subject: [PATCH 2/6] update nav.yml

---
 site/nav.yml | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/site/nav.yml b/site/nav.yml
index dcbfd534092d..df8073e55cca 100644
--- a/site/nav.yml
+++ b/site/nav.yml
@@ -90,6 +90,8 @@ nav:
 - Project:
 - Contributing: contribute.md
 - Multi-engine support: multi-engine-support.md
+ - Benchmarks: benchmarks.md
+ - Security: security.md
 - How to release: how-to-release.md
 - ASF:
 - Sponsorship: https://www.apache.org/foundation/sponsorship.html
@@ -102,11 +104,6 @@ nav:
 - Community: community.md
 - Talks: talks.md
 - Vendors: vendors.md
-
- - About: about.md
- - Benchmarks: benchmarks.md
- - File I/O: fileio.md
- - Security: security.md
 - Specification:
 - Terms: terms.md

From 43aae64fd12874cb5204be7d3e658d20bbf602de Mon Sep 17 00:00:00 2001
From: Enes Yesil
Date: Tue, 4 Nov 2025 18:52:10 -0500
Subject: [PATCH 3/6] docs: add fileio link under api section

---
 docs/mkdocs.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 6b5099f0e0d1..93fef8d1306e 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -38,6 +38,7 @@ nav:
 - API:
 - Quickstart: java-api-quickstart.md
 - API: api.md
+ - File I/O: ../../fileio/
 - Javadoc: ../../javadoc/latest/
 - Integrations:
 - Apache Spark:

From 10f79dfe9affc192e32aad16fbc5eab9247a7626 Mon Sep 17 00:00:00 2001
From: Enes Yesil
Date: Thu, 6 Nov 2025 23:11:31 -0500
Subject: [PATCH 4/6] docs: add File I/O page to docs and verify sidebar

---
 docs/docs/fileio.md | 40 ++++++++++++++++++++++++++++++++++++++++
 docs/mkdocs.yml | 2 +-
 2 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 docs/docs/fileio.md

diff --git a/docs/docs/fileio.md b/docs/docs/fileio.md
new file mode 100644
index 000000000000..0062219a49bd
--- /dev/null
+++ b/docs/docs/fileio.md
@@ -0,0 +1,40 @@
+---
+title: "FileIO"
+---
+
+
+# Iceberg FileIO
+
+## Overview
+
+Iceberg comes with a flexible abstraction around reading and writing data and metadata files. The FileIO interface allows the Iceberg library to communicate with the underlying storage layer. FileIO is used for all metadata operations during the job planning and commit stages.
+
+## Iceberg Files
+
+The metadata for an Iceberg table tracks the absolute path for data files which allows greater abstraction over the physical layout. Additionally, changes to table state are performed by writing new metadata files and never involve renaming files. This allows a much smaller set of requirements for file operations. The essential functionality for a FileIO implementation is that it can read files, write files, and seek to any position within a stream.
+
+## Usage in Processing Engines
+
+The responsibility of reading and writing data files lies with the processing engines and happens during task execution. However, after data files are written, processing engines use FileIO to write new Iceberg metadata files that capture the new state of the table.
+
+Different FileIO implementations are used depending on the type of storage. Iceberg comes with a set of FileIO implementations for popular storage providers.
+- Amazon S3
+- Google Cloud Storage
+- Object Service Storage (including https)
+- Dell Enterprise Cloud Storage
+- Hadoop (adapts any Hadoop FileSystem implementation)
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index c8a7afdd76ad..13b187909208 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -38,7 +38,7 @@ nav:
 - API:
 - Quickstart: java-api-quickstart.md
 - API: api.md
- - File I/O: ../../fileio/
+ - File I/O: fileio.md
 - Javadoc: ../../javadoc/latest/
 - Integrations:
 - Apache Spark:

From 3b5a79e25ed5478f723123152e7f85d66e008376 Mon Sep 17 00:00:00 2001
From: Kevin Liu
Date: Mon, 10 Nov 2025 08:58:55 -0800
Subject: [PATCH 5/6] mv fileio

---
 site/docs/fileio.md | 40 ----------------------------------------
 1 file changed, 40 deletions(-)
 delete mode 100644 site/docs/fileio.md

diff --git a/site/docs/fileio.md b/site/docs/fileio.md
deleted file mode 100644
index 0062219a49bd..000000000000
--- a/site/docs/fileio.md
+++ /dev/null
@@ -1,40 +0,0 @@
----
-title: "FileIO"
----
-
-
-# Iceberg FileIO
-
-## Overview
-
-Iceberg comes with a flexible abstraction around reading and writing data and metadata files. The FileIO interface allows the Iceberg library to communicate with the underlying storage layer. FileIO is used for all metadata operations during the job planning and commit stages.
-
-## Iceberg Files
-
-The metadata for an Iceberg table tracks the absolute path for data files which allows greater abstraction over the physical layout. Additionally, changes to table state are performed by writing new metadata files and never involve renaming files. This allows a much smaller set of requirements for file operations. The essential functionality for a FileIO implementation is that it can read files, write files, and seek to any position within a stream.
-
-## Usage in Processing Engines
-
-The responsibility of reading and writing data files lies with the processing engines and happens during task execution. However, after data files are written, processing engines use FileIO to write new Iceberg metadata files that capture the new state of the table.
-
-Different FileIO implementations are used depending on the type of storage. Iceberg comes with a set of FileIO implementations for popular storage providers.
-- Amazon S3
-- Google Cloud Storage
-- Object Service Storage (including https)
-- Dell Enterprise Cloud Storage
-- Hadoop (adapts any Hadoop FileSystem implementation)

From b503c7eff6c2f50c2d629a6959051f886d20b210 Mon Sep 17 00:00:00 2001
From: Kevin Liu
Date: Mon, 10 Nov 2025 08:59:07 -0800
Subject: [PATCH 6/6] also add to mkdocs-dev for local render

---
 site/mkdocs-dev.yml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/site/mkdocs-dev.yml b/site/mkdocs-dev.yml
index 5466094de4d9..ce02dd99357a 100644
--- a/site/mkdocs-dev.yml
+++ b/site/mkdocs-dev.yml
@@ -76,6 +76,8 @@ nav:
 - Project:
 - Contributing: contribute.md
 - Multi-engine support: multi-engine-support.md
+ - Benchmarks: benchmarks.md
+ - Security: security.md
 - How to release: how-to-release.md
 - ASF:
 - Sponsorship: https://www.apache.org/foundation/sponsorship.html