Small improvements

mapleFU · May 23, 2024 · 2b003d6
1 parent bb9ca66
Showing 1 changed file with 16 additions and 17 deletions.
diff --git a/docs/source/cpp/parquet.rst b/docs/source/cpp/parquet.rst
@@ -522,17 +522,16 @@ An Arrow Dictionary type is written out as its value type.  It can still
 be recreated at read time using Parquet metadata (see "Roundtripping Arrow
 types" below).
 
-Roundtripping Arrow types
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Roundtripping Arrow types and schema
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 While there is no bijection between Arrow types and Parquet types, it is
 possible to serialize the Arrow schema as part of the Parquet file metadata.
 This is enabled using :func:`ArrowWriterProperties::store_schema`.
 
 On the read path, the serialized schema will be automatically recognized
 and will recreate the original Arrow data, converting the Parquet data as
-required (for example, a LargeList will be recreated from the Parquet LIST
-type).
+required.
 
 As an example, when serializing an Arrow LargeList to Parquet:
 
@@ -542,6 +541,19 @@ As an example, when serializing an Arrow LargeList to Parquet:
   :func:`ArrowWriterProperties::store_schema` was enabled when writing the file;
   otherwise, it is decoded as an Arrow List.
 
+Parquet field id
+""""""""""""""""
+
+The Parquet format supports an optional integer *field id* which can be assigned
+to a given field. This is used for example in the
+`Apache Iceberg specification <https://github.com/apache/iceberg/blob/main/format/spec.md#column-projection>`__.
+
+On the writer side, if ``PARQUET:field_id`` is present as a metadata key on an
+Arrow field, then its value is parsed as a non-negative integer and is used as
+the field id for the corresponding Parquet field.
+
+On the reader side, Arrow will convert such a field id to a metadata key named
+``PARQUET:field_id`` on the corresponding Arrow field.
 
 Serialization details
 """""""""""""""""""""
@@ -550,19 +562,6 @@ The Arrow schema is serialized as a :ref:`Arrow IPC <format-ipc>` schema message
 then base64-encoded and stored under the ``ARROW:schema`` metadata key in
 the Parquet file metadata.
 
-Field Id
-""""""""
-
-The Parquet format supports an optional integer "field id" which can be assigned
-to a field. This is used for example in the
-`Apache Iceberg specification <https://github.com/apache/iceberg/blob/main/format/spec.md#column-projection>`__.
-
-On the writer side, If ``PARQUET:field_id`` is present as a metadata key on an Arrow field,
-and the corresponding value is a non-negative integer, then it will be used as
-the "field id" in the Parquet file.
-
-On the reader side, Arrow will convert these "field id"s to a metadata key named
-``PARQUET:field_id`` on the corresponding Arrow field.
 
 Limitations
 ~~~~~~~~~~~
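
The writer-side rule described in the "Parquet field id" paragraphs of this diff can be modeled with a small standalone C++17 sketch. It deliberately does not use Arrow: ``FieldMetadata`` and ``FieldIdFromMetadata`` are hypothetical names introduced here for illustration, and treating a missing, unparsable, or negative value as "no field id" is an assumption, since the text does not say what happens in those cases.

```cpp
#include <cassert>
#include <charconv>
#include <map>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical stand-in for an Arrow field's key-value metadata.
using FieldMetadata = std::map<std::string, std::string>;

// Hypothetical helper (not an Arrow API): returns the field id when the
// "PARQUET:field_id" key is present and its whole value parses as a
// non-negative integer. Returns std::nullopt otherwise (an assumption).
std::optional<int> FieldIdFromMetadata(const FieldMetadata& metadata) {
  auto it = metadata.find("PARQUET:field_id");
  if (it == metadata.end()) {
    return std::nullopt;  // key absent: no field id to assign
  }
  const std::string& text = it->second;
  const char* begin = text.data();
  const char* end = text.data() + text.size();
  int value = 0;
  auto [ptr, ec] = std::from_chars(begin, end, value);
  // Reject parse errors, trailing characters, and negative values.
  if (ec != std::errc() || ptr != end || value < 0) {
    return std::nullopt;
  }
  return value;
}
```

With this model, metadata containing ``PARQUET:field_id`` mapped to ``"7"`` yields field id 7, while values such as ``"-1"`` or ``"7x"`` yield no field id.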