apache · pitrou · Mar 18, 2024 · Jan 18, 2024 · Jan 29, 2024 · wgtmac
diff --git a/CHANGES.md b/CHANGES.md
@@ -19,6 +19,12 @@
 
 # Parquet #
 
+### Version 2.11.0 ###
+
+#### New Feature
+
+*   [PARQUET-2414](https://issues.apache.org/jira/browse/PARQUET-2414) - Extend BYTE_STREAM_SPLIT to support INT32, INT64 and FIXED_LEN_BYTE_ARRAY data
+
 ### Version 2.10.0 ###
 
 #### New Feature

diff --git a/Encodings.md b/Encodings.md
@@ -335,14 +335,15 @@ Note that, even for FIXED_LEN_BYTE_ARRAY, all lengths are encoded despite the re
 
 ### Byte Stream Split: (BYTE_STREAM_SPLIT = 9)
 
-Supported Types: FLOAT, DOUBLE
+Supported Types: FLOAT, DOUBLE, INT32, INT64, FIXED_LEN_BYTE_ARRAY
 
 This encoding does not reduce the size of the data but can lead to a significantly better
 compression ratio and speed when a compression algorithm is used afterwards.
 
 This encoding creates K byte-streams of length N where K is the size in bytes of the data
-type and N is the number of elements in the data sequence. Specifically, K is 4 for FLOAT
+type and N is the number of elements in the data sequence. For example, K is 4 for FLOAT
 type and 8 for DOUBLE type.
+
 The bytes of each value are scattered to the corresponding streams. The 0-th byte goes to the
 0-th stream, the 1-st byte goes to the 1-st stream and so on.
 The streams are concatenated in the following order: 0-th stream, 1-st stream, etc.

diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
@@ -526,12 +526,15 @@ enum Encoding {
    */
   RLE_DICTIONARY = 8;
 
-  /** Encoding for floating-point data.
+  /** Encoding for fixed-width data (FLOAT, DOUBLE, INT32, INT64, FIXED_LEN_BYTE_ARRAY).
       K byte-streams are created where K is the size in bytes of the data type.
-      The individual bytes of an FP value are scattered to the corresponding stream and
+      The individual bytes of a value are scattered to the corresponding stream and
       the streams are concatenated.
       This itself does not reduce the size of the data but can lead to better compression
       afterwards.
+
+      Added in 2.8 for FLOAT and DOUBLE.
+      Support for INT32, INT64 and FIXED_LEN_BYTE_ARRAY added in 2.11.
    */
   BYTE_STREAM_SPLIT = 9;
 }
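
The scatter-and-concatenate layout described in the Encodings.md hunk above can be sketched as follows. This is a minimal illustration, not code from this PR or from any Parquet implementation: the function names `byte_stream_split_encode` and `byte_stream_split_decode` are hypothetical, the input is assumed to be the N values already laid out back to back as K bytes each (little-endian for INT32, as in PLAIN encoding), and K for FIXED_LEN_BYTE_ARRAY is assumed to be the column's declared type length.

```python
import struct


def byte_stream_split_encode(values: bytes, k: int) -> bytes:
    """Scatter byte j of every K-byte value into stream j, then concatenate the streams."""
    if k <= 0 or len(values) % k != 0:
        raise ValueError("input length must be a multiple of the value width K")
    # values[j::k] picks byte j of each value, i.e. the j-th stream of length N.
    return b"".join(values[j::k] for j in range(k))


def byte_stream_split_decode(encoded: bytes, k: int) -> bytes:
    """Inverse of the encode step: interleave the K streams back into N values."""
    if k <= 0 or len(encoded) % k != 0:
        raise ValueError("input length must be a multiple of the value width K")
    n = len(encoded) // k  # number of values, which is also the length of each stream
    out = bytearray(len(encoded))
    for j in range(k):
        # Stream j occupies bytes [j*n, (j+1)*n) of the encoded buffer.
        out[j::k] = encoded[j * n:(j + 1) * n]
    return bytes(out)


# Three INT32 values (K = 4), packed little-endian.
plain = struct.pack("<3i", 1, 2, 3)
encoded = byte_stream_split_encode(plain, 4)
```

With small integers like these, the low-order bytes land in the 0-th stream and the remaining streams are runs of zeros, which is the kind of regularity a compressor applied afterwards can exploit. The encoded buffer has the same length as the input, consistent with the note that the encoding itself does not reduce the size of the data.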