add RLE as encoding type
mapleFU committed Oct 9, 2023
1 parent d38e083 commit ecec908
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions python/pyarrow/parquet/core.py
@@ -771,7 +771,8 @@ def _sanitize_table(table, new_schema, flavor):
 Specify if we should use dictionary encoding in general or only for
 some columns.
 When encoding the column, if the dictionary size is too large, the
-column will fall back to 'PLAIN' encoding.
+column will fall back to the fallback encoding. Note that the
+``BOOLEAN`` type doesn't support dictionary encoding.
 compression : str or dict, default 'snappy'
 Specify the compression codec, either on a general basis or per-column.
 Valid values: {'NONE', 'SNAPPY', 'GZIP', 'BROTLI', 'LZ4', 'ZSTD'}.
@@ -823,10 +824,11 @@ def _sanitize_table(table, new_schema, flavor):
 and should be combined with a compression codec.
 column_encoding : string or dict, default None
 Specify the encoding scheme on a per column basis.
-Can only be used when ``use_dictionary`` is set to False, and cannot
-be used in combination with ``use_byte_stream_split``.
+Can only be used when ``use_dictionary`` is set to False, and
+cannot be used in combination with ``use_byte_stream_split``.
 Currently supported values: {'PLAIN', 'BYTE_STREAM_SPLIT',
-'DELTA_BINARY_PACKED', 'DELTA_LENGTH_BYTE_ARRAY', 'DELTA_BYTE_ARRAY'}.
+'DELTA_BINARY_PACKED', 'DELTA_LENGTH_BYTE_ARRAY', 'DELTA_BYTE_ARRAY',
+'RLE'}.
 Certain encodings are only compatible with certain data types.
 Please refer to the encodings section of `Reading and writing Parquet
 files <https://arrow.apache.org/docs/cpp/parquet.html#encodings>`_.
