Commit

Address feedback
gene-db committed Feb 7, 2025
1 parent bb5b69a commit 0567172
Showing 2 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion protocol_rfcs/README.md
@@ -23,7 +23,7 @@ Here is the history of all the RFCs propose/accepted/rejected since Feb 6, 2024,
| 2023-02-26 | [column-mapping-usage.tracking.md](https://github.com/delta-io/delta/blob/master/protocol_rfcs/column-mapping-usage-tracking.md) | https://github.com/delta-io/delta/issues/2682 | Column Mapping Usage Tracking |
| 2023-04-24 | [variant-type.md](https://github.com/delta-io/delta/blob/master/protocol_rfcs/variant-type.md) | https://github.com/delta-io/delta/issues/2864 | Variant Data Type |
| 2024-04-30 | [collated-string-type.md](https://github.com/delta-io/delta/blob/master/protocol_rfcs/collated-string-type.md) | https://github.com/delta-io/delta/issues/2894 | Collated String Type |
| 2025-01-09 | [variant-shredding.md](https://github.com/delta-io/delta/blob/master/protocol_rfcs/variant-shredding.md) | https://github.com/delta-io/delta/issues/4032 | Variant Shredding |
| 2025-02-07 | [variant-shredding.md](https://github.com/delta-io/delta/blob/master/protocol_rfcs/variant-shredding.md) | https://github.com/delta-io/delta/issues/4032 | Variant Shredding |

### Accepted RFCs

6 changes: 3 additions & 3 deletions protocol_rfcs/variant-shredding.md
@@ -13,7 +13,7 @@ Shredding allows Variant data to be be more efficiently stored and queried.
This feature enables support for shredding of the Variant data type, to store and query Variant data more efficiently.
Shredding a Variant value is taking paths from the Variant value, and storing them as a typed column in the file.
The shredding does not duplicate data, so if a value is stored in the typed column, it is removed from the Variant binary.
Storing Variant values as typed columns is faster to access, and enables skipping with statistics.
Storing Variant values as typed columns is faster to access, and enables data skipping with statistics.

The `variantShredding` feature depends on the `variantType` feature.
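
The "no duplication" rule described above can be illustrated with a small conceptual sketch. It models a Variant object as a plain Python dict and is not the Parquet binary encoding; the function name `shred_field` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import Any, Optional, Tuple


def shred_field(record: dict, field: str, expected_type: type) -> Tuple[Optional[Any], dict]:
    """Split one top-level field out of a dict that stands in for a Variant object.

    Returns (typed_value, residual). The residual plays the role of the
    remaining Variant binary in this simplified model.
    """
    value = record.get(field)
    # bool is a subclass of int in Python, so reject it for non-bool columns.
    type_matches = isinstance(value, expected_type) and not (
        isinstance(value, bool) and expected_type is not bool
    )
    if field in record and type_matches:
        # The value moves to the typed column and is removed from the residual,
        # so it is stored exactly once.
        residual = {k: v for k, v in record.items() if k != field}
        return value, residual
    # Missing field or type mismatch: nothing is shredded in this sketch,
    # and the residual keeps the full record.
    return None, dict(record)


typed, residual = shred_field({"id": 7, "name": "a"}, "id", int)
```

In this model a row whose `id` is not an integer simply stays in the residual, which is one way to picture why the typed column and the binary value can coexist in the same file.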

@@ -31,7 +31,7 @@ Struct field name | Parquet primitive type | Description
-|-|-
metadata | binary | (required) The binary-encoded Variant metadata, as described in [Parquet Variant binary encoding](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md)
value | binary | (optional) The binary-encoded Variant value, as described in [Parquet Variant binary encoding](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md)
typed_value | * | (optional) This can be any Parquet type, representing the data stored in the Variant. Details of the shredding scheme is found in the [Parquet Variant binary encoding](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md)
typed_value | * | (optional) This can be any Parquet type, representing the data stored in the Variant. Details of the shredding scheme is found in the [Parquet Variant binary encoding](https://github.com/apache/parquet-format/blob/master/VariantShredding.md)

## Writer Requirements for Variant Shredding

@@ -42,4 +42,4 @@ When Variant Shredding is supported (`writerFeatures` field of a table's `protoc

When Variant type is supported (`readerFeatures` field of a table's `protocol` action contains `variantShredding`), readers:
- must recognize and tolerate a `variant` data type in a Delta schema
- must tolerate a parquet schema that is either unshredded (only `metadata` and `value` struct fields) or shredded (`metadata`, `value`, and `typed_value` struct fields) when reading a Variant data type from file.
- must recognize and correctly process a parquet schema that is either unshredded (only `metadata` and `value` struct fields) or shredded (`metadata`, `value`, and `typed_value` struct fields) when reading a Variant data type from file.
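
The reader requirements above can be sketched as two small checks. This is a minimal illustration, assuming the `protocol` action is available as a dict and the Variant struct's Parquet field names as a list of strings; `supports_variant_shredding` and `classify_variant_layout` are hypothetical helper names, not part of any Delta library.

```python
from typing import Iterable


def supports_variant_shredding(protocol: dict) -> bool:
    """Check whether `readerFeatures` of a protocol action lists `variantShredding`."""
    return "variantShredding" in (protocol.get("readerFeatures") or [])


def classify_variant_layout(field_names: Iterable[str]) -> str:
    """Classify a Variant struct by the two layouts named in the reader requirements."""
    names = set(field_names)
    if names == {"metadata", "value"}:
        return "unshredded"
    if names == {"metadata", "value", "typed_value"}:
        return "shredded"
    # Any other combination is outside the two layouts this sketch handles.
    raise ValueError(f"unexpected Variant struct fields: {sorted(names)}")


protocol = {"readerFeatures": ["variantType", "variantShredding"]}
layout = classify_variant_layout(["metadata", "value", "typed_value"])
```

A real reader would inspect the actual Parquet schema rather than a list of names; the sketch only shows the branching a reader needs between the unshredded and shredded forms.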
