diff --git a/docs/source/delta-batch.md b/docs/source/delta-batch.md
index 4e3bf03cdf..53250e4f47 100644
--- a/docs/source/delta-batch.md
+++ b/docs/source/delta-batch.md
@@ -740,6 +740,37 @@ Each time a checkpoint is written, Delta automatically cleans up log entries old
 
 .. note:: Due to log entry cleanup, instances can arise where you cannot time travel to a version that is less than the retention interval. Delta Lake requires all consecutive log entries since the previous checkpoint to time travel to a particular version. For example, with a table initially consisting of log entries for versions [0, 19] and a checkpoint at version 10, if the log entry for version 0 is cleaned up, then you cannot time travel to versions [1, 9]. Increasing the table property `delta.logRetentionDuration` can help avoid these situations.
 
+### In-Commit Timestamps
+
+#### Overview
+Delta Lake 3.3 introduced [In-Commit Timestamps](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#in-commit-timestamps) to provide a more reliable and consistent way to track table modification timestamps. These modification timestamps are needed for various use cases, e.g., time travel to a specific time in the past. This feature addresses limitations of the traditional approach, which relies on file modification timestamps, particularly in scenarios involving data migration or replication.
+
+#### Feature Details
+In-Commit Timestamps stores modification timestamps within the commit itself, ensuring they remain unchanged regardless of file system operations. This provides several benefits:
+
+- **Immutable History**: Timestamps become part of the table's permanent commit history
+- **Consistent Time Travel**: Queries using timestamp-based time travel produce reliable results even after table migration
+
+Without the In-Commit Timestamps feature, Delta Lake uses file modification timestamps as the commit timestamp. This approach has several limitations:
+
+1. Data Migration Issues: When tables are moved between storage locations, file modification timestamps change, potentially disrupting historical tracking
+2. Replication Scenarios: Timestamp inconsistencies can arise when replicating data across different environments
+3. Time Travel Reliability: These timestamp changes can affect the accuracy and consistency of time travel queries
+
+#### Enabling the Feature
+This is a [writer table feature](versioning.md#what-are-table-features) and can be enabled by setting the table property `delta.enableInCommitTimestamps` to `true`:
+
+```sql
+ALTER TABLE table_name
+SET TBLPROPERTIES ('delta.enableInCommitTimestamps' = 'true');
+```
+
+After enabling In-Commit Timestamps:
+- Only new write operations will include the embedded timestamps
+- File modification timestamps will continue to be used for historical commits performed before enablement
+
+See the [Versioning](./versioning) section for more details on compatibility.
+
 ## Write to a table
diff --git a/docs/source/delta-drop-feature.md b/docs/source/delta-drop-feature.md
index 61081e01eb..ac81a59446 100644
--- a/docs/source/delta-drop-feature.md
+++ b/docs/source/delta-drop-feature.md
@@ -30,7 +30,7 @@ You can drop the following Delta table features:
 - `columnMapping`. See [_](delta-column-mapping.md). Drop support for column mapping is available in 3.3.0 and above.
 - `vacuumProtocolCheck`. See [Vacuum Protocol Check Spec](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#vacuum-protocol-check). Drop support for vacuum protocol check is available in 3.3.0 and above.
 - `checkConstraints`. See [_](delta-constraints.md). Drop support for check constraints is available in 3.3.0 and above.
-- `inCommitTimestamp`. See [_](delta-batch.md#in-tommit-timestamps). Drop support for In-Commit Timestamp is available in 3.3.0 and above.
+- `inCommitTimestamp`. See [_](delta-batch.md#in-commit-timestamps). In-Commit Timestamp is available in 3.3.0 and above.
 
 You cannot drop other [Delta table features](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#valid-feature-names-in-table-features).
diff --git a/docs/source/table-properties.md b/docs/source/table-properties.md
index 01a58269d9..318173d3cd 100644
--- a/docs/source/table-properties.md
+++ b/docs/source/table-properties.md
@@ -169,6 +169,17 @@ properties are set. Available Delta table properties include:
 |                                                                                           |
 | Default: `classic`                                                                        |
 +-------------------------------------------------------------------------------------------+
+| `delta.enableInCommitTimestamps`                                                          |
+|                                                                                           |
+| `true` for enabling the InCommitTimestamps table feature.                                 |
+|                                                                                           |
+|                                                                                           |
+| See [_](delta-batch.md#in-commit-timestamps).                                             |
+|                                                                                           |
+| Data type: `Boolean`                                                                      |
+|                                                                                           |
+| Default: `false`                                                                          |
++-------------------------------------------------------------------------------------------+
 
 .. replace:: Delta Lake
 .. replace:: Apache Spark
\ No newline at end of file
diff --git a/docs/source/versioning.md b/docs/source/versioning.md
index 2135a6d5b0..bf6df741a0 100644
--- a/docs/source/versioning.md
+++ b/docs/source/versioning.md
@@ -29,6 +29,7 @@ The following features break forward compatibility. Features are enabled
  Row Tracking, [Delta Lake 3.2.0](https://github.com/delta-io/delta/releases/tag/v3.2.0),[_](/delta-row-tracking.md)
  Type widening (Preview),[Delta Lake 3.2.0](https://github.com/delta-io/delta/releases/tag/v3.2.0),[_](/delta-type-widening.md)
  Identity columns, [Delta Lake 3.3.0](https://github.com/delta-io/delta/releases/tag/v3.3.0),[_](/delta-batch.md#use-identity-columns)
+ In-Commit Timestamps, [Delta Lake 3.3.0](https://github.com/delta-io/delta/releases/tag/v3.3.0),[_](/delta-batch.md#in-commit-timestamps)
@@ -113,6 +114,7 @@ The following table shows minimum protocol versions required for feature
  Vacuum Protocol Check,7,3,[Vacuum Protocol Check Spec](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#vacuum-protocol-check)
  Row Tracking,7,3,[_](/delta-row-tracking.md)
  Type widening (Preview),7,3,[_](/delta-type-widening.md)
+ In-Commit Timestamps,7,3,[In-Commit Timestamps Spec](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#in-commit-timestamps)
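
The difference between file modification timestamps and In-Commit Timestamps described in the `delta-batch.md` change can be illustrated with a small, self-contained sketch. This is a toy model, not Delta Lake code: the helper names (`write_commit`, `commit_timestamp_ms`, `migrate_and_compare`) are hypothetical, and the commit files only loosely imitate a Delta log by storing an `inCommitTimestamp` field inside a `commitInfo` action. It copies a two-commit log to a new directory, where commit 0 has no embedded timestamp and commit 1 does, and compares the timestamps resolved before and after the copy.

```python
import json
import os
import shutil
import tempfile


def write_commit(log_dir, version, in_commit_ts_ms=None):
    """Write a toy commit file; optionally embed an in-commit timestamp (ms)."""
    commit_info = {"operation": "WRITE"}
    if in_commit_ts_ms is not None:
        # Assumption: the timestamp is stored inside the commitInfo action.
        commit_info["inCommitTimestamp"] = in_commit_ts_ms
    path = os.path.join(log_dir, f"{version:020d}.json")
    with open(path, "w") as f:
        f.write(json.dumps({"commitInfo": commit_info}) + "\n")
    return path


def commit_timestamp_ms(path):
    """Prefer the embedded timestamp; fall back to the file modification time."""
    with open(path) as f:
        first_action = json.loads(f.readline())
    embedded = first_action.get("commitInfo", {}).get("inCommitTimestamp")
    if embedded is not None:
        return embedded
    return int(os.path.getmtime(path) * 1000)


def migrate_and_compare():
    """Copy a toy log to a new location and return (before, after) timestamps."""
    old_ms = 1_600_000_000_000  # pretend commit 0 was written at this time
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        legacy = write_commit(src, 0)  # written before the feature was enabled
        os.utime(legacy, (old_ms / 1000, old_ms / 1000))
        write_commit(src, 1, in_commit_ts_ms=1_700_000_000_000)

        names = sorted(os.listdir(src))
        before = [commit_timestamp_ms(os.path.join(src, n)) for n in names]
        for n in names:
            # shutil.copy copies contents but not the modification time,
            # which mimics a migration that rewrites the files.
            shutil.copy(os.path.join(src, n), os.path.join(dst, n))
        after = [commit_timestamp_ms(os.path.join(dst, n)) for n in names]
    return before, after


if __name__ == "__main__":
    before, after = migrate_and_compare()
    print("before migration:", before)
    print("after migration: ", after)
```

In this sketch the timestamp of commit 1 is identical before and after the copy, while the timestamp of commit 0 moves to the time of the copy. That is the migration problem the new section describes for commits made before enablement.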