[DOC][3.3] Doc changes for InCommitTimestamps #3979
docs/source/delta-batch.md
Outdated
### In-Commit Timestamps

#### Overview
<Delta> 3.3 introduced [In-Commit Timestamps](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#in-commit-timestamps) to provide a more reliable and consistent way to track table modifications. This feature addresses limitations of the traditional approach that relied on file modification timestamps, particularly in scenarios involving data migration or replication.
Do we add delta-spark versions in our documentation?
Looks like it has been done before. From the same file:
You can selectively overwrite only the data that matches an arbitrary expression. This feature is available with DataFrames in <Delta> 1.1.0 and above and supported in SQL in <Delta> 2.4.0 and above.
docs/source/delta-batch.md
Outdated
<Delta> 3.3 introduced [In-Commit Timestamps](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#in-commit-timestamps) to provide a more reliable and consistent way to track table modifications. This feature addresses limitations of the traditional approach that relied on file modification timestamps, particularly in scenarios involving data migration or replication.

#### Background
Previously, <Delta> used file modification timestamps as the source of truth for table modifications. This approach presented several challenges:
Suggested change:
- Previously, <Delta> used file modification timestamps as the source of truth for table modifications. This approach presented several challenges:
+ Without the In-Commit Timestamp feature, <Delta> uses file modification timestamps as the commit timestamp. Commit timestamps are needed for various use cases, e.g. time travel to a specific time in the past. This approach has various limitations:
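To make the limitation concrete, here is a minimal sketch of how resolving a timestamp to a table version goes wrong once file modification times change. It is a simplified model, not Delta's implementation: the helper `version_as_of`, the commit lists, and all timestamp values are hypothetical, and the rule "pick the latest commit at or before the requested time" is assumed for illustration.

```python
from bisect import bisect_right

def version_as_of(commits, requested_ts):
    """Return the latest version whose commit timestamp is <= requested_ts.

    `commits` is a list of (version, timestamp_ms) pairs sorted by timestamp.
    Returns None when every commit is newer than the requested time.
    """
    timestamps = [ts for _, ts in commits]
    idx = bisect_right(timestamps, requested_ts)
    return commits[idx - 1][0] if idx else None

# Hypothetical commit history with the timestamps recorded at commit time.
original = [(0, 1000), (1, 2000), (2, 3000)]

# The same log after its files were copied elsewhere: the file modification
# times now reflect the copy, not the original commits.
copied = [(0, 9000), (1, 9001), (2, 9002)]

print(version_as_of(original, 2500))  # 1
print(version_as_of(copied, 2500))    # None
```

The same query resolves to a different answer once the modification times have moved, which is the kind of inconsistency a timestamp stored inside the commit itself avoids.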
docs/source/delta-batch.md
Outdated
#### Overview
<Delta> 3.3 introduced [In-Commit Timestamps](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#in-commit-timestamps) to provide a more reliable and consistent way to track table modifications. This feature addresses limitations of the traditional approach that relied on file modification timestamps, particularly in scenarios involving data migration or replication.

#### Background
This is documentation, so we don't want to explain what Delta used to do before and why this feature was built.
Instead, we want to tell users how Delta behaves with and without this feature.
- Section 1: Overview
- Section 2: This could be renamed to Feature Details, i.e. we can merge the Background and Feature Details sections.
  - Inside Feature Details, we can talk about how Delta behaves when the feature is enabled.
  - Inside Feature Details, we can next talk about how Delta behaves when the feature is disabled, plus its limitations.
- Section 3: We can talk about how to enable the feature.
docs/source/delta-batch.md
Outdated
3. Time Travel Reliability: These timestamp changes could affect the accuracy and consistency of time travel queries

#### Enabling the Feature
This feature can be enabled by setting the table property `delta.enableInCommitTimestamps` to `true`:
Suggested change:
- This feature can be enabled by setting the table property `delta.enableInCommitTimestamps` to `true`:
+ This is a Writer table feature and can be enabled by setting the table property `delta.enableInCommitTimestamps` to `true`:
Also attach the link to the Writer Table Feature section (if any).
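For reference, setting the property discussed in this thread might look like the sketch below. It only builds the SQL statement; the helper name `enable_in_commit_timestamps_sql` and the table name `my_table` are hypothetical, and running the statement assumes an active SparkSession configured for Delta.

```python
def enable_in_commit_timestamps_sql(table_name: str) -> str:
    """Build an ALTER TABLE statement that sets the table property to true."""
    return (
        f"ALTER TABLE {table_name} "
        "SET TBLPROPERTIES ('delta.enableInCommitTimestamps' = 'true')"
    )

# `my_table` is a placeholder table name used only for illustration.
statement = enable_in_commit_timestamps_sql("my_table")
print(statement)

# With a Delta-enabled SparkSession named `spark` (an assumption, not created
# here), the statement would be executed with: spark.sql(statement)
```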
Which Delta project/connector is this regarding?
Description
Updates docs with details about InCommitTimestamps.
How was this patch tested?
N/A
Does this PR introduce any user-facing changes?
No