-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Spark] Deprecate tableVersion field in type widening metadata #3334
[Spark] Deprecate tableVersion field in type widening metadata #3334
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few comments, LGTM otherwise
// tableVersion was a field present during the preview and removed afterwards. We preserve it if | ||
// it's already present in the type change metadata of the table to avoid breaking older clients. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do they still use it for? Would it be possible to always set it to some constant number (e.g. zero)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This information is used to decide which files to rewrite when dropping the table feature.
That is, until #3155 where we switched to looking at parquet footers instead.
Delta 3.2 and Delta 4.0 preview use the "old" way.
We could set it to Long.MaxValue
, which would effectively cause all files to be rewritten but that's pretty hacky
// The version field isn't used anymore but we need to populate it in case the table doesn't | ||
// use the stable feature, as preview clients may then still access the table and rely on | ||
// the field being present. | ||
if (!txn.protocol.isFeatureSupported(TypeWideningTableFeature)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we check for TypeWideningTableFeature
not being there instead of TypeWideningPreviewTableFeature
being there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because a table can technically have both the preview and stable feature if one is present and a user manually add the other using ALTER TABLE t SET TBLPROPERTIES ('delta.feature.typeWidening[-preview]' = 'supported')
As soon as the stable feature is supported, we know preview versions can't access the table and we don't need to populate tableVersion
.
This is a bit convoluted for an edge case that really should be extremely rare, on second thoughts it's probably better to always set tableVersion
when the preview feature is supported to avoid having to reason about this edge case.
I updated this change
s"TBLPROPERTIES ('${DeltaConfigs.ENABLE_TYPE_WIDENING.key}' = 'true')") | ||
sql(s"CREATE TABLE $testTableName (a int) USING delta TBLPROPERTIES (" + | ||
s"'${DeltaConfigs.ENABLE_TYPE_WIDENING.key}' = 'true', " + | ||
s"'${propertyKey(TypeWideningTableFeature)}' = 'supported')") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary? Isn't the feature automatically enabled when setting the property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The preview feature is enabled by default, this forces the stable to be added instead.
I added a comment to clarify that and that we're testing with stable feature by default and only explicit set the preview feature in specific tests that cover the old behavior re: tableVersion
.
- added a missing test for the preview feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Description
The protocol specification for type widening dropped the requirement to populate a
tableVersion
field as part of type change history to track in which version of the table a given change was applied.See protocol update: #3297
This field was used at some point during the preview but isn't needed anymore.
It is now deprecated:
The last point is necessary to avoid breaking preview clients (Delta 3.2 & Delta 4.0 preview) that require the field to be set.
How was this patch tested?
tableVersion
not being set by default.tableVersion
being set.tableVersion
when using the preview and stable table features.Does this PR introduce any user-facing changes?
Yes. As of this change, a table that supports the stable table feature
typeWidening
won't have atableVersion
field in the type change history stored in the table metadata.Tables that only support the preview table feature
typeWidening-preview
don't see any change.