Avro: Support default values for generic data #11786
case "time-micros":
  return GenericReaders.times();

case "timestamp-micros":
Does there need to be a case for nanos as well?
There does, but adding the timestamp nanos read path is a separate PR.
I noticed that we have this string all over the place; it is probably a good idea to make a constant for it: https://github.com/search?q=repo%3Aapache%2Ficeberg+timestamp-micros+language%3AJava&type=code&l=Java
We can also do that in a separate PR.
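A minimal sketch of what that constant could look like. The class name `LogicalTypeNames` and the `readerFor` method are hypothetical and only illustrate the idea; they are not existing Iceberg classes, and the returned strings stand in for the real reader factories.

```java
/** Hypothetical holder for Avro logical type names; not an existing Iceberg class. */
final class LogicalTypeNames {
  static final String TIME_MICROS = "time-micros";
  static final String TIMESTAMP_MICROS = "timestamp-micros";

  private LogicalTypeNames() {}

  /**
   * Toy dispatch standing in for the reader switch in the diff. Because the
   * constants are compile-time constants, they can be used as case labels.
   */
  static String readerFor(String logicalTypeName) {
    switch (logicalTypeName) {
      case TIME_MICROS:
        return "times";
      case TIMESTAMP_MICROS:
        return "timestamps";
      default:
        throw new IllegalArgumentException("Unknown logical type: " + logicalTypeName);
    }
  }
}
```

With the literals in one place, a typo in a case label becomes a compile error instead of a silently unmatched branch.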
f3cd33a to a9896a4
/**
 * @deprecated will be removed in 2.0.0; use {@link PlannedDataReader} instead.
 */
@Deprecated
Most of the changed files in this PR are to avoid using this class now that it is deprecated.
Checked, and didn't find any usage anymore 👍
Some unrelated nits, apart from that it looks good to me 👍
    ((LogicalTypes.Decimal) logicalType).getScale());

case "uuid":
  return ValueReaders.uuids();
Separate PR, but I think we also need to take the physical type into account: https://github.com/apache/avro/blob/main/doc/content/en/docs/%2B%2Bversion%2B%2B/Specification/_index.md#uuid
At first, Avro only supported UUIDs stored as strings, but now it also supports fixed[16].
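To illustrate the point, here is a rough sketch of decoding a UUID from either physical representation. The `UuidDecoding` class and its `PhysicalType` enum are hypothetical names, not part of Iceberg or Avro, and the fixed[16] branch assumes the bytes are in big-endian (RFC 4122) order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Hypothetical helper; neither Iceberg nor Avro ships a class with this name. */
final class UuidDecoding {
  enum PhysicalType { STRING, FIXED_16 }

  private UuidDecoding() {}

  static UUID decode(PhysicalType type, byte[] raw) {
    switch (type) {
      case STRING:
        // String-backed UUIDs hold the canonical 36-character text form.
        return UUID.fromString(new String(raw, StandardCharsets.UTF_8));
      case FIXED_16:
        if (raw.length != 16) {
          throw new IllegalArgumentException("Expected 16 bytes, got " + raw.length);
        }
        // Assumption: big-endian byte order, which is ByteBuffer's default.
        ByteBuffer buffer = ByteBuffer.wrap(raw);
        return new UUID(buffer.getLong(), buffer.getLong());
      default:
        throw new IllegalArgumentException("Unsupported physical type: " + type);
    }
  }
}
```

A real reader would pick the branch from the Avro schema's underlying type rather than from an explicit enum argument.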
 */
public static <D> RawDecoder<D> create(
    org.apache.iceberg.Schema readSchema,
    Function<org.apache.iceberg.Schema, DatumReader<D>> readerFunction,
Nice!
This implements default values in the Avro generic reader.
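As a rough illustration of the idea (not the code in this PR), the sketch below fills in a default whenever the expected schema contains a field that the data file does not. `DefaultValueSketch`, `Field`, and `project` are hypothetical names, and plain maps stand in for Avro records.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of default-value projection; not the reader implemented in this PR. */
final class DefaultValueSketch {
  /** Hypothetical field descriptor: a name plus the default used when the file lacks it. */
  static final class Field {
    final String name;
    final Object defaultValue;

    Field(String name, Object defaultValue) {
      this.name = name;
      this.defaultValue = defaultValue;
    }
  }

  private DefaultValueSketch() {}

  /** Builds a record in expected-schema order, using defaults for fields missing from the file. */
  static Map<String, Object> project(List<Field> expected, Map<String, Object> fileRecord) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (Field field : expected) {
      if (fileRecord.containsKey(field.name)) {
        result.put(field.name, fileRecord.get(field.name));
      } else {
        result.put(field.name, field.defaultValue);
      }
    }
    return result;
  }
}
```

For example, projecting a file record that only has `id` onto an expected schema with `id` and `category` (default `"unknown"`) yields both fields, with `category` set to `"unknown"`.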