Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avro: Support default values for generic data #11786

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

rdblue
Copy link
Contributor

@rdblue rdblue commented Dec 13, 2024

This implements default values in the Avro generic reader.

case "time-micros":
return GenericReaders.times();

case "timestamp-micros":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does there need to be case for nanos as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There does, but adding the timestamp nanos read path is a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that we have this string all over the place, probably a good idea to make a constant for it: https://github.com/search?q=repo%3Aapache%2Ficeberg+timestamp-micros+language%3AJava&type=code&l=Java

We can also do that in a separate PR.

@rdblue rdblue changed the title Avro: Support default values for generic data (WIP) Avro: Support default values for generic data Dec 16, 2024
@Fokko Fokko self-requested a review December 16, 2024 20:50
@rdblue rdblue force-pushed the avro-support-default-values-generic branch from f3cd33a to a9896a4 Compare December 16, 2024 20:51
/**
* @deprecated will be removed in 2.0.0; use {@link PlannedDataReader} instead.
*/
@Deprecated
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changed files in this PR are to avoid using this class now that it is deprecated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked, and didn't find any usage anymore 👍

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some unrelated nits, apart from that it looks good to me 👍

/**
* @deprecated will be removed in 2.0.0; use {@link PlannedDataReader} instead.
*/
@Deprecated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked, and didn't find any usage anymore 👍

case "time-micros":
return GenericReaders.times();

case "timestamp-micros":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that we have this string all over the place, probably a good idea to make a constant for it: https://github.com/search?q=repo%3Aapache%2Ficeberg+timestamp-micros+language%3AJava&type=code&l=Java

We can also do that in a separate PR.

((LogicalTypes.Decimal) logicalType).getScale());

case "uuid":
return ValueReaders.uuids();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate PR, but I think we also need to take the physical type into account: https://github.com/apache/avro/blob/main/doc/content/en/docs/%2B%2Bversion%2B%2B/Specification/_index.md#uuid

First Avro only supported UUIDs stored as strings, but now also as fixed[16].

*/
public static <D> RawDecoder<D> create(
org.apache.iceberg.Schema readSchema,
Function<org.apache.iceberg.Schema, DatumReader<D>> readerFunction,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants