Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logicalType - keep property #435

Closed
jwtryg opened this issue Aug 16, 2024 · 8 comments · Fixed by #469 · May be fixed by #448
Closed

logicalType - keep property #435

jwtryg opened this issue Aug 16, 2024 · 8 comments · Fixed by #469 · May be fixed by #448

Comments

@jwtryg
Copy link

jwtryg commented Aug 16, 2024

Currently when parsing an avro schema, the schema.go file contains a number of reserved properties that are removed from the schemas. These reserved properties include the logicalType property.

However, some projects, like apache iceberg, uses a number of logical types that are not included in the avro specification (see this) and therefore don't have corresponding LogicalTypes defined. When parsing such an avro schema, these unknown logicalTypes are then simply ignored and furthermore removed from the schema propertiies which makes them impossible to detect.

My suggestion for a simple solution is that the logicalType properties are kept by removing it from the list of schemaReserved properties. Or do you have other alternatives?

@nrwiersma
Copy link
Member

Someone floated an idea of supporting custom logical types. That could be an alternative.

@jwtryg
Copy link
Author

jwtryg commented Aug 18, 2024

That would be awesome! I guess that means a rather large change though, if the types that does not currently support logical types should also be able to support custom logical types.

However, I would be willing to take a stab at a PR for this, but I might need a bit of guidance :)

@nrwiersma
Copy link
Member

Hey. You are welcome to have a go. Let me know where you get stuck and I will help if I can.

@jhump
Copy link
Contributor

jhump commented Oct 10, 2024

In addition to preserving unrecognized logical types, it would also be nice to preserve unrecognized properties.

Currently, if a custom property has name that matches a defined field for some other type (for example, a property named "fields" for an array type or a property named "symbols" for a record type, etc), it gets discarded when the schema is parsed.

Similarly, types that do not have defined logical types in the main Avro spec (such as records, enums, maps, and arrays) do not preserve a "logicalType" attribute when parsed.

And another case of losing properties is with fixed types -- if they happen to use a logical type other than "decimal" that also happens to use "precision" and "scale" properties, not only is the logical type not retained but neither are the precision and scale values.

@jhump
Copy link
Contributor

jhump commented Oct 10, 2024

My suggestion for a simple solution is that the logicalType properties are kept by removing it from the list of schemaReserved properties. Or do you have other alternatives?

FWIW, this wouldn't quite be enough. It would work to preserve the "map" logical type for arrays. But for the "timestamp-nanos" and "uuid" logical types for long and fixed[16] respectively, these get discarded at a different stage and are never in the property map to begin with.

@jhump
Copy link
Contributor

jhump commented Oct 10, 2024

Actually, it looks like "uuid" logical type for fixed[16] is in the spec, but it's not yet implemented by this module:
https://avro.apache.org/docs/1.12.0/specification/#uuid

@nrwiersma
Copy link
Member

UUID is supported by this module, via encoding.TextUnmarshaler interface. There are too many packages for this to support a single one, but almost all of them support this interface.

@jhump
Copy link
Contributor

jhump commented Oct 11, 2024

@nrwiersma, "UUID" is currently only supported in this repo for string types. But the spec also states that it can be used with fixed[16].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants