Logical types #110
Comments
There's only one case that I can see in the spec where decoding the underlying value for a logical type might fail, and that's when decoding a UUID. For a UUID mismatch, I think it would be fine to throw a runtime error. In any case, the error there isn't that the logical type is unknown (it's clearly the known UUID logical type).
For the other cases it mentions (e.g. an invalid decimal type), I think the right approach would be to discard the logical type information when parsing the schema.
I've realised that there is actually an ambiguity in the spec. It says "Language implementations must ignore unknown logical types when reading" but doesn't make it clear whether that applies to unknown logical types in the writer's schema or in the reader's schema. In the spirit of the spec, I think they probably meant the former, making it possible for a reader to read the underlying value even when it was written with a schema with an unknown logical type.
For errors in the reader's schema, I think it might be good for gogen-avro to at least have an option to give an error on unknown or malformed logical types, even if that's not the default, because otherwise it would be easy for mistakes to be inappropriately ignored.
One other thing about logical types: does anyone know of a decent Go package that implements fixed-precision decimal support in the style specified by Avro? There's a multi-precision decimal package, but that seems like it might be a bit heavyweight.
So I think that is probably a reasonable compromise: ignoring invalid logical type definitions but raising errors on valid ones with malformed input (i.e. UUIDs). I haven't had time to dig into other implementations, as things were very busy at the end of the year and now it's the holiday break. When I am back on the tools I will try to get into this to validate what the consensus is. Regarding the decimal library, I don't think there are any libraries that implement a fixed-precision decimal quite in the style of the Avro spec. We are using github.com/shopspring/decimal and some conversion code I gisted here: https://gist.github.com/josephglanville/d1453fcf8a249721950026c0e376810a.
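For readers unfamiliar with the encoding: an Avro decimal stores the unscaled value as a big-endian two's-complement integer, with the scale fixed in the schema. The sketch below shows one way to decode that using only the standard library's `math/big`. It is not the code from the gist above and does not use shopspring/decimal; `decodeUnscaled` and `decimalToRat` are hypothetical helper names, and a non-negative scale is assumed.

```go
package main

import (
	"fmt"
	"math/big"
)

// decodeUnscaled interprets b as a big-endian two's-complement integer,
// which is how Avro stores the unscaled value of a decimal.
func decodeUnscaled(b []byte) *big.Int {
	v := new(big.Int).SetBytes(b) // unsigned interpretation first
	if len(b) > 0 && b[0]&0x80 != 0 {
		// Sign bit set: subtract 2^(8*len(b)) to get the negative value.
		v.Sub(v, new(big.Int).Lsh(big.NewInt(1), uint(8*len(b))))
	}
	return v
}

// decimalToRat returns unscaled / 10^scale as an exact rational.
// scale comes from the schema and is assumed to be non-negative.
func decimalToRat(b []byte, scale int) *big.Rat {
	den := new(big.Int).Exp(big.NewInt(10), big.NewInt(int64(scale)), nil)
	return new(big.Rat).SetFrac(decodeUnscaled(b), den)
}

func main() {
	// 0x04D2 is 1234; with scale 2 that is 12.34.
	fmt.Println(decimalToRat([]byte{0x04, 0xD2}, 2).FloatString(2))
	// 0xFF is -1 in two's complement; with scale 2 that is -0.01.
	fmt.Println(decimalToRat([]byte{0xFF}, 2).FloatString(2))
}
```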
@josephglanville Maybe there is a way to represent logical types with the schema package? I tried to add one as part of definitions, but when I use it in a union, it breaks.
Implementing logical types that provide automatic serialisation and deserialisation of higher-level types into/from Avro primitive types is a highly desirable feature, and is implemented by https://github.com/linkedin/goavro.
Logical types are defined and described in the spec thus:
The last paragraph is somewhat ambiguous. As written, it implies only that the validity of the type declaration matters. That isn't a problem for writing, and it shouldn't be a problem for reading either, as the generated code always represents the reader schema.
However, if you instead interpret it to mean that decoding should fall back to the underlying type when logical decoding fails, then it's significantly less elegant, as the generated types would also need to contain a fallback field in the struct to write to when logical decoding fails.
The Python library fastavro takes the approach of simply throwing a runtime error on invalid data. I will examine some of the other libraries to work out if there is a consensus on what the spec means here.
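The Go analogue of the runtime-error approach would be to return an error from the decoder when the underlying string is not a well-formed UUID. A minimal sketch using only the standard library follows; `parseUUID` is a hypothetical helper, not an existing gogen-avro or fastavro function, and it accepts only the canonical 8-4-4-4-12 hyphenated form.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// UUID holds the 16 raw bytes of a parsed UUID.
type UUID [16]byte

// parseUUID accepts only the canonical hyphenated form and returns an
// error for anything else, so the caller can fail the decode.
func parseUUID(s string) (UUID, error) {
	var u UUID
	if len(s) != 36 || s[8] != '-' || s[13] != '-' || s[18] != '-' || s[23] != '-' {
		return u, fmt.Errorf("invalid uuid %q: wrong length or hyphen positions", s)
	}
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil || len(raw) != len(u) {
		return u, fmt.Errorf("invalid uuid %q: not 32 hex digits", s)
	}
	copy(u[:], raw)
	return u, nil
}

func main() {
	if _, err := parseUUID("123e4567-e89b-12d3-a456-426614174000"); err != nil {
		fmt.Println("unexpected:", err)
	}
	if _, err := parseUUID("not-a-uuid"); err != nil {
		fmt.Println("decode error:", err)
	}
}
```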