Is your feature request related to a problem? Please describe.
This question concerns backward-compatibility support when changing Avro schemas.
Is your feature request related to a specific runtime of cloudflow or applicable for all runtimes?
All runtimes.
Describe the solution you'd like
I found that the Cloudflow serialization/deserialization engine does not support backward compatibility for Avro messages. After some research in the code, I saw that when the codec is created, only one schema is used: the one provided with the generated class at compile time (https://github.com/lightbend/cloudflow/blob/master/core/cloudflow-streamlets/src/main/scala/cloudflow/streamlets/avro/AvroCodec.scala#L30). However, to maintain backward compatibility with the Java Avro serialization mechanism, two schemas must be passed to the SpecificDatumReader: one for the message writer and one for the reader (https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java#L49).
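For illustration, here is a minimal sketch (not the current Cloudflow implementation) of what I mean: if the writer's schema is available, Avro's two-argument SpecificDatumReader constructor can resolve compatible differences between the schemas. The method name and the way the writer schema is obtained are just assumptions for the example.

```scala
import org.apache.avro.Schema
import org.apache.avro.io.DecoderFactory
import org.apache.avro.specific.{ SpecificDatumReader, SpecificRecordBase }

// Hypothetical helper: decode bytes using both the writer's schema (as used by the
// producer, e.g. obtained from a schema registry) and the reader's schema (from the
// generated class on the consumer side).
def decodeWithEvolution[T <: SpecificRecordBase](
    bytes: Array[Byte],
    writerSchema: Schema,
    readerSchema: Schema
): T = {
  // Passing both schemas lets Avro apply its schema-resolution rules
  // (e.g. new fields with defaults, removed fields).
  val reader  = new SpecificDatumReader[T](writerSchema, readerSchema)
  val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
  reader.read(null.asInstanceOf[T], decoder)
}
```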
In other words, to make a schema change right now, the consumer streamlet first has to read all the messages in the topic, and only then can the deployed pipeline version be updated. Are there any plans to support backward compatibility for Avro messages in future releases? Perhaps this task overlaps with the plans to introduce a schema registry and will be implemented as part of that feature? (#1010, #996)
For now I get the following error when deserializing a message written with the old schema using the new (compatible) schema:
Caused by: com.twitter.bijection.InversionFailure: Failed to invert: [B@46271dd6
at com.twitter.bijection.InversionFailure$$anonfun$partialFailure$1.applyOrElse(InversionFailure.scala:43)
at com.twitter.bijection.InversionFailure$$anonfun$partialFailure$1.applyOrElse(InversionFailure.scala:42)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
at scala.util.Failure.recoverWith(Try.scala:236)
at com.twitter.bijection.Inversion$.attempt(Inversion.scala:32)
at com.twitter.bijection.avro.BinaryAvroCodec.invert(AvroCodecs.scala:330)
at com.twitter.bijection.avro.BinaryAvroCodec.invert(AvroCodecs.scala:320)
at cloudflow.streamlets.avro.AvroSerde.$anonfun$inverted$1(AvroCodec.scala:39)
at cloudflow.streamlets.avro.AvroSerde.$anonfun$decode$1(AvroCodec.scala:45)
at scala.util.Try$.apply(Try.scala:213)
at cloudflow.streamlets.avro.AvroSerde.decode(AvroCodec.scala:45)
at cloudflow.streamlets.avro.AvroCodec.decode(AvroCodec.scala:34)