Schema Registration Ordering #12

Open
OneCricketeer opened this issue Apr 11, 2019 · 2 comments
Labels
bug Something isn't working

Comments

@OneCricketeer
Owner

OneCricketeer commented Apr 11, 2019

The source topic can contain messages whose schemas appear "out of order" relative to the order in which they were registered in the source registry.

When "schema 2" gets copied before "schema 1", the destination registry rejects the registration with a backward-incompatibility exception.

This needs to be tested, but the only workaround I can think of is to do an initial schema sync ahead of starting the SMT, or to set the destination compatibility to none/full.
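A minimal sketch of what such an initial sync could look like, assuming both registries expose the standard Schema Registry REST endpoints (`GET /subjects`, `GET /subjects/{subject}/versions`, `POST /subjects/{subject}/versions`). The function names `registration_order` and `presync` and the URLs passed in are hypothetical, and the sketch only copies the `schema` string, so schema types other than Avro and schema references are not handled. The point is only that each subject's versions are registered oldest first, before the connector starts.

```python
import json
import urllib.parse
import urllib.request

CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def _request(method, url, body=None):
    """Send one JSON request to a registry and decode the JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, method=method, headers={"Content-Type": CONTENT_TYPE}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def registration_order(versions_by_subject):
    """Return (subject, version) pairs, oldest version first within each subject."""
    return [
        (subject, version)
        for subject in sorted(versions_by_subject)
        for version in sorted(versions_by_subject[subject])
    ]


def presync(source_url, dest_url):
    """Copy every subject version from source to destination in version order."""
    subjects = _request("GET", f"{source_url}/subjects")
    versions = {}
    for subject in subjects:
        quoted = urllib.parse.quote(subject, safe="")
        versions[subject] = _request("GET", f"{source_url}/subjects/{quoted}/versions")
    for subject, version in registration_order(versions):
        quoted = urllib.parse.quote(subject, safe="")
        entry = _request("GET", f"{source_url}/subjects/{quoted}/versions/{version}")
        _request("POST", f"{dest_url}/subjects/{quoted}/versions", {"schema": entry["schema"]})
```

This preserves the registration order per subject, but the destination IDs only match the source IDs if the destination starts out empty and the source IDs have no gaps.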


EDIT

It doesn't look like the mock registry client does any compatibility checking.

That will need to be added first.

OneCricketeer added the bug label Apr 12, 2019
@twiechert
Contributor

Good catch. I stumbled upon this as well.

I guess one option would be, when v1 arrives, to create an intermediate compatible schema that re-adds all the dropped fields that were introduced with v2 (assuming backward compatibility).

Since this should only happen across partitions (within a partition, schema IDs should be increasing), one could think of leaving partitions unassigned as long as their most recent record references a schema version that is too recent.

@OneCricketeer
Owner Author

OneCricketeer commented Oct 28, 2019

@twiechert

Since this should only happen across partitions (within a partition, schema ids should be increasing)

This is not necessarily the case. I've run into instances where teams post their schemas to the registry ahead of time; then one team updates their producer schemas while another team has not upgraded yet. The schema IDs in the topic then look like [1, 1, 1, 2, 2, 1, 2, 1, 2]. If those first [1, 1, 1] records have expired when the connector is started, this transform would send ID 2 (as ID 1 in the destination), then try to send ID 1 (as ID 2 in the destination), regardless of the partition.
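The sequence above can be replayed with a toy model. This is only an illustrative sketch: `ToyRegistry`, `can_read`, `replay`, and the two field sets are hypothetical, and the compatibility rule is a simplification of Avro's backward check (a reader can read a writer's data if every reader field is either present in the writer or has a default). Here schema 2 is assumed to drop a field that has no default in schema 1.

```python
class IncompatibleSchema(Exception):
    """Raised when a new schema cannot read data written with the latest one."""


def can_read(reader, writer):
    # Each schema maps field name -> whether the field has a default value.
    return all(name in writer or has_default for name, has_default in reader.items())


class ToyRegistry:
    """Destination registry that enforces a simplified BACKWARD check."""

    def __init__(self):
        self.schemas = []  # position + 1 is the destination ID

    def register(self, schema):
        if schema in self.schemas:
            return self.schemas.index(schema) + 1
        if self.schemas and not can_read(reader=schema, writer=self.schemas[-1]):
            raise IncompatibleSchema("new schema cannot read the latest registered schema")
        self.schemas.append(schema)
        return len(self.schemas)


# Hypothetical source schemas: ID 2 drops the "legacy" field, which has no default.
SOURCE_SCHEMAS = {
    1: {"id": False, "legacy": False},
    2: {"id": False},
}


def replay(topic_ids, source_schemas):
    """Register schemas in the order their IDs are first seen in the topic."""
    destination = ToyRegistry()
    mapping = {}
    for source_id in topic_ids:
        if source_id not in mapping:
            mapping[source_id] = destination.register(source_schemas[source_id])
    return mapping
```

With the full topic, `replay([1, 1, 1, 2, 2, 1, 2, 1, 2], SOURCE_SCHEMAS)` maps the IDs one to one. Once the leading `[1, 1, 1]` records are gone, `replay([2, 2, 1, 2, 1, 2], SOURCE_SCHEMAS)` registers source ID 2 first and then raises `IncompatibleSchema` on source ID 1.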

I don't think it makes sense to do a config sync on a per-message basis. Plus, there isn't a way to look up the source subject name to do a GET /config/:subject call, which is why I was thinking of a separate script to be run ahead of starting the connector.
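A rough sketch of the config part of such a script, assuming the standard `GET /config/{subject}` and `PUT /config/{subject}` endpoints, where the GET response carries `compatibilityLevel` and the PUT body expects `compatibility`. The helper names and the URLs are hypothetical. A subject with no subject-level override is assumed to return HTTP 404 and is skipped, so it keeps the destination's global default.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def call(method, url, body=None):
    """Send one JSON request to a registry and decode the JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, method=method, headers={"Content-Type": CONTENT_TYPE}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def put_body_from_get(config_response):
    """Translate a GET /config response into the body PUT /config expects."""
    return {"compatibility": config_response["compatibilityLevel"]}


def sync_subject_config(source_url, dest_url, subject):
    """Copy one subject's compatibility level; return None if it has no override."""
    path = "/config/" + urllib.parse.quote(subject, safe="")
    try:
        current = call("GET", source_url + path)
    except urllib.error.HTTPError as error:
        if error.code == 404:  # assumed: no subject-level config on the source
            return None
        raise
    return call("PUT", dest_url + path, put_body_from_get(current))


def sync_all_configs(source_url, dest_url):
    """Run the config copy for every subject known to the source registry."""
    for subject in call("GET", source_url + "/subjects"):
        sync_subject_config(source_url, dest_url, subject)
```

Because the subject list comes from the source registry itself, the script does not need the per-message subject lookup that the transform lacks.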
