Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Apply schema validation to all topics in local development #2013

Merged
merged 10 commits into from
Apr 14, 2023

Conversation

untitaker
Copy link
Member

@untitaker untitaker commented Apr 5, 2023

Alternative to #1981 -- apply
schema validation to all outgoing messages when Relay runs in
development mode.

This means that messages are validated as part of integration tests.

The additional dependencies are compiled unconditionally, but unused
in release builds. The only way to conditionally compile them is to
introduce yet another featureflag, but that seems burdensome to use
since we use --all-features everywhere right now. We have had similar
issues introducing optional SSL support in Kafka a long time ago, IIRC

This validation does not run in sentry's local development because (I think)
it uses release builds there. In order to make that happen we could add
a "validate schemas" flag to config.yml, off by default.

@untitaker untitaker requested a review from a team April 5, 2023 14:53
Alternative to #1981 -- apply
schema validation to all outgoing messages when Relay runs in
development mode.

This means that messages are validated as part of integration tests.

The additional dependencies are compiled unconditionally, but unused
in release builds. The only way to conditionally compile them is to
introduce yet another featureflag, but that seems burdensome to use
since we use `--all-features` everywhere right now. We have had similar
issues introducing optional SSL support in Kafka a long time ago, IIRC
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach! It has zero runtime overhead in production but is more automatic than relying on manually crafted tests.

Pressing "Request changes" because of some open questions.

@@ -36,6 +39,10 @@ pub enum ClientError {
#[error("failed to serialize json message")]
InvalidJson(#[source] serde_json::Error),

/// Failed to run schema validation on message.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be conditional?

Suggested change
/// Failed to run schema validation on message.
#[cfg(debug_assertions)]
/// Failed to run schema validation on message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it has to be, fixed

_ => return Err(SchemaError::InvalidLogicalTopic),
};

let schema = match sentry_kafka_schemas::get_schema(logical_topic_name, None) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is only for debug builds, but should we keep a cache of schemas by topic name so it isn't loaded from disk for every message? Maybe even put that cache in the sentry-kafka-schemas library?

Copy link
Member Author

@untitaker untitaker Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe sometime but for now we don't want to be opinionated in how people use this library, so we settled for the simplest interface that comes with the least amount of dependencies (aka, return schema as string). This goes for the Python library as well: It requires you to bring your own schema validation library right now, and figure out yourself where it makes sense to store the schema. As such it doesn't have many dependencies. This is a good thing if you're only using the generated types in your service.

I also don't want to build any cache right now because I foresee a (potential, not certain) future where we want/need people to iterate on schemas and relay in parallel (i.e. temporarily pointing relay to a local checkout of the schemas repo), and it would be good if changes to the schema files were picked up immediately in local development without recompiling.

I'm also not convinced that this schema validation library is the best one to use. I detest the fact that the schema and the payload has to be deserialized into intermediate serde_json::Value, but it seems that that is commonpractice for all schema validation libraries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be good if changes to the schema files were picked up immediately in local development without recompiling.

I agree, I was thinking more of an in-memory cache that picks up the changes when you restart Relay (or re-run a test). I see your argument against putting such a cache in the library, but I still think it makes sense to put it somewhere (e.g. the KafkaClient). I'm happy to contribute to this PR for this. Or do you see a use case for an actual hot reload, where you keep a local Relay instance running and edit the schema while sending in events?

_ => return Err(SchemaError::InvalidLogicalTopic),
};

let schema = match sentry_kafka_schemas::get_schema(logical_topic_name, None) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should the library have a get_latest_schema helper to make it more explicit what the None does? Or an explicit enum with variant SchemaVersion::Latest instead of an Option?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to add a get_latest_schema helper (or rename to have get_schema and get_schema_version) -- how strongly do you feel about this happening?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to add a get_latest_schema helper

Only if you think it is a good idea.

how strongly do you feel about this happening?

Not strongly at all, I just noticed I had to jump into the library code to see what the None was for.

@@ -18,6 +18,8 @@ rmp-serde = { version = "1.1.1", optional = true }
serde = { version = "1.0.114", features = ["derive"] }
serde_json = { version = "1.0.55", optional = true }
thiserror = "1.0.38"
sentry-kafka-schemas = "0.0.25"
jsonschema = "0.17.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would make sense to hide the jsonschema dependency in sentry-kafka-schemas. That is, if we removed this dependency here, and sentry-kafka-schemas exposed a validate(topic, message) function.

That way, the library client (Relay) would not have to deal with compiling schemas, and the way that sentry-kafka-schemas validates would be completely encapsulated (i.e. you could change the library or even the format it uses for schema definition and validation).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my reply to the local cache comment. For now I would like to avoid that. There's still many unsolved questions around how the workflow in Relay will look like, what performance requirements we have on the entire system, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK! Definitely not a blocker.

As such it doesn't have many dependencies. This is a good thing if you're only using the generated types in your service.

Makes sense, though this could be solved with cargo features. I.e. there could be a codegen feature and a validation feature, and clients use either-or. My gut feeling is that 90% of clients that want to perform validation will not care about what schema definition language or validation library is used under the hood, hence the idea to hide it inside the library.

Copy link
Member Author

@untitaker untitaker Apr 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm reconsidering my response now. one of the reasons i was hesitant to change this was because python currently also works the same (it just exposes the raw schema, doesn't prescribe a schema validation library)

but at the same time it would be good to have a vetted recommended library that everybody ought to use (for both python and rust)

@untitaker
Copy link
Member Author

@jjbayer after deliberating back and forth I believe we should defer some of the API decisions until we have more feedback from other teams. For that purpose I have summarized the discussion about API here: getsentry/sentry-kafka-schemas#105

Let me know if I missed anything, for now I would like to merge this PR as-is

Add a caching layer to #2013 such
that the schema does not have to be parsed from disk for every message.

---------

Co-authored-by: Markus Unterwaditzer <[email protected]>
@untitaker
Copy link
Member Author

I believe a recent change in sentry-kafka-schemas has made it so that InvalidSchema is returned for every call to get_schema. Need to investigate, but this PR is broken right now (it won't validate anything at all)

@untitaker
Copy link
Member Author

yikes! getsentry/sentry-kafka-schemas#107

@untitaker
Copy link
Member Author

I added LogError somewhere so that a schema validation error for metrics is actually printed to stdout/err

@untitaker untitaker merged commit cd51343 into master Apr 14, 2023
@untitaker untitaker deleted the feat/validate-schemas-in-devserver branch April 14, 2023 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants