Use greenery and regex_transformer to merge pattern and patternProperties keywords #85

Open
Zac-HD opened this issue Jun 5, 2021 · 3 comments

Comments

@Zac-HD
Member

Zac-HD commented Jun 5, 2021

Currently, hypothesis-jsonschema basically just gives up if the schema has two overlapping regular expressions - at best, we'll try to randomly pick one to generate examples from, and use the other to filter. The status quo is mostly fine, but we can do better!
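To make the status quo concrete, here is a minimal standard-library sketch of "generate from one pattern, filter by the other". The two patterns and the hand-written `generate_for_a` generator are hypothetical stand-ins chosen for illustration; in Hypothesis terms this roughly corresponds to `st.from_regex(PATTERN_A).filter(...)`, and it is not the actual hypothesis-jsonschema code.

```python
import random
import re

# Hypothetical patterns, chosen only for illustration.
PATTERN_A = r"^[a-z]{1,4}$"  # the pattern we generate from
PATTERN_B = r"^a.*z$"        # the pattern we only use as a filter


def generate_for_a(rng: random.Random) -> str:
    """Hand-written generator for strings matching PATTERN_A."""
    length = rng.randint(1, 4)
    return "".join(rng.choice("abcxyz") for _ in range(length))


def sample_both(rng: random.Random, attempts: int = 1000) -> list[str]:
    """Rejection sampling: keep only candidates that also match PATTERN_B."""
    kept = []
    for _ in range(attempts):
        candidate = generate_for_a(rng)
        if re.search(PATTERN_B, candidate):
            kept.append(candidate)
    return kept


samples = sample_both(random.Random(0))
# Most candidates are discarded, which is why filtering scales poorly
# when the two patterns overlap only a little.
print(f"kept {len(samples)} of 1000 candidates")
```

Every kept string matches both patterns, but the generator has no knowledge of the second pattern, so the acceptance rate depends entirely on how much the two languages happen to overlap.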

(Not all "Python regular expressions" are truly regular in the sense of being equivalent to finite automata. Also, JSON Schema regexes actually follow ECMA 262 syntax, which is neither truly regular nor entirely compatible with the Python syntax. Fortunately the recommended subset is both compatible (except for Python allowing a trailing newline with $) and, with some special handling for lookahead, regular, so we'll continue our approach of handling what we can and gracefully degrading on the rest.)
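The trailing-newline exception can be shown with the standard `re` module. In Python, `$` (without `re.MULTILINE`) also matches just before a newline at the very end of the string, while `\Z` matches only at the true end. The `strict_end_anchor` helper below is a hypothetical, deliberately naive sketch that only rewrites a final unescaped `$`; it is not part of any library.

```python
import re


def strict_end_anchor(pattern: str) -> str:
    r"""Hypothetical helper: swap a final unescaped `$` for `\Z`."""
    if pattern.endswith("$"):
        body = pattern[:-1]
        # An odd number of backslashes before `$` means it is a literal dollar.
        backslashes = len(body) - len(body.rstrip("\\"))
        if backslashes % 2 == 0:
            return body + r"\Z"
    return pattern


# Python's `$` accepts a single trailing newline ...
python_style = re.search(r"^abc$", "abc\n") is not None
# ... whereas `\Z` insists on the true end of the string.
strict_style = re.search(strict_end_anchor(r"^abc$"), "abc\n") is not None

print(python_style, strict_style)
```

A real translation would also need to handle `$` elsewhere in the pattern (for example inside alternations), so treat this as an illustration of the incompatibility rather than a fix.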

This is a medium-to-large feature to develop, since regex_transformer exists but isn't packaged or particularly mature, and the benefit is unknown (neg-medium to medium). greenery might also need some patches to make Unicode handling more efficient. However, I'd also like better regex handling in upstream Hypothesis, and it should only get easier over time!
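For readers unfamiliar with the idea behind the merge: two truly regular patterns can be combined into one automaton that accepts exactly the strings matching both, using the classic product construction. The sketch below hand-writes two tiny DFAs over the alphabet `ab` and intersects them. The `DFA` class and `intersect` function are hypothetical names for this illustration and do not use greenery's API, which would build the automata from regex syntax instead.

```python
import re
from collections import deque
from dataclasses import dataclass
from itertools import product


@dataclass
class DFA:
    start: object
    accepting: frozenset
    transitions: dict  # (state, char) -> state; a missing key means "reject"

    def accepts(self, text: str) -> bool:
        state = self.start
        for ch in text:
            if (state, ch) not in self.transitions:
                return False
            state = self.transitions[(state, ch)]
        return state in self.accepting


def intersect(a: DFA, b: DFA, alphabet: str) -> DFA:
    """Product construction: run both automata in lockstep on paired states."""
    start = (a.start, b.start)
    transitions, accepting = {}, set()
    seen, queue = {start}, deque([start])
    while queue:
        pair = queue.popleft()
        state_a, state_b = pair
        if state_a in a.accepting and state_b in b.accepting:
            accepting.add(pair)
        for ch in alphabet:
            next_a = a.transitions.get((state_a, ch))
            next_b = b.transitions.get((state_b, ch))
            if next_a is None or next_b is None:
                continue  # one side rejects, so the product rejects too
            nxt = (next_a, next_b)
            transitions[(pair, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return DFA(start, frozenset(accepting), transitions)


# Hand-written DFA for a[ab]* (starts with "a").
starts_with_a = DFA(0, frozenset({1}), {(0, "a"): 1, (1, "a"): 1, (1, "b"): 1})
# Hand-written DFA for [ab]*b (ends with "b").
ends_with_b = DFA(
    0, frozenset({1}), {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}
)
both = intersect(starts_with_a, ends_with_b, "ab")


def agrees_with_re(max_len: int) -> bool:
    """Cross-check the merged DFA against the two patterns applied separately."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            expected = bool(re.fullmatch(r"a[ab]*", s) and re.fullmatch(r"[ab]*b", s))
            if both.accepts(s) != expected:
                return False
    return True
```

Once the patterns are merged like this, examples can be generated directly from the combined automaton rather than generated from one pattern and filtered by the other.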

@mristin

mristin commented Oct 17, 2022

@Zac-HD here is our use case for which this feature is relevant.

We automatically generate a schema based on the specs for a data exchange format. Multiple patterns thus do occur, as there is quite a bit of multiple inheritance in the specs. This feature is also relevant for Hypothesis, where chained filters are silently rewritten into more efficient strategies.

@Zac-HD
Member Author

Zac-HD commented Oct 17, 2022

I'm not planning to work on this or the related Hypothesis issue any time soon, sorry, so if this is a business need you might need to work on it yourselves or pay a contractor (I have a shortlist). The first step would be getting support for Python (and ideally JS) regex syntax shipped in greenery; after that I'd expect it to be fairly straightforward.

@mristin

mristin commented Oct 17, 2022

@Zac-HD thanks for the timeline info! I suppose we'll try to play with greenery ourselves.
