Actual pattern validation? #9

Open
kingthorin opened this issue Oct 4, 2023 · 4 comments

Comments

@kingthorin
Contributor

kingthorin commented Oct 4, 2023

Is your feature request related to a problem? Please describe.
As part of the CI workflow for PRs (etc.), would it be possible to validate the regex patterns and DOM selectors?

In the past we've found that upstream of AliasIO we encountered invalid regex patterns added to the technology files, or invalid selectors.

The two "normal" cases seemed to be:

  • when a regex contained curly braces that were neither part of a valid repetition declaration nor escaped.
  • when a DOM selector had unbalanced single or double quotes.

There are plenty of other things that can make a regex or DOM selector invalid; it would be good to catch and fix these early.
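For the second case, a rough quote-balance check can be sketched in plain Python. The function name `has_balanced_quotes` is made up for illustration, and this is only a heuristic for that one failure mode, not a full selector parser:

```python
def has_balanced_quotes(selector: str) -> bool:
    """Return True if every single or double quote in the selector is closed.

    Heuristic only: it tracks quote state and backslash escapes, nothing else.
    """
    open_quote = None   # the quote character we are currently inside, if any
    escaped = False     # True when the previous character was a backslash
    for ch in selector:
        if escaped:
            escaped = False          # this character is escaped, skip it
        elif ch == "\\":
            escaped = True
        elif open_quote is None and ch in "'\"":
            open_quote = ch          # entering a quoted string
        elif ch == open_quote:
            open_quote = None        # leaving the quoted string
    return open_quote is None


print(has_balanced_quotes("meta[name='generator']"))
print(has_balanced_quotes("meta[name='generator]"))
```

A real selector parser would catch far more than this, so treat it as a cheap first filter.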

Describe the solution you'd like
I believe this could be added to the existing Python-based validation. In Java a pattern can be compiled (Pattern.compile(String)), at which point an exception is thrown if it is invalid. We also came up with something similar for DOM selectors. I assume something similar can be done with Python.
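The Python counterpart of Pattern.compile(String) would be `re.compile`, which raises `re.error` on an invalid pattern. A minimal sketch, where the helper name `regex_error` is hypothetical:

```python
import re
from typing import Optional


def regex_error(pattern: str) -> Optional[str]:
    """Return the compile error message for a pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


for candidate in (r"jquery-([\d.]+)\.js", "jquery-([\\d.]+"):
    print(candidate, "->", regex_error(candidate))
```

One caveat: regex dialects differ. Python's `re` treats an unmatched `{` as a literal character, so it will not flag every brace problem that Java reports, and the check is only as strict as the engine doing the compiling.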

Describe alternatives you've considered

  • Live with the errors and correct them as they're noticed.

Additional context
Not sure what else to say here. Mainly I was thinking that catching potential errors as close to introduction as possible would be the easiest way to address/prevent them.

@enthec-opensource
Member

I believe this can be achieved by recursively iterating over the JSON objects and lists until we reach a string, then compiling that string as a regex, as you say. I'm going to go with that, because that's as much validation as we can do for those fields.
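A rough sketch of that recursive walk, assuming it is only called on the value of a pattern field (descriptions, URLs and similar free-text fields should not be compiled). The function names and the sample data are hypothetical:

```python
import re


def iter_strings(node, path=""):
    """Yield (path, string) pairs for every string nested in dicts and lists."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{path}/{index}")


def find_invalid_patterns(node):
    """Return (path, pattern, error message) for each string that fails to compile."""
    errors = []
    for path, text in iter_strings(node):
        try:
            re.compile(text)
        except re.error as exc:
            errors.append((path, text, str(exc)))
    return errors


# Made-up data shaped like pattern fields; the second scriptSrc entry is fine,
# the first has an unclosed group.
sample = {
    "headers": {"Server": r"nginx(?:/([\d.]+))?"},
    "scriptSrc": [r"jquery[.-]([\d.]+", r"ok\.js"],
}
for path, pattern, message in find_invalid_patterns(sample):
    print(f"{path}: {pattern!r}: {message}")
```

Reporting the path alongside the error makes it easier to find the offending entry in a large technology file.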

There is actually more validation we can do based on the schema: cpe has a very clear pattern; implies, requires and excludes could be checked as well with a simple lookup in the specified JSON; and pricing has a limited set of allowed strings.
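The lookup and the pricing check could look roughly like this. Everything here is an assumption for illustration: the function name `check_references`, the shape of the data (a dict keyed by technology name), the handling of a trailing `\;confidence:NN` tag on references, and the placeholder pricing values, which should come from the schema rather than from this sketch:

```python
def check_references(technologies, allowed_pricing):
    """Return a list of problems with cross-references and pricing values."""
    known = set(technologies)
    problems = []
    for name, entry in technologies.items():
        for field in ("implies", "requires", "excludes"):
            values = entry.get(field, [])
            if isinstance(values, str):       # accept a single string or a list
                values = [values]
            for value in values:
                # Assumption: a reference may carry tags after a literal "\;".
                target = value.split("\\;", 1)[0]
                if target not in known:
                    problems.append(f"{name}: {field} refers to unknown {target!r}")
        for value in entry.get("pricing", []):
            if value not in allowed_pricing:
                problems.append(f"{name}: unexpected pricing value {value!r}")
    return problems


# Hypothetical entries and placeholder pricing options.
technologies = {
    "ExampleCMS": {"implies": ["ExampleLang\\;confidence:50", "Missing"], "pricing": ["low"]},
    "ExampleLang": {"excludes": "ExampleCMS", "pricing": ["free-ish"]},
}
for problem in check_references(technologies, allowed_pricing={"low", "mid", "high"}):
    print(problem)
```

If references can point at technologies defined in other files, the set of known names would need to be built from all files before running the check.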

enthec-opensource added a commit that referenced this issue Oct 5, 2023
@kingthorin
Contributor Author

Great!! 👍 Thanks for tackling this.

@enthec-opensource
Member

validating version & confidence tags

image

Is this reliable? That fixed version doesn't seem to exist, but it looks like it would be useful to have (I understand it as a fixed version, like "if this matches, force this version").

image

@kingthorin
Contributor Author

I believe the first undocumented examples are meant to be just that: any match on the pattern assumes that version string. IIRC I asked about that on the original project once upon a time.
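A tag check that accepts both forms (a back-reference such as `\1` and a fixed version string) might be sketched like this. It assumes tags are appended to the pattern after a literal `\;` as `key:value`, that only `version` and `confidence` are expected, and that confidence is an integer from 0 to 100; the function names are hypothetical:

```python
import re


def split_tags(pattern):
    """Split 'regex\\;key:value\\;...' into the regex and a dict of tags."""
    regex, *raw_tags = pattern.split("\\;")
    tags = {}
    for raw in raw_tags:
        key, _, value = raw.partition(":")   # split on the first colon only
        tags[key] = value
    return regex, tags


def tag_problems(pattern):
    """Return a list of problems found in the tags of one pattern string."""
    regex, tags = split_tags(pattern)
    problems = [f"unknown tag {key!r}" for key in tags
                if key not in ("version", "confidence")]
    confidence = tags.get("confidence")
    if confidence is not None:
        if not re.fullmatch(r"[0-9]{1,3}", confidence) or int(confidence) > 100:
            problems.append(f"confidence out of range: {confidence!r}")
    version = tags.get("version")
    if version is not None:
        try:
            groups = re.compile(regex).groups
        except re.error as exc:
            return problems + [f"invalid regex: {exc}"]
        # A fixed version such as "2" has no back-references and passes as-is.
        for ref in re.findall(r"\\(\d+)", version):
            if int(ref) > groups:
                problems.append(f"version refers to missing group \\{ref}")
    return problems


print(tag_problems(r"jquery-([\d.]+)\.js\;version:\1"))
print(tag_problems(r"example-v2\.js\;version:2"))
print(tag_problems(r"example\.js\;version:\1"))
```

The main thing this catches for version tags is a back-reference to a capture group that the regex does not define.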
