Overhauling validation results to get them closer to cover different types of validators #1514
BREAKING CHANGE: we renamed .bids_version to more generic .standard + .standard_version
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #1514 +/- ##
==========================================
+ Coverage 88.58% 88.68% +0.09%
==========================================
Files 78 78
Lines 10589 10867 +278
==========================================
+ Hits 9380 9637 +257
- Misses 1209 1230 +21
class Severity(Enum):
    # TODO: decide on the naming consistency -- either prepend all with Validation or not
@yarikoptic Are you talking about the naming of the high-level entities in this module, such as Severity and ValidationOrigin, or about the naming of the enum values within Severity?
class Severity(IntEnum):
    HINT = 1
    WARNING = 2
    ERROR = 3
    INFO = 2  # new/unused, available in linkml
    WARNING = 3
    ERROR = 4
    CRITICAL = 5  # new/unused, linkml has FATAL
I see three issues with this, though they are small.
- We may include all the severity levels in the outputs of all validators. However, a particular validator may use only a subset of them.
- The same severity level may mean two different things in different validators.
- Different validators may also use different severity level names to mean the same level.
On the three points, in order:
- agree
- better not
- better not
But you are pointing to ambiguous semantics (not clearly defined across validators). Ideally we should then be able to establish a "mapping" wherever validators disagree.
I think we should just clear up here what we mean for each of the levels. E.g.
- HINT: data is correct but could be improved
- INFO: ... ???
- WARNING: when a "SHOULD" or "SHOULD NOT" level of requirement is violated
- ERROR: when a "MUST" or "MUST NOT" level of requirement is violated
- CRITICAL: when the problem makes the given data completely unusable
Maybe we also want
- INTERNAL: signals a problem with the validator itself. Although @candleindark rightfully notes that this is still an ERROR, just about the validator rather than the data, so we might want to explicitly point to the object of the error (not the data but the validator).
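One way to pin these meanings down is to write them next to the enum and let each validator map its own vocabulary onto the shared levels. The following is only a sketch under assumptions: the comments paraphrase the list above, `LINKML_SEVERITY_MAP` and `to_severity` are hypothetical names that exist in neither dandi-cli nor linkml, and the only linkml level names used are the two mentioned in the diff comments (INFO and FATAL).

```python
from enum import IntEnum


class Severity(IntEnum):
    """Shared severity levels, ordered so they can be compared numerically."""

    HINT = 1  # data is correct but could be improved
    INFO = 2  # meaning still undecided in this discussion
    WARNING = 3  # a "SHOULD" / "SHOULD NOT" requirement is violated
    ERROR = 4  # a "MUST" / "MUST NOT" requirement is violated
    CRITICAL = 5  # the problem makes the data completely unusable


# Hypothetical per-validator mapping. Only the two linkml level names
# mentioned in the diff comments are listed; the rest are left out here.
LINKML_SEVERITY_MAP = {
    "INFO": Severity.INFO,
    "FATAL": Severity.CRITICAL,
}


def to_severity(name: str, mapping: dict[str, Severity]) -> Severity:
    """Translate a validator-specific level name into a shared Severity."""
    try:
        return mapping[name.upper()]
    except KeyError:
        raise ValueError(f"no Severity mapping for level {name!r}") from None
```

With an explicit mapping per validator, a level name that has no agreed meaning fails loudly instead of being silently reinterpreted.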
class Scope(Enum):
    FILE = "file"
    FOLDER = "folder"
    # Isaac: make it/add "dandiset-metadata" to signal specific relation to metadata
Need clarification for this.
Maybe forget about it for now -- it seems to combine with ValidationObject below, but we might need to rethink/define some additional levels.
As for the folder, an example could be a folder name unknown to BIDS under a sub-.../ses-.../ folder. I am not sure ATM whether the BIDS validator complains about it, and how.
    name: str  # Validator name
    version: str  # Validator version
    standard: str | None = (
        None  # Standard being validated against  # TODO: Enum for the standards??
@yarikoptic We can create an enum class to define the supported standards. Do you know what those standards are ATM?
Sure. For starters:
- BIDS
- DANDI-LAYOUT -- our "ad-hoc" naming schema. ATM we validate "explicitly" in dandi-cli code in validate_organized_path, and not as part of the dandi-schema
- DANDI-SCHEMA
- NWB
- HED
- OME-ZARR
Could be also at the level of the "file format" standard:
- JSON
- YAML
- TSV
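The list above could be captured in an enum roughly as follows. This is a sketch, not the PR's code: the class name `Standard`, the member values, and the `parse_standard` helper are all assumptions made for illustration.

```python
from enum import Enum


class Standard(Enum):
    """Hypothetical enum of standards a validator may check against."""

    # Data and metadata standards named in the discussion
    BIDS = "BIDS"
    DANDI_LAYOUT = "DANDI-LAYOUT"
    DANDI_SCHEMA = "DANDI-SCHEMA"
    NWB = "NWB"
    HED = "HED"
    OME_ZARR = "OME-ZARR"
    # "File format" level standards
    JSON = "JSON"
    YAML = "YAML"
    TSV = "TSV"


def parse_standard(value: str) -> Standard:
    """Look up a Standard by its value, ignoring case and outer whitespace."""
    return Standard(value.strip().upper())
```

Keeping the member values identical to the names used in the discussion means a plain string stored in existing results can still be converted with `parse_standard`.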
We might also want to create an Enum for validators, since we also have a set of them.
In updating the use of Line 1122 in 45db66c
Do we have a DANDI standard? If we do, it doesn't seem that we version it. I think we should indeed define an enum class for the supported standards, and also talk about how to retrieve the versions of those standards.
@@ -209,17 +209,20 @@ def get_validation_errors(
    import zarr

    errors: list[ValidationResult] = []
    origin: ValidationOrigin = ValidationOrigin(
        name="zarr",
        version=zarr.__version__,
Is this line intentional? The original one was version=zarr.version.version.
    origin: ValidationOrigin = ValidationOrigin(
        name="zarr",
        version=zarr.__version__,
        standard="zarr",
Add standard_version here, stating which version of zarr it is.
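To make the shape of that suggestion concrete, here is a minimal sketch of an origin record carrying both the validator version and the version of the standard. It assumes a dataclass with the four fields discussed in this PR; the string values are placeholders, and how to obtain the standard's version is still the open question raised above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationOrigin:
    name: str  # Validator name
    version: str  # Validator version
    standard: str | None = None  # Standard being validated against
    standard_version: str | None = None  # Version of that standard


# Placeholder values only; real code would read them from the library in use.
origin = ValidationOrigin(
    name="zarr",
    version="0.0.0",  # placeholder standing in for zarr.__version__
    standard="zarr",
    standard_version=None,  # unset: the source of this value is undecided
)
```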
    DANDISET = "dandiset"
    DATASET = "dataset"

    # new/unused, may be should be gone
Maybe drop this for now indeed.
Ref:
and result of our chat with @candleindark while also reviewing results of linkml validation over dandisets:
Notes/TODOs:
.bids_version to more generic .standard + .standard_version