-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cleaning up measurement/unit pairs #517
Conversation
… and units. units were manually curated. denominator now includes cases when source_unit is null
- synced the other CDM versions CSVs
…QualityDashboard into meas_unit_check_clean # Conflicts: # inst/sql/sql_server/concept_plausible_unit_concept_ids.sql
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #517 +/- ##
===========================================
+ Coverage 86.42% 87.30% +0.87%
===========================================
Files 16 16
Lines 884 945 +61
===========================================
+ Hits 764 825 +61
Misses 120 120 ☔ View full report in Codecov by Sentry. |
}:{ | ||
m.unit_concept_id NOT IN (@plausibleUnitConceptIds) | ||
} | ||
AND COALESCE (m.unit_concept_id, -1) NOT IN (replace (@plausibleUnitConceptIds, 'NA', '-1')) -- '-1' stands for the cases when unit_concept_id is null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we put -1 in the threshold file instead of NA to skip this replace
? (it will simplify the query, and i also worry about translation of REPLACE - we'd probably need to add it to SqlRender translation rules if we want to use it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was done intentionally, so in the output you'll see list of unit concept_ids and NA, while -1 can be confusing.
On the other hand, we can add a note, that '-1' stands for a NULL unit as permitted one, to avoid changes in the SQL render. what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I do think it'd be better just to use -1 in the threshold file, and provide an explanation (in check descriptions and the documentation) - we probably want an explanation either way, and I do think it's better to keep the SQL simple 😃 Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to -1 and added explanation into Description files.
Looks like this branch originates from the older DQD version. Should I recreate this branch from the latest develop branch and reapply the changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK great thank you!! And yes, I was thinking the same regarding the branch. It's probably easiest to move your changes to a new branch based off the latest version of develop
because there are all of these auto-generated docs files which will be really hard to resolve conflicts for.
Instead of 2000 measurements, we check only 180 most used measurements (as per JnJ database network). Since it was a small count, I manually curated them, so results are more reliable.
But people can have their own most common measurements, so
Once it's merged, I'm going to create a wiki page that says how to submit new measurement-unit pairs and plausible numbers as well.
Currently Wiki already has this notion "It will eventually house measurement/unit pair definitions and descriptions on plausibleValueLow and plausibleValueHigh values."
People can submit their measurement-unit-plausible numbers (optional) as a table attached to the github issue.
I don't expect that much attention as the vocabulary contribution, so we don't need a centralized storage for these requests