Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleaning up measurement/unit pairs #517

Closed
wants to merge 6 commits into from
Closed

Conversation

dimshitc
Copy link
Collaborator

Instead of 2000 measurements, we check only 180 most used measurements (as per JnJ database network). Since it was a small count, I manually curated them, so results are more reliable.
But people can have their own most common measurements, so
Once it's merged, I'm going to create a wiki page that says how to submit new measurement-unit pairs and plausible numbers as well.
Currently Wiki already has this notion "It will eventually house measurement/unit pair definitions and descriptions on plausibleValueLow and plausibleValueHigh values."
People can submit their measurement-unit-plausible numbers (optional) as a table attached to the github issue.
I don't expect that much attention as the vocabulary contribution, so we don't need a centralized storage for these requests

Dmitry Dymshyts added 4 commits November 27, 2023 10:30
… and units. units were manually curated.

denominator now includes cases when source_unit is null
- synced the other CDM versions CSVs
…QualityDashboard into meas_unit_check_clean

# Conflicts:
#	inst/sql/sql_server/concept_plausible_unit_concept_ids.sql
@dimshitc dimshitc changed the base branch from main to develop December 22, 2023 16:17
Copy link

codecov bot commented Dec 22, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (2883027) 86.42% compared to head (88c2775) 87.30%.
Report is 13 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #517      +/-   ##
===========================================
+ Coverage    86.42%   87.30%   +0.87%     
===========================================
  Files           16       16              
  Lines          884      945      +61     
===========================================
+ Hits           764      825      +61     
  Misses         120      120              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

}:{
m.unit_concept_id NOT IN (@plausibleUnitConceptIds)
}
AND COALESCE (m.unit_concept_id, -1) NOT IN (replace (@plausibleUnitConceptIds, 'NA', '-1')) -- '-1' stands for the cases when unit_concept_id is null
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we put -1 in the threshold file instead of NA to skip this replace? (it will simplify the query, and i also worry about translation of REPLACE - we'd probably need to add it to SqlRender translation rules if we want to use it)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was done intentionally, so in the output you'll see list of unit concept_ids and NA, while -1 can be confusing.
On the other hand, we can add a note, that '-1' stands for a NULL unit as permitted one, to avoid changes in the SQL render. what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I do think it'd be better just to use -1 in the threshold file, and provide an explanation (in check descriptions and the documentation) - we probably want an explanation either way, and I do think it's better to keep the SQL simple 😃 Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to -1 and added explanation into Description files.
Looks like this branch originates from the older DQD version. Should I recreate this branch from the latest develop branch and reapply the changes?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK great thank you!! And yes, I was thinking the same regarding the branch. It's probably easiest to move your changes to a new branch based off the latest version of develop because there are all of these auto-generated docs files which will be really hard to resolve conflicts for.

@dimshitc dimshitc changed the base branch from develop to main January 26, 2024 10:50
@dimshitc dimshitc closed this Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants