Revert Rx and Dx variable binning back to 0 vs 1 #14
Comments
Additional historical comment: We changed our binning strategy for Rx and Dx variables to accommodate Translator requests for more granular data. However, the current binning strategy of 0, 1, >1 is of questionable clinical meaning. Moreover, Translator efforts have shifted such that a 0 vs 1 binning strategy now seems better aligned with the broader Translator effort, although that may change (again).
Update: Hong updated the FHIR PIT code to revert to 0 vs 1 binning. I tested the fix, and all looked good. Moving forward, Hong and I decided to update the existing asthma, pcd, dili, and covid csv files to reflect the new 0 vs 1 binning scheme for Rx and Dx variables, rather than re-run FHIR PIT to generate new csv files, following the suggestion in the issue description. After this step is complete, Hong will redeploy the ICEES+ APIs, and you will be able to re-run the precompute for ICEES KG, ideally with additional statistical metrics, as described in #15 and #16. You will also be able to add hard-coded Biolink mappings for environmental/chemical exposures after I update the all_features YAML files (see #12). That update won't take long: I had already included most of the intended mappings and just need to make a few adjustments. Does this seem like a reasonable plan?
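For reference, the csv recoding step mentioned above could look roughly like the sketch below. This is not the script that was actually used. It assumes that any count of 1 or more collapses to 1, that Rx and Dx columns can be identified by a name suffix, and that missing values are left blank. The column names in the sample (`AsthmaDx`, `PrednisoneRx`, and so on) and the helper names `recode_binary` and `recode_csv` are hypothetical.

```python
import csv
import io


def recode_binary(value):
    """Collapse a 0 / 1 / >1 code into 0 vs 1.

    Assumption: any value >= 1 becomes 1; empty (missing) values stay empty.
    """
    if value == "":
        return value
    return "1" if float(value) >= 1 else "0"


def recode_csv(src, dst, suffixes=("Rx", "Dx")):
    """Copy a csv from src to dst, recoding only columns whose names end in one of `suffixes`."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for name in reader.fieldnames:
            if name.endswith(suffixes):
                row[name] = recode_binary(row[name])
        writer.writerow(row)


# Tiny in-memory example; real use would open the per-use-case csv files instead.
sample = io.StringIO(
    "patient_num,AsthmaDx,PrednisoneRx,AvgDailyPM2.5Exposure\n"
    "1,0,2,8.1\n"
    "2,3,1,\n"
)
out = io.StringIO()
recode_csv(sample, out)
print(out.getvalue())
```

Only the Rx and Dx columns are touched, so exposure columns and identifiers pass through unchanged.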
Closing as this is complete.
This issue is to suggest that we revert to 0 vs 1 for binning Rx and Dx variables. Hong and Juan are in the process of modifying the FHIR PIT code to accommodate this change; I updated all four all_features YAML files to reflect the change. When implemented, the FHIR PIT csv file output will reflect the new binning strategy. However, I suspect that it will take some time to (1) work out any bugs that have been introduced and (2) generate new csv files for all four ICEES use cases (asthma, pcd, dili, covid). As such, I am wondering if it makes sense to simply recode the existing csv files and then run the precompute step to generate correlational matrices to support ICEES KG. This would be a one-off process that might expedite things a bit and also allow us to move forward with incorporating new statistics, such as odds ratios (ORs) with confidence intervals (CIs), for ICEES KG edge support.
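With 0 vs 1 binning, each pair of variables reduces to a 2x2 table, which is what makes ORs with CIs straightforward to compute. A minimal sketch follows, assuming a Wald interval on the log odds ratio and a 0.5 continuity correction when any cell is zero. Neither choice is confirmed as what the ICEES KG precompute uses, and the function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: feature=1 & outcome=1    b: feature=1 & outcome=0
    c: feature=0 & outcome=1    d: feature=0 & outcome=0
    z=1.96 gives an approximate 95% interval (assumed default).
    """
    # Assumption: add 0.5 to every cell when any cell is zero,
    # so that neither the ratio nor the standard error divides by zero.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts for illustration only.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, CI = ({lo:.2f}, {hi:.2f})")
```

An interval that spans 1 indicates that the association is not distinguishable from no association at the chosen level, which is the kind of information an OR alone would not convey on a KG edge.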