Revert Rx and Dx variable binning back to 0 vs 1 #14

Closed
karafecho opened this issue Nov 8, 2022 · 3 comments

This issue proposes that we revert to 0 vs 1 binning for Rx and Dx variables. Hong and Juan are modifying the FHIR PIT code to accommodate this change, and I have updated all four all_features YAML files to reflect it. Once implemented, the FHIR PIT csv file output will reflect the new binning strategy. However, I suspect that it will take some time to (1) work out any bugs that have been introduced and (2) generate new csv files for all four ICEES use cases (asthma, pcd, dili, covid). As such, I am wondering if it makes sense to simply recode the existing csv files and then run the precompute step to generate the correlation matrices that support ICEES KG. This would be a one-off process that might expedite things a bit and also allow us to move forward with incorporating new statistics, such as ORs with CIs, for ICEES KG edge support.
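To make the one-off recode concrete, here is a minimal sketch of what it might look like. It is not the FHIR PIT code. It assumes that Rx and Dx columns can be recognized by a name ending in `Rx` or `Dx`, and that the existing bins are stored as `0`, `1`, and either `>1` or a larger count; the function names and the sample column names (`AsthmaDx`, `PrednisoneRx`) are hypothetical.

```python
import csv
import io


def recode_binary(value: str) -> str:
    """Collapse a 0 / 1 / >1 bin label to 0 vs 1; blanks stay blank."""
    text = value.strip()
    if text == "":
        return value  # keep missing values missing
    if text in ("0", "0.0"):
        return "0"
    return "1"  # "1", ">1", "2", ... all become "1"


def is_rx_dx_column(name: str) -> bool:
    # Assumption: Rx/Dx variables are identifiable by their name suffix.
    return name.endswith(("Rx", "Dx"))


def recode_csv(src, dst) -> None:
    """Copy a csv from src to dst, recoding only the Rx and Dx columns."""
    reader = csv.DictReader(src)
    fieldnames = reader.fieldnames or []
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for name in fieldnames:
            if is_rx_dx_column(name):
                row[name] = recode_binary(row[name])
        writer.writerow(row)


# Usage with in-memory data; real use would pass open file handles instead.
sample = io.StringIO(
    "patient,AsthmaDx,PrednisoneRx,Age\n1,0,2,30\n2,>1,,41\n3,1,0,12\n"
)
recoded = io.StringIO()
recode_csv(sample, recoded)
print(recoded.getvalue())
```

Columns that are not Rx or Dx variables pass through unchanged, so the same routine could be run over each of the four use-case files before the precompute step.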

karafecho commented Nov 8, 2022

Additional historical comment:

We changed our binning strategy for Rx and Dx variables to accommodate Translator requests for more granular data. However, the current binning strategy of 0, 1, >1 is of questionable clinical meaning. Moreover, Translator efforts shifted such that a 0 vs 1 binning strategy now seems more aligned with the broader Translator effort, although that may change (again).

@karafecho

Update: Hong updated the FHIR PIT code to revert to 0 vs 1 binning. I tested the fix, and all looked good.

Moving forward, Hong and I decided to follow the suggestion above: rather than re-run FHIR PIT to generate new csv files, we will update the existing asthma, pcd, dili, and covid csv files to reflect the new 0 vs 1 binning scheme for Rx and Dx variables. After this step is complete, Hong will redeploy the ICEES+ APIs and you will be able to re-run the precompute for ICEES KG, ideally with additional statistical metrics, as described in #15 and #16. You will also be able to add hard-coded Biolink mappings for environmental/chemical exposures after I update the all_features YAML files (see #12). That update won't take long: I had already included most of the intended mappings and just need to make a few adjustments.

Does this seem like a reasonable plan?
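As an illustration of the kind of additional metric a 0 vs 1 binning makes straightforward, such as the ORs with CIs mentioned at the start of this thread, here is a minimal sketch of an odds ratio with a 95% confidence interval for a 2x2 table. It assumes the standard log (Woolf) method and a 0.5 continuity correction when any cell is zero; neither choice comes from this thread or from #15 and #16, and the function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    cells = [float(a), float(b), float(c), float(d)]
    # Assumption: add 0.5 to every cell if any cell is zero, so that the
    # ratio and its standard error stay finite.
    if any(x == 0 for x in cells):
        cells = [x + 0.5 for x in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf method.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Usage with made-up counts for a Dx variable (1 vs 0) against an outcome.
print(odds_ratio_ci(20, 80, 10, 90))
```

With only two bins per Rx or Dx variable, each variable pair reduces to a single 2x2 table, which is what this calculation expects.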

karafecho commented Feb 8, 2023

Closing as this is complete ...
