GitHub - jbadger3/ml_4_pheno_ooe: OMOP CDM data extraction methods from: Machine Learning for Phenotyping Opioid Overdose Events

This repository contains code to accompany the paper:

"Machine Learning for Phenotyping Opioid Overdose Events".

Please cite this paper if you use this code as part of a published research project.

Overview

Generates features vectors for phenotyping models from electronic health record (EHR) data stored in OMOP CDM format. Data surrounding each suspected opioid overdose event is gathered from two intervals and includes opioid prescriptions, labs, diagnoses, procedures, and observations.

Requirements

An electronic health record (EHR) database formatted and stored using the OHDSI OMOP Common Data Model (CDM) v5.2 or higher
python >= 3.0 and the following packages:
- numpy
- pandas
- from the python standard library
  - os, argparse, sqlite3, json, collections

Repository files

extract_OMOP_features.py - extracts data from the OMOP CDM and outputs two files for continous and discrete features respectively.

mme_OMOP.json - a supporting json file containting a mapping of RxCUIs to Morphine Milligram Equivalent (MME) conversion factors

Running the code

The main script extract_OMOP_features.py must be supplied with an examples file in .csv format that contains person_ids for patients that have an opioid overdose diagnoses on their record and the dx_date where the diagnoses code first appears. Headers should be included as 'person_id' and 'dx_date' respectively. Dates are exptected to be in ISO 8601 format (YYYY-MM-DD). See supporting_files/examples_sample.csv for concrete examples.
This work utilized sqlite as the backend database for storing OMOP CDM data. When using something other than sqlite the code will need to be edited to import the appropriate python DB API and create a connection. The SQL queries may also need slight adjustments based on differences in SQL syntax.
Run the script from the command line using: python script/extract_OMOP_features.py --db_path path/to/db --examples_path path/to/examples --model_name my_model

The script will create two .csv files for discrete and continuous features in a new directory called OMOP_features. Data for each person_id will appear in the same order as provided in the examples file.

Note: class labels (overdose severity) are not included in the output files, but should match 1:1 if they happen to be recorded in the examples file.

Questions/assistance

Please direct any questions/requests for assistance to [email protected]

License

This work is distributed under an MIT license. Feel free to edit, copy, share.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
script		script
supporting_files		supporting_files
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
features.png		features.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview

Requirements

Repository files

Running the code

Questions/assistance

License

About

Releases

Packages

Languages

License

jbadger3/ml_4_pheno_ooe

Folders and files

Latest commit

History

Repository files navigation

Overview

Requirements

Repository files

Running the code

Questions/assistance

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages