structured-data-from-notes

A project designed to extract structured data from clinical text-based notes

This repository contains SQL and Python code written to capture and parse structured data from clinical text-based notes from an electronic medical record (EMR).

The SQL code identifies cases by searching specific note types for a text string that is embedded in the note template. The initial versions are designed to query specific fields in Clarity (Epic), but the general method can be adapted for use with other relational databases.
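The case-finding step can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the Clarity query in `strokeEXPRESS.sql`: the `notes` table, its columns, the note type names, and the choice of `varHairColor;` as the marker string are all assumptions made for the example.

```python
import sqlite3

# Hypothetical marker string that the note template embeds in every note it generates.
TEMPLATE_MARKER = "varHairColor;"


def find_template_notes(conn, marker, note_types):
    """Return (note_id, note_text) rows of the given note types whose text contains the marker."""
    placeholders = ",".join("?" for _ in note_types)
    query = (
        "SELECT note_id, note_text FROM notes "
        f"WHERE note_type IN ({placeholders}) "
        "AND instr(note_text, ?) > 0 "
        "ORDER BY note_id"
    )
    return conn.execute(query, [*note_types, marker]).fetchall()


def build_demo_db():
    """Create an in-memory stand-in for the EMR notes table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes (note_id INTEGER PRIMARY KEY, note_type TEXT, note_text TEXT)"
    )
    conn.executemany(
        "INSERT INTO notes VALUES (?, ?, ?)",
        [
            (1, "Progress Note", "Hair color: 3 - Brown varHairColor;"),
            (2, "Progress Note", "Free-text note with no template fields."),
            (3, "Telephone Encounter", "Hair color: 3 - Brown varHairColor;"),
        ],
    )
    return conn


conn = build_demo_db()
matches = find_template_notes(conn, TEMPLATE_MARKER, ["Progress Note"])
print(matches)
```

Only note 1 is returned: note 2 has the right type but no marker, and note 3 has the marker but the wrong type. Filtering on both conditions keeps free-text notes and unrelated note types out of the parsing step.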

The Python code takes the results of the SQL query and extracts structured data from each clinical note by identifying a simple repeating motif that a purpose-built note template writes into the note. For each field, the motif consists of a field label followed by a colon, an enumeration, a variable name, and a terminating semicolon.

For example:

Hair color: 3 - Brown varHairColor;

Because the enumerations generated by the custom note template are designed to have an integer key and a non-integer text value, the two parts can be parsed individually or together as a key-value pair.
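One way to parse the motif is a regular expression with named groups. The sketch below is an assumption-laden illustration rather than the code in `strokeEXPRESS.py`: it assumes every variable name starts with `var` (as in the single example above), and the `Eye color` field is invented to show that the motif repeats.

```python
import re

# Assumed motif: "<label>: <integer key> - <text value> <variable name>;"
# The "var" prefix on variable names is inferred from the one example in this README.
FIELD_PATTERN = re.compile(
    r"(?P<label>[^:;]+):\s*(?P<key>\d+)\s*-\s*(?P<value>.+?)\s+(?P<var>var\w+);"
)


def parse_fields(note_text):
    """Map each variable name to its (integer key, text value) pair."""
    fields = {}
    for match in FIELD_PATTERN.finditer(note_text):
        fields[match.group("var")] = (int(match.group("key")), match.group("value").strip())
    return fields


# "Eye color" is a hypothetical second field, added only to show repetition.
note = "Hair color: 3 - Brown varHairColor; Eye color: 1 - Blue varEyeColor;"
fields = parse_fields(note)
print(fields)
```

Keeping the key and the text value together in one tuple lets a caller take either part, or both, without re-parsing the note.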

If the parser is set up to extract only the integer key from each enumeration string (as is the case with the initial versions of the Python code included here), the resulting output would look like this:

| subject | hairColor |
|---------|-----------|
| 1       | 3         |
| 2       | 2         |
| 3       | 1         |
| 4       | 3         |
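A table of this shape could be produced along the following lines. This is a hedged sketch, not the repository's parser: the text values `Blond` and `Black`, the `var` prefix convention, and the rule that turns `varHairColor` into the column name `hairColor` are assumptions chosen to reproduce the output above.

```python
import re

# Captures the integer key and the variable-name suffix after the assumed "var" prefix.
KEY_PATTERN = re.compile(r":\s*(\d+)\s*-\s*.+?\s+var(\w+);")


def column_name(var_suffix):
    """Turn a suffix such as 'HairColor' into the column name 'hairColor' (assumed convention)."""
    return var_suffix[0].lower() + var_suffix[1:]


def extract_integer_row(subject, note_text):
    """Build one output row holding only the integer key of each field."""
    row = {"subject": subject}
    for key, suffix in KEY_PATTERN.findall(note_text):
        row[column_name(suffix)] = int(key)
    return row


# Hypothetical notes keyed by subject; the text values for keys 1 and 2 are invented.
notes = {
    1: "Hair color: 3 - Brown varHairColor;",
    2: "Hair color: 2 - Blond varHairColor;",
    3: "Hair color: 1 - Black varHairColor;",
    4: "Hair color: 3 - Brown varHairColor;",
}

rows = [extract_integer_row(subject, text) for subject, text in notes.items()]
for row in rows:
    print(row["subject"], row["hairColor"])
```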

...
