diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
new file mode 100644
index 0000000..6bcf50f
--- /dev/null
+++ b/.github/pull_request_template.md
@@ -0,0 +1,4 @@
+
+### Checklist
+- [ ] Consider if documentation (like in `docs/`) needs to be updated
+- [ ] Consider if tests should be added
diff --git a/.github/workflows/pages.yaml b/.github/workflows/pages.yaml
new file mode 100644
index 0000000..c37311b
--- /dev/null
+++ b/.github/workflows/pages.yaml
@@ -0,0 +1,24 @@
+name: Update Cumulus docs
+on:
+  push:
+    branches: ["main"]
+    paths: ["docs/**"]
+
+jobs:
+  update-docs:
+    name: Update Cumulus docs
+    runs-on: ubuntu-latest
+    steps:
+      - name: Send workflow dispatch
+        uses: actions/github-script@v7
+        with:
+          # This token is set to expire in May 2024.
+          # You can make a new one with write access to Actions on the cumulus repo.
+          github-token: ${{ secrets.CUMULUS_DOC_TOKEN }}
+          script: |
+            await github.rest.actions.createWorkflowDispatch({
+              owner: 'smart-on-fhir',
+              repo: 'cumulus',
+              ref: 'main',
+              workflow_id: 'pages.yaml',
+            })
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 2d81844..2d9cc34 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,5 +1,42 @@
+# Contributing to Chart Review
+
+First off, thank you!
+Read on below for tips on getting involved with the project.
+
+## Talk to Us
+
+If something annoys you, it probably annoys other folks too.
+Don't be afraid to suggest changes or improvements!
+
+Not every suggestion will align with project goals,
+but even if not, it can help to talk it out.
+
+Look at [open issues](https://github.com/smart-on-fhir/chart-review/issues),
+and if you don't see your concern,
+[file a new issue](https://github.com/smart-on-fhir/chart-review/issues/new)!
+
+## Set up your dev environment
+
+To use the same dev environment as us, you'll want to run these commands:
+```sh
+pip install .[dev]
+pre-commit install
+```
+
+This will install dependencies & build tools,
+as well as set up a `black` auto-formatter commit hook.
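+
+If you ever want to run the hooks against every file
+(not just the files in a commit), this standard `pre-commit` invocation works:
+```sh
+# Run all configured hooks (including black) across the whole repo:
+pre-commit run --all-files
+```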
+
 ## Vocabulary
 
+Here is a quick introduction to some terminology you'll see in the source code.
+
 ### Labels
 - **Label**: a tag that can be applied to a word, like "Fever" or "Ideation".
 These are often applied by humans during a chart review in Label Studio,
diff --git a/README.md b/README.md
index 3dd58b9..224a0a3 100644
--- a/README.md
+++ b/README.md
@@ -1,165 +1,46 @@
-# chart-review
-Measure agreement between two "_reviewers_" from the "_confusion matrix_"
+# Chart Review
+
+**Measure agreement between chart annotations.**
+
+Whether your chart annotations come from humans, machine learning, or coded data like ICD-10,
+`chart-review` can compare them to reveal interesting statistics like:
 
 **Accuracy**
 * F1-score ([agreement](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1090460/))
 * [Sensitivity and Specificity](https://en.wikipedia.org/wiki/Sensitivity_and_specificity)
-* [Positive (PPV) or Negative Predictive Value (NPV)](https://en.wikipedia.org/wiki/Positive_and_negative_predictive_values#Relationship))
-* False Negative Rate (FNR)
+* [Positive (PPV) or Negative Predictive Value (NPV)](https://en.wikipedia.org/wiki/Positive_and_negative_predictive_values#Relationship)
+* False Negative Rate (FNR)
 
-**Confusion Matrix**
-* TP = True Positive (type I error)
-* TN = True Negative (type II error)
-* FP = False Positive
-* FN = False Negative
+**Confusion Matrix**
+* TP = True Positive
+* TN = True Negative
+* FP = False Positive (type I error)
+* FN = False Negative (type II error)
 
-**Power Calculations** for sample size estimation
-* Power = 1 - FNR
-* FNR = FN / (FN + TP)
-
-
----
-**CHART-REVIEW** here is defined as "reading" and "annotating" (highlighting) medical notes to measure accuracy of a measurement.
-Measurements can establish the reliability of ICD10, or the reliable utility of NLP to automate labor intensive process.
-
-Agreement among 2+ human subject matter expert reviewers is considered the defacto gold-standard for ground-truth labeling, but cannot be done manually at scale.
-
-The most common chart-review measures agreement of the _**class_label**_ from a careful list of notes
-* 1 human reviewer _vs_ ICD10 codes
-* 1 human reviewer _vs_ NLP results
-* 2 human reviewers _vs_ each other
-
----
-### How to Install
-1. Clone this repo.
-2. Install it locally like so: `pipx install .`
-
-`chart-review` is not yet released on PyPI.
-
----
-### How to Run
-
-#### Set Up Project Folder
-
-Chart Review operates on a project folder that holds your config & data.
-1. Make a new folder.
-2. Export your Label Studio annotations and put that in the folder as `labelstudio-export.json`.
-3. Add a `config.yaml` file (or `config.json`) that looks something like this (read more on this format below):
-
-```yaml
-labels:
-  - cough
-  - fever
-
-annotators:
-  jane: 2
-  john: 6
-  jack: 8
-
-ranges:
-  jane: 242-250  # inclusive
-  john: [260-271, 277]
-  jack: [jane, john]
-```
-
-#### Run
-
-Call `chart-review` with the sub-command you want and its arguments:
-
-For Jane as truth for Jack's annotations:
-```shell
-chart-review accuracy jane jack
-```
-
-For Jack as truth for John's annotations:
-```shell
-chart-review accuracy jack john
-```
-
-Pass `--help` to see more options.
-
----
-### Config File Format
-
-`config.yaml` defines study specific variables.
-
- * Class labels: `labels: ['cough', 'fever']`
 * Annotators: `annotators: {'jane': 3, 'john': 8}`
 * Note ranges: `ranges: {'jane': 40-50, 'john': [2, 3, 4, 5]}`
-
-`annotators` maps a name to a Label Studio User ID
-* human subject matter expert _like_ `jane`
-* computer method _like_ `nlp`
-* coded data sources _like_ `icd10`
-
-`ranges` maps a selection of Note IDs from the corpus
-* `corpus: start:end`
-* `annotator1_vs_2: [list, of, notes]`
-* `annotator2_vs_3: corpus`
-
-#### External Annotations
-
-You may have annotations from NLP or coded FHIR data that you want to compare against.
-Easy!
-
-Set up your config to point at a CSV file in your project folder that holds two columns:
-- DocRef ID (real or anonymous)
-- Label
-
-```yaml
-annotators:
-  human: 1
-  external_nlp:
-    filename: my_nlp.csv
-```
-
-When `chart-review` runs, it will inject the external annotations and match up the DocRef IDs
-to Label Studio notes based on metadata in your Label Studio export.
-
----
-**BASE COHORT METHODS**
-
-`cohort.py`
-* from chart_review import _labelstudio_, _mentions_, _agree_
-
-class **Cohort** defines the base class to analyze study cohorts.
- * init(`config.py`)
-
-`simplify.py`
-* **rollup**(...) : return _LabelStudioExport_ with 1 "rollup" annotation replacing individual mentions
-
-`term_freq.py` (methods are rarely used currently)
-* overlaps(...) : test if two mentions overlap (True/False)
-* calc_term_freq(...) : term frequency of highlighted mention text
-* calc_term_label_confusion : report of exact mentions with 2+ class_labels
-
-`agree.py` get confusion matrix comparing annotators {truth, annotator}
-* **confusion_matrix** (truth, annotator, ...) returns List[TruePos, TrueNeg, FalsePos, FalseNeg]
-* **score_matrix** (matrix) returns dict with keys {F1, Sens, Spec, PPV, NPV, TP,FP,TN,FN}
-
-`labelstudio.py` handles LabelStudio JSON
-
-Class **LabelStudioExport**
-* init(`labelstudio-export.json`)
-
-Class **LabelStudioNote**
-* init(...)
-
-`publish.py` tables and figures for PubMed manuscripts
-* table_csv(...)
-* table_json(...)
-
----
-**NICE TO HAVES LATER**
-
-* **_confusion matrix_** type support using Pandas
-* **score_matrix** would be nicer to use a Pandas strongly typed class
-
----
-### Set up your dev environment
-
-To use the same dev environment as us, you'll want to run these commands:
-```sh
-pip install .[dev]
-pre-commit install
-```
+## Documentation
+
+For guides on installing & using Chart Review,
+[read our documentation](https://docs.smarthealthit.org/cumulus/chart-review/).
+
+## Example
+
+```shell
+$ ls
+config.yaml  labelstudio-export.json
+
+$ chart-review accuracy jane john
+accuracy-jane-john:
+F1     Sens  Spec  PPV  NPV  TP  FN  TN  FP  Label
+0.889  0.8   1.0   1.0  0.5  4   1   1   0   *
+1.0    1.0   1.0   1.0  1.0  1   0   1   0   Cough
+0      0     0     0    0    2   0   0   0   Fatigue
+0      0     0     0    0    1   1   0   0   Headache
+```
+
+## Contributing
+
+We love 💖 contributions!
+
+If you have a good suggestion 💡 or found a bug 🐛,
+[read our brief contributors guide](CONTRIBUTING.md)
+for pointers to filing issues and what to expect.
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..9f99e5a
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,6 @@
+# Chart Review Documentation
+
+These documents are meant to be built as one part of the larger body of
+[Cumulus documentation](https://docs.smarthealthit.org/cumulus).
+
+To test changes here locally, read more at the [Cumulus docs repo](https://github.com/smart-on-fhir/cumulus).
diff --git a/docs/accuracy.md b/docs/accuracy.md
new file mode 100644
index 0000000..0476744
--- /dev/null
+++ b/docs/accuracy.md
@@ -0,0 +1,52 @@
+---
+title: Accuracy Command
+parent: Chart Review
+nav_order: 5
+# audience: lightly technical folks
+# type: how-to
+---
+
+# The Accuracy Command
+
+The `accuracy` command will print agreement statistics like F1 scores and confusion matrices
+for every label in your project, between two annotators.
+
+Provide two annotator names (the first name will be considered the ground truth) and
+your accuracy scores will be printed to the console.
+
+## Example
+
+```shell
+$ chart-review accuracy jane john
+accuracy-jane-john:
+F1     Sens   Spec   PPV    NPV    TP  FN  TN  FP  Label
+0.929  0.958  0.908  0.901  0.961  91  4   99  10  *
+0.895  0.895  0.938  0.895  0.938  17  2   30  2   cough
+0.815  0.917  0.897  0.733  0.972  11  1   35  4   fever
+0.959  1.0    0.812  0.921  1.0    35  0   13  3   headache
+0.966  0.966  0.955  0.966  0.955  28  1   21  1   stuffy-nose
+```
+
+## Options
+
+### `--config=PATH`
+
+Use this to point to a secondary (non-default) config file.
+Useful if you have multiple label setups (e.g. one grouped into a binary label and one not).
+
+### `--project-dir=DIR`
+
+Use this to run `chart-review` outside of your project dir.
+Config files, external annotations, etc. will be looked for in that directory.
+
+### `--save`
+
+Use this to write a JSON and CSV file to the project directory,
+rather than printing to the console.
+Useful for passing results around in a machine-parsable format.
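+
+For example, this writes the scores as JSON and CSV files into your project directory
+instead of printing them (the exact output file names may vary):
+```shell
+chart-review accuracy --save jane john
+```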
diff --git a/docs/config.md b/docs/config.md
new file mode 100644
index 0000000..35c9d95
--- /dev/null
+++ b/docs/config.md
@@ -0,0 +1,192 @@
+---
+title: Configuration
+parent: Chart Review
+nav_order: 3
+# audience: lightly technical folks
+# type: reference
+---
+
+# Configuration
+
+## File Format
+
+You can write your config file in either
+[JSON](https://en.wikipedia.org/wiki/JSON)
+or [YAML](https://en.wikipedia.org/wiki/YAML),
+whichever you're more comfortable with.
+
+By default, Chart Review will look for either `config.json` or `config.yaml`
+in your project directory and use whichever it finds.
+
+For the remainder of this document, examples will be shown in YAML.
+
+## Alternative Configs
+
+You may want to experiment with different label setups for your project.
+That's easy.
+
+Just provide `--config=./path/to/config.yaml` and your
+secondary config will be used instead of the default config.
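+
+For example, if a second config file in your project directory groups your labels
+into a single binary label (say, a hypothetical `binary-config.yaml`),
+you could score against it like so:
+```shell
+chart-review accuracy --config=./binary-config.yaml jane john
+```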
+
+## Required Fields
+
+The only truly required field is `annotators`,
+which provides a mapping from names to Label Studio ID values.
+
+Every other field has some reasonable default.
+
+## Field Definitions
+
+### `annotators`
+
+This is a required mapping of human-readable names to Label Studio IDs.
+
+#### Example
+
+Here, Alice has user ID 3 in Label Studio and Bob has user ID 2.
+
+```yaml
+annotators:
+  alice: 3
+  bob: 2
+```
+
+#### External Annotators
+
+{: .note }
+This feature requires you to upload notes
+to Label Studio using Cumulus ETL's `upload-notes` command.
+That way the document IDs get stored correctly as Label Studio metadata.
+
+Sometimes you are working with externally-derived annotations,
+for example from NLP or ICD10 codes.
+
+That's easy to integrate!
+Just make a CSV file with two columns:
+the first holds an identifier for the document and the second holds the label.
+
+- The document identifier can be an Encounter or DocumentReference ID
+  (either the original ID or the anonymized version that Cumulus ETL creates).
+- The label should be the same kind of label you define in your config.
+- An ID can appear multiple times with different labels. All the labels will apply to that note.
+- If there are no labels for a given ID, include a line for that ID but with an empty label field.
+  That way, Chart Review will know to include that ID in its math, but with no labels.
+
+##### Example CSV
+```csv
+encounter_id,label
+abcd123,Cough
+abcd123,Fever
+efgh456,
+ijkl789,Cough
+```
+
+##### Example Config
+```yaml
+annotators:
+  icd10:
+    filename: icd10.csv
+```
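+
+Once configured, an external annotator should work like any other annotator name.
+For example, to treat the ICD10 codes as ground truth for Alice's annotations
+(using the names defined above):
+```shell
+chart-review accuracy icd10 alice
+```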
+
+### `grouped-labels`
+
+This lets you bundle certain labels together into a smaller set.
+For example, you may have many labels for specific heart conditions, but
+are ultimately only interested in the binary determination of whether a patient is affected at all.
+
+This grouping happens after implied labels are expanded and before any scoring is done.
+
+The new group labels do not need to be a part of your source `labels` list.
+
+#### Example
+```yaml
+grouped-labels:
+  ill: [insomnia, chickenpox, ebola]
+```
+
+### `ignore`
+
+This lets you totally exclude some notes from annotation scoring.
+
+Sometimes notes were included in the chart review but are determined to be invalid for the
+purposes of the current study.
+If put in this ignore list, they won't affect the score.
+
+You can use either the Label Studio note ID directly,
+an Encounter ID (original or anonymized),
+or a DocumentReference ID (original or anonymized).
+
+#### Example
+```yaml
+ignore:
+  - abcd123
+  - 42
+```
+
+### `implied-labels`
+
+This lets you expand certain labels to a fuller set of implied labels.
+For example, you may have specific labels like `heart-attack`
+that also imply the `heart-condition` label.
+
+This expansion happens before labels are grouped and before any scoring is done.
+
+#### Example
+
+```yaml
+implied-labels:
+  cat: [animal, has-tail]
+  lion: cat
+```
+
+### `labels`
+
+This lets you restrict scoring to just this specific set of labels.
+
+Sometimes your source annotations have extra labels that aren't a part of your current analysis.
+If a label isn't in this list, it will not be scored.
+
+If this is not defined, all found labels will be used and scored.
+
+#### Example
+
+```yaml
+labels:
+  - animal
+  - cat
+  - has-tail
+  - lion
+```
+
+### `ranges`
+
+This is a mapping of note ranges for each annotator.
+By default, note ranges are automatically detected by looking at the Label Studio export.
+But it may be useful to manually define the note range in unusual cases.
+
+- You can provide a list of Label Studio note IDs.
+- You can reference other defined ranges.
+- You can specify a range of IDs with a hyphen.
+
+#### Example
+
+```yaml
+ranges:
+  alice: 13-54
+  bob: [5, 7, 14]
+  cathy: [alice, bob]
+```
\ No newline at end of file
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..be9a795
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,60 @@
+---
+title: Chart Review
+has_children: true
+# audience: non-programmers new to this project
+# type: explanation
+---
+
+# Chart Review
+
+**Measure agreement between chart annotations.**
+
+Whether your chart annotations come from humans, machine learning, or coded data like ICD-10,
+Chart Review can compare them to reveal interesting statistics like:
+
+**Accuracy**
+* F1-score ([agreement](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1090460/))
+* [Sensitivity and Specificity](https://en.wikipedia.org/wiki/Sensitivity_and_specificity)
+* [Positive (PPV) or Negative Predictive Value (NPV)](https://en.wikipedia.org/wiki/Positive_and_negative_predictive_values#Relationship)
+* False Negative Rate (FNR)
+
+**Confusion Matrix**
+* TP = True Positive
+* TN = True Negative
+* FP = False Positive (type I error)
+* FN = False Negative (type II error)
+
+**Power Calculations** for sample size estimation
+* Power = 1 - FNR
+* FNR = FN / (FN + TP)
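+
+For example, if a report shows FN = 1 and TP = 4 for some label,
+then FNR = 1 / (1 + 4) = 0.2 and Power = 1 - 0.2 = 0.8.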
+
+## Is This Part of Cumulus?
+
+Chart Review is developed by the same team
+and is designed to work with the
+[Cumulus project](https://docs.smarthealthit.org/cumulus/),
+but Chart Review is useful even outside of Cumulus.
+
+Some features (notably those dealing with external annotations)
+require Label Studio metadata that Cumulus ETL creates when it pushes notes
+to Label Studio using its `upload-notes` feature.
+
+But calculating accuracy between human annotators can be done entirely without the use of Cumulus.
+
+## Installing & Using
+
+```shell
+pip install chart-review
+chart-review --help
+```
+
+Read the [first-time setup docs](setup.md) for more.
+
+## Source Code
+
+Chart Review is open source.
+If you'd like to browse its code or contribute changes yourself,
+the code is on [GitHub](https://github.com/smart-on-fhir/chart-review).
diff --git a/docs/setup.md b/docs/setup.md
new file mode 100644
index 0000000..b3a2136
--- /dev/null
+++ b/docs/setup.md
@@ -0,0 +1,34 @@
+---
+title: Setup
+parent: Chart Review
+nav_order: 1
+# audience: lightly technical folks
+# type: how-to
+---
+
+# Setting Up Chart Review
+
+## Installing
+
+`pip install chart-review`
+
+## Make Project Directory
+
+1. Make a new directory to hold your project files.
+2. Export your Label Studio project and put the resulting JSON file
+   in this directory with the name `labelstudio-export.json`.
+3. Create a [config file](config.md) in this directory.
+
+## Run Chart Review
+
+The only current command is `accuracy`,
+which will print agreement statistics between two annotators.
+
+Read more about it in its own [accuracy command documentation](accuracy.md).
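+
+Putting it all together, a first run might look like this
+(the directory and annotator names are illustrative):
+```shell
+cd my-project
+chart-review accuracy jane john
+```
\ No newline at end of file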