Add a CLI, add some light unit testing, and change config from Python to Yaml/Json #2
Also deletes a duplicate method and refreshes pyproject.toml
CLI:
- New `chart-review` script gets installed along with the Python module.
- One sub-command right now: `accuracy`, which calculates accuracy matrices across labels for two reviewers and a base third.

Config:
- Switch away from Python config files and towards yaml/json files.
- I've added yaml versions of the two studies in the repo, as examples.
This will make it easier for someone who just wants to call the Python code directly, if they want to.
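For readers who want a picture of the CLI shape described above, here is a minimal sketch using `argparse`. Only the `chart-review` program name and the `accuracy` sub-command come from this PR; the positional arguments (`base`, `reviewer_a`, `reviewer_b`) and the `--project-dir` flag are hypothetical names chosen for illustration, not necessarily what the PR implements.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single `accuracy` sub-command."""
    parser = argparse.ArgumentParser(prog="chart-review")
    subparsers = parser.add_subparsers(dest="command", required=True)

    accuracy = subparsers.add_parser(
        "accuracy", help="compare two reviewers against a base reviewer"
    )
    # Hypothetical arguments: the PR text does not spell these out.
    accuracy.add_argument("base", help="annotator treated as the baseline")
    accuracy.add_argument("reviewer_a", help="first reviewer to compare")
    accuracy.add_argument("reviewer_b", help="second reviewer to compare")
    accuracy.add_argument(
        "--project-dir", default=".", help="folder holding config and data"
    )
    return parser


def main(argv=None) -> int:
    """Parse arguments and dispatch to the chosen sub-command."""
    args = build_parser().parse_args(argv)
    if args.command == "accuracy":
        # A real implementation would load the project and compute matrices here.
        print(
            f"accuracy: base={args.base} "
            f"reviewers={args.reviewer_a},{args.reviewer_b} "
            f"dir={args.project_dir}"
        )
    return 0
```

Under these assumptions, `main(["accuracy", "alice", "bob", "carol"])` would parse the three names and use the current folder as the project folder. Keeping the parser in its own function is one way to make the "light unit testing" easy, since tests can call `build_parser()` without touching `sys.argv`.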
@@ -0,0 +1,41 @@
"""Methods for high-level accuracy calculations."""
This file is basically a generic version of the calculation in paper.py
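As a rough illustration of what a generic per-label accuracy calculation could look like, here is a sketch that compares one reviewer against a base annotator and counts true/false positives and negatives for each label. This is one plausible reading, not the code from `paper.py` or from this PR: the function name `confusion_matrix`, the input shape (note ID mapped to a set of labels), and the label names are all assumptions.

```python
def confusion_matrix(truth, annotator, labels):
    """Count TP/FP/FN/TN per label for one annotator against a baseline.

    `truth` and `annotator` are hypothetical inputs: dicts mapping a note ID
    to the set of labels that person applied to that note.
    """
    matrix = {}
    # Only notes that both people reviewed can be compared.
    shared_notes = truth.keys() & annotator.keys()
    for label in labels:
        counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
        for note_id in shared_notes:
            in_truth = label in truth[note_id]
            in_annotator = label in annotator[note_id]
            if in_truth and in_annotator:
                counts["TP"] += 1
            elif in_annotator:
                counts["FP"] += 1
            elif in_truth:
                counts["FN"] += 1
            else:
                counts["TN"] += 1
        matrix[label] = counts
    return matrix


# Hypothetical data: three notes, two labels.
base = {1: {"Cough"}, 2: set(), 3: {"Fever"}}
reviewer = {1: {"Cough"}, 2: {"Cough"}, 3: set()}
result = confusion_matrix(base, reviewer, ["Cough", "Fever"])
```

For the two-reviewers-plus-base case, the same function could be called once per reviewer with the same `base`, giving one matrix per reviewer to compare side by side.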
Chart Review operates on a project folder that holds your config & data.
1. Make a new folder.
2. Export your Label Studio annotations and put that in the folder as `labelstudio-export.json`.
3. Add a `config.yaml` file (or `config.json`) that looks something like this (read more on this format below):
I like your idea and I strongly prefer `config.json` over yaml
Ah fair - but note:
- Json is technically a subset of yaml. (That is, a yaml parser can also read json.)
- So what I've done here is use a yaml parser and look for both `config.yaml` and `config.json` -- it will read either one.
- The reason I personally prefer yaml for config files is that you can have comments, which are often very useful for explaining why a config is the way it is (and also json can be annoyingly fussy about stuff like trailing commas, but that's less important than the comments thing).

So the way I made this PR, either yaml or json works - whichever the researcher in question is more comfy with.

How do you feel about that? (Or do you feel like standardizing on a specific syntax is worth disallowing yaml?)
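To make the "one parser, either file" idea concrete, here is a minimal sketch of how such a lookup could work. It assumes PyYAML (`yaml.safe_load`) as the yaml parser; the function name `load_config`, the lookup order, and the fallback to the standard `json` module when PyYAML is not installed are my own assumptions, not necessarily what the PR does.

```python
import json
from pathlib import Path

try:
    import yaml  # PyYAML, a third-party package (assumed here)
except ImportError:
    yaml = None

# Assumed lookup order: the yaml file wins if both exist.
CONFIG_NAMES = ("config.yaml", "config.json")


def load_config(project_dir):
    """Return the parsed config from the first config file found."""
    for name in CONFIG_NAMES:
        path = Path(project_dir) / name
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf8")
        if yaml is not None:
            # A yaml parser reads JSON documents too, so one call covers both.
            return yaml.safe_load(text) or {}
        if path.suffix == ".json":
            # Fallback added for this sketch: plain JSON needs no extra package.
            return json.loads(text)
        raise RuntimeError(f"PyYAML is required to read {path}")
    raise FileNotFoundError(f"No config.yaml or config.json in {project_dir}")
```

With this shape, a researcher who prefers JSON just drops a `config.json` in the project folder, and one who wants comments uses `config.yaml`; the calling code does not need to know which one was found.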
OK on my first pass at interacting with the chart-review code, I wrote down some possible improvements here: #1

This is the PR to solve them. Specifically:
- Adds a CLI with one sub-command, `accuracy`, which does the 3-way accuracy calculation from `paper.py` in the covid folder. 🤷 It seemed like a reasonable place to start, but we should talk about what top-level operations make sense.
- Switches config from Python files to yaml; `config.json` will work too.

I've split this PR up into different commits, which should hopefully make it easier to read. But there's still a fair bit.