With learning rebase #52

scottwittenburg · 2021-01-13T04:21:22Z

Rebase the original work adding learning to the application, and fix a couple of console errors.

active learning with girder worker
Show good_prob row
Add active learning persisted state

scottwittenburg · 2021-01-13T04:31:36Z

@dzenanz This is more or less working for me, but I haven't really been able to test it since I don't know where to find a dataset with the new fields. Even if you import a file in the original csv format, I think there are going to be issues when it goes looking for the data in csv format in Girder, as I have not yet converted everything in the PR to work with JSON. In fact, I'm not sure that's even the right way to go for what it's doing. But I think the next step would be to put together a dataset in the format it expects and see where we start hitting errors. Once we see what's going on, we can decide if converting all internal expectations to JSON is what we want to do.

Looks like it wants two new fields in the data (associated with each scan): IQMs (if it's csv) or iqms (if it's json), and good_prob. The latter is simply a floating point value; the former is a semicolon-separated string which corresponds to a list of key/value pairs. The key and value in each pair are separated by a colon. Furthermore, some special processing is done if the key has underscores in it.

But you can look at that expectation yourself if you check out the parseIQM method in server/miqa_server/session.py.
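As a rough illustration of the string format described above, here is a minimal sketch. It is not the parseIQM code from server/miqa_server/session.py: the function name `parse_iqm_string` and the sample metric names are hypothetical, values are left as strings, and the special handling of keys containing underscores is left out because its details are not described in this thread.

```python
def parse_iqm_string(iqms):
    """Split 'key:value;key:value' into a list of (key, value) string pairs.

    Hypothetical helper: keys with underscores are kept verbatim here,
    whereas the real parser reportedly treats them specially.
    """
    pairs = []
    for chunk in iqms.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate empty segments such as a trailing semicolon
        key, _, value = chunk.partition(":")
        pairs.append((key.strip(), value.strip()))
    return pairs


# Illustrative input only; the metric names are made up for this example.
example = "snr:12.5;cnr:3.1;fwhm_avg:2.9"
print(parse_iqm_string(example))
```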

You can feel free to push more commits on this branch, or pull it to your fork and work there, up to you.

scottwittenburg · 2021-01-13T16:58:25Z

@dzenanz As I'm working more with this, I see there are still issues I can fix without a dataset containing the image quality metrics. I'm working on that now, and will push another commit when I sort out the issues.

scottwittenburg · 2021-01-13T17:01:18Z

The refactoring of some of the session.py methods to a utility module was necessary because they are used from the learning.py module.

scottwittenburg · 2021-01-13T19:54:32Z

All of the learning modules seem to assume the data format is csv. I'm considering writing the inverse of my converter so we can easily go back and forth between formats. It shouldn't take much time to write, and the bi-directional conversion would be easily tested. I think it's either that, or maybe we just abandon the JSON format; perhaps it's just not that useful.

Thoughts @dzenanz @curtislisle?

dzenanz · 2021-01-13T20:13:36Z

I don't have a strong preference. Do what you think is easier or better 😄 Or wait for Curt's opinion.

scottwittenburg · 2021-01-15T01:20:31Z

@dzenanz @curtislisle

I opted to write a converter from json back to csv. It seems to be working to pull the json representation out of girder, convert it to csv, and pass this into the learning component.
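To show why a bi-directional conversion is easy to test, here is a minimal round-trip sketch. It is not the converter from this PR: the `scans` list, the `id` and `scan_id` names, and the overall layout are assumptions made for illustration. Only the `iqms`/`IQMs` and `good_prob` field names come from this thread.

```python
import csv
import io
import json

# Hypothetical column layout; only IQMs and good_prob are named in the thread.
CSV_FIELDS = ["scan_id", "IQMs", "good_prob"]


def json_to_csv(json_text):
    """Convert a JSON document holding a 'scans' list into CSV text."""
    scans = json.loads(json_text)["scans"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=CSV_FIELDS, lineterminator="\n")
    writer.writeheader()
    for scan in scans:
        writer.writerow({
            "scan_id": scan["id"],
            "IQMs": scan["iqms"],
            "good_prob": scan["good_prob"],
        })
    return out.getvalue()


def csv_to_json(csv_text):
    """Inverse of json_to_csv for the same hypothetical layout."""
    reader = csv.DictReader(io.StringIO(csv_text))
    scans = [
        {
            "id": row["scan_id"],
            "iqms": row["IQMs"],
            "good_prob": float(row["good_prob"]),
        }
        for row in reader
    ]
    return json.dumps({"scans": scans}, sort_keys=True)


# Round trip: converting to CSV and back should reproduce the input.
original = json.dumps(
    {"scans": [{"id": "s1", "iqms": "snr:12.5;cnr:3.1", "good_prob": 0.75}]},
    sort_keys=True,
)
assert csv_to_json(json_to_csv(original)) == original
```

A round-trip assertion like the last line is the kind of check that makes the two directions cheap to verify against each other.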

I learned more about how this workflow runs as I worked through issues today. I tried to encapsulate the key steps to reproduce it in the development.md at the root of the repo (latest commit). It's still not working as we had hoped unfortunately. The mriqc module is used to do the active learning, and there's documentation in various places in there saying you need to point to the MIQA_MRIQC_PATH, which it describes as needing to contain "all important directories like training_data.csv, model_weights, log files". Your guess as to what form those things are supposed to take is likely better than mine. If either of you know how to satisfy those requirements, let me know, maybe we can get it running.

Also, every time I click the "RETRAIN" button in the application, I see the following errors in the celery output:

Traceback (most recent call last):
  File "/Users/scott/projects/miqa/miqa-venv/lib/python3.7/site-packages/celery/app/trace.py", line 412, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/Users/scott/projects/miqa/miqa-venv/lib/python3.7/site-packages/girder_worker/task.py", line 173, in __call__
    self._maybe_cleanup(hook)
  File "/Users/scott/projects/miqa/miqa-venv/lib/python3.7/site-packages/girder_worker/task.py", line 146, in _maybe_cleanup
    arg.cleanup(**kwargs)
  File "/Users/scott/projects/miqa/miqa-venv/lib/python3.7/site-packages/girder_worker_utils/transforms/girder_io.py", line 161, in cleanup
    if os.path.isdir(self.output_file_path):
AttributeError: 'GirderUploadToItem' object has no attribute 'output_file_path'

That seems like it may be caused by having an older girder_worker_utils in my virtual environment, as newer code has a guard in that method. The other error I'm currently hitting is:

FileNotFoundError: [Errno 2] No such file or directory: '/Users/scott/miqa/mriqc_master_folder/training_data.csv'

... which is obviously because I don't have all the important files in the mriqc master directory.
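For the AttributeError above, a guard of the kind mentioned might look like the sketch below. The class is a hypothetical stand-in, not code from girder_worker_utils, and the guard in newer releases may well be written differently. The point is only that `cleanup` reads the attribute defensively instead of assuming it was set.

```python
import os
import shutil


class UploadTransform:
    """Hypothetical stand-in for a result transform (not girder_worker_utils code)."""

    def cleanup(self):
        # The attribute is only set once the transform has produced output,
        # so fall back to None instead of raising AttributeError.
        path = getattr(self, "output_file_path", None)
        if path is None:
            return False  # nothing was written, nothing to clean up
        if os.path.isdir(path):
            shutil.rmtree(path)
        elif os.path.isfile(path):
            os.remove(path)
        return True
```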

Add a note indicating that the slash is needed at the end of the mriqc path (we should fix this, but it works this way for now). Also, I added the --no-sub argument when running mriqc so that we avoid submission of the computed metrics back to mriqc.nimh.nih.gov, at least while we are doing a lot of testing of our workflow.
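One possible way to remove the trailing-slash requirement is sketched below. It assumes the slash is currently needed because file names are appended to the path by plain string concatenation, which this thread does not confirm. The helper name `mriqc_file` is hypothetical.

```python
import os


def mriqc_file(master_path, name):
    """Build a path inside the mriqc master folder.

    os.path.join inserts a separator only when one is missing, so the
    result is the same with or without a trailing slash on master_path.
    """
    return os.path.normpath(os.path.join(master_path, name))


with_slash = mriqc_file("/Users/scott/miqa/mriqc_master_folder/", "training_data.csv")
without_slash = mriqc_file("/Users/scott/miqa/mriqc_master_folder", "training_data.csv")
assert with_slash == without_slash
```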

The colon (:) used as a time separator (e.g. 16:10:20) is not allowed in file names on Windows.
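A minimal sketch of a timestamped file name that avoids colons follows. The helper name, the `random_forest` prefix, and the `.pkl` extension are assumptions for illustration, not the names used in this PR.

```python
from datetime import datetime


def timestamped_name(prefix, when, ext=".pkl"):
    """Return a file name with a timestamp that is valid on Windows.

    Hyphens replace the colons that a format like %H:%M:%S would produce.
    """
    stamp = when.strftime("%Y-%m-%d_%H-%M-%S")
    return f"{prefix}_{stamp}{ext}"


print(timestamped_name("random_forest", datetime(2021, 1, 21, 16, 10, 20)))
# random_forest_2021-01-21_16-10-20.pkl
```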

This adds a radio group to the retrain dialog, allowing the user to select neural network or random forest classification. It also adds a placeholder for neural network learning in the tasks.py module, fixes the bug where the retrain dialog "Cancel" button does not work, and adds a note to the dev doc on running "data2mriqc.py".

WARNING/MainProcess] m:\dev\zarr\miqa\mriqc\mriqc\data_loader.py:112: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.data['good_prob'][i] = predictions[ind]
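The warning above comes from chained indexing, where a column is selected and then indexed again. A minimal sketch of the usual remedy is a single `.loc` assignment. The DataFrame and prediction values here are made up, and this is not the actual data_loader.py code.

```python
import pandas as pd

# Toy stand-ins for self.data and predictions in the warning above.
data = pd.DataFrame({"good_prob": [0.0, 0.0, 0.0]})
predictions = [0.9, 0.2, 0.6]

for i, value in enumerate(predictions):
    # Chained form that can trigger the warning: data["good_prob"][i] = value
    # A single .loc call selects the row label and column in one step.
    data.loc[i, "good_prob"] = value

print(data["good_prob"].tolist())
```

Note that `.loc` is label-based, so this matches the chained form only when `i` is a row label, as it is with the default integer index used here.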

dzenanz · 2021-02-04T23:46:07Z

NN classifier seems to be working on my computer. Can you check my work in #54?

scottwittenburg · 2021-02-05T00:41:15Z

Thanks @dzenanz, I'll try to take a look at this tomorrow. In the meantime, I'm just curious: why create a new PR?

dzenanz · 2021-02-05T02:50:12Z

I didn't want to impose on your branch. You can cherry-pick commits from my branch.

matthewma7 and others added 2 commits January 12, 2021 20:45

Add ability to parse and display MRIQC metrics

4303a12

Fixes for vuetify 2

1aa7cf4

more fixes

a00d61d

Continue moving active learning to a working state

833b1ae

scottwittenburg and others added 11 commits January 21, 2021 13:27

Fix a couple issues when running data2mriqc.py

24b0f1d

Fix a couple issues importing data with quality metrics

63bbfab

Fix issues that appear when attempting to 'retrain'

d9522be

A few last hard-to-come-by fixes to get through entire workflow

a4c3929

Use Windows-friendly file name for random forest classifier

7f1cc37

Passing learning mode into active_learner

63b2e4a

Renaming Model into ModelRF

f330535

Replacing python2 by python3 in shebang line

ab1375e

dzenanz approved these changes Feb 4, 2021

dzenanz mentioned this pull request Feb 4, 2021

Update Phase I learning code to latest master #54

Open

dzenanz mentioned this pull request May 7, 2021

Initial Issues Encountered While Setting Up miqa #111

Open
