ENH: Add method to overwrite / redo some analysis steps #311

Open
larsoner opened this issue Oct 22, 2020 · 5 comments

Comments

@larsoner
Member

larsoner commented Oct 22, 2020

At least estimation of:

  • _maxbad.txt
  • .pos
  • -annot.h5
  • -counts.h5

These steps are slow so currently they are recomputed only when they are missing. This means the way to say "recompute these" is to delete files from disk (not great). We could add an overwrite/recompute parameter to control which of these to recompute. Or maybe make it so the inputs to the computation function are cached properly (joblib?) so that recomputation is automatic when the relevant parameters change.
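The joblib idea could look roughly like the following minimal sketch. Note that `estimate_head_pos` and its parameters are hypothetical stand-ins for the real (slow) estimation steps, not actual mnefun API:

```python
# Minimal sketch (assumed, not mnefun API): cache a slow estimation step
# with joblib so it reruns automatically when the relevant inputs change.
from joblib import Memory

memory = Memory("./joblib_cache", verbose=0)

@memory.cache
def estimate_head_pos(raw_fname, t_window, dist_limit):
    # The real (slow) cHPI fitting would happen here; we just return
    # a placeholder describing the inputs used.
    return {"raw": raw_fname, "t_window": t_window, "dist_limit": dist_limit}

# First call computes and writes to the on-disk cache; an identical call
# loads the cached result; changing any argument triggers recomputation.
result = estimate_head_pos("sub01_raw.fif", t_window=0.2, dist_limit=0.005)
```

With this setup, deleting files by hand becomes unnecessary: stale results are invalidated automatically when the arguments (or the decorated function's source) change.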

@NeuroLaunch
Collaborator

NeuroLaunch commented Feb 18, 2021

I like the idea of setting up a toggle parameter to greenlight potential overwrites, as it can be very annoying to launch a run only to realize that a single change (for me, usually related to HPI processing) will cause an error.

I actually do something similar in my own code:

clear_chpi, clear_annot = True, True
if any((clear_chpi, clear_annot)):
    delete_sssfiles(params, clear_chpi, clear_annot)

where delete_sssfiles() is a small helper function stashed in my scoring script. But this just deletes the files outright, forcing the recomputation.
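For reference, a hedged sketch of what such a deletion helper might look like. The real delete_sssfiles() takes the commenter's params object, so the directory argument and glob patterns here are only illustrative (based on the file suffixes listed at the top of the issue):

```python
# Illustrative sketch, not the commenter's actual helper: delete cached
# cHPI/annotation outputs so the pipeline is forced to regenerate them.
from pathlib import Path

def delete_sssfiles(subj_dir, clear_chpi=False, clear_annot=False):
    """Delete cached outputs matching the toggled categories; return names removed."""
    patterns = []
    if clear_chpi:
        patterns += ["*.pos"]
    if clear_annot:
        patterns += ["*-annot.h5", "*-counts.h5"]
    removed = []
    for pat in patterns:
        for path in Path(subj_dir).glob(pat):
            path.unlink()
            removed.append(path.name)
    return removed
```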

@larsoner
Member Author

I think the annotation is completely dependent on the cHPI fitting, so maybe we can get away with just adding a do_chpi parameter that deletes all of the files listed above except _maxbad.txt (which hopefully we don't change very often) and then re-estimates? And for maxbad I guess we could have do_maxbad=True | False as well.

In both cases, do_whatever=False means "don't do it if it's already there" (and might automatically be run if do_sss=True and the files are not there) and do_whatever=True means "delete whatever is there and run it".
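The proposed do_whatever semantics can be sketched with a small wrapper; the helper and its arguments are hypothetical, not mnefun API:

```python
# Sketch of the proposed semantics: redo=False means "reuse the file if it
# exists, compute it otherwise"; redo=True means "delete and recompute".
import os

def maybe_compute(fname, compute_fn, *, redo=False):
    if redo and os.path.exists(fname):
        os.remove(fname)       # force recomputation of this output
    if not os.path.exists(fname):
        compute_fn(fname)      # the slow step runs only when needed
    return fname
```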

@NeuroLaunch
Collaborator

It may be worth implementing your above idea of recomputing (if the toggle is True) only when one or more of the relevant parameters has changed OR a computed file isn't present. I'm not familiar with joblib, but I imagine a simple file generated during the SSS/cHPI step that holds metadata (or hash codes) describing how the files were created, something the params dictionary could be compared against on later runs.
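One way to sketch that metadata comparison using only the standard library; the parameter names hashed here are illustrative, not actual mnefun params:

```python
# Sketch of the metadata idea: store a hash of the relevant parameters next
# to the outputs and recompute whenever the stored hash no longer matches.
import hashlib
import json
from pathlib import Path

def params_hash(params, keys):
    """Hash only the parameters that affect this step's outputs."""
    relevant = {k: params[k] for k in sorted(keys)}
    payload = json.dumps(relevant, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def needs_recompute(meta_file, params, keys):
    """Return True (and update the metadata file) if outputs are stale."""
    new_hash = params_hash(params, keys)
    path = Path(meta_file)
    if not path.exists() or path.read_text() != new_hash:
        path.write_text(new_hash)  # record how the outputs were produced
        return True
    return False
```

Changing a parameter outside the tracked key set would not trigger recomputation, which is exactly the selectivity described above.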

@NeuroLaunch
Collaborator

It would be more work, but perhaps such a metadata file could be comprehensive across all of the important MNEFun parameters, as well as archivable and human-readable, effectively providing a diary of MNEFun processing for that experiment directory. (I picture a command-line tool that could give a nicely formatted readout of the last run or a prior run.) Just a pipeline dream?

@larsoner
Member Author

It may be worth implementing your above idea of recomputing (if the toggle is True) only when one or more of the relevant parameters has changed OR if a computed file isn't present... Just a pipeline dream?

Yes, in principle this would work with something like joblib caching set up properly, but it's difficult and a lot of work to get right. For now I would just always do it if the toggle is True; it's an intermediate solution, but it's easy to implement and solves a real problem people have now, even if it doesn't do it optimally (e.g., by automatically tracking what needs to be done).
