Rerunning Apps #182

Open
pvandyken opened this issue Jul 21, 2022 · 6 comments
Labels
enhancement New feature or request

Comments

@pvandyken
Contributor

The problem

I don't know what the cleanest way of implementing this in a BidsApp is, but it seems it would be useful to have a "rerun" capability à la datalad rerun. Given that pretty much everything is already saved in the config file created in the output dir, one should be able to supply run.py with just the output dir and run everything again the same way. Any additional command-line --args would override the previous ones.

In the same vein, a utility command that extracts and prints previous run settings might be useful.

Use cases might include running an app with different settings after a long period of time while ensuring that the settings you don't explicitly change (including the input bids dir, which is not always obvious) remain the same, or generating a report, etc.
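
A minimal sketch of the override idea, assuming the settings from the previous run have already been loaded into a plain dict. The key names and the functions `apply_overrides` and `format_settings` are hypothetical illustrations, not part of Snakebids:

```python
from typing import Any, Dict, Mapping


def apply_overrides(
    saved: Mapping[str, Any], overrides: Mapping[str, Any]
) -> Dict[str, Any]:
    """Return the saved settings with explicitly supplied overrides applied.

    An override whose value is None is treated as "not supplied", so the
    value from the previous run is kept.
    """
    merged = dict(saved)
    for key, value in overrides.items():
        if value is not None:
            merged[key] = value
    return merged


def format_settings(settings: Mapping[str, Any]) -> str:
    """Render settings one per line, sorted by key, for printing."""
    return "\n".join(f"{key}: {settings[key]}" for key in sorted(settings))


# Hypothetical settings recovered from a previous run.
previous = {"bids_dir": "/data/bids", "analysis_level": "participant", "cores": 4}
print(format_settings(apply_overrides(previous, {"cores": 8, "bids_dir": None})))
```

Here `format_settings` stands in for the "print previous run settings" utility, and `apply_overrides` for the rule that only explicitly changed settings differ from the earlier run.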

@pvandyken added the "enhancement" (New feature or request) label Jul 21, 2022
@kaitj
Contributor

kaitj commented Jul 22, 2022

+1 - would have to think about this a little more in terms of implementation, but like the idea of not having to resupply arguments if possible - can we just pass the config in this case? Another consideration would be if you're supplying new args to override previous ones (I can see how this can be useful), can it still be considered a rerun, or should it be a new run?

One note: unless this has changed recently, for it to be considered a BIDS app, the required arguments are bids_dir, output_dir, and analysis_level (https://bids-apps.neuroimaging.io/dev_faq/). One option might be to supply those inputs together with a --rerun flag or something similar. Is it possible to force a run via Snakemake args using the existing config?

@pvandyken
Contributor Author

Re required arguments, I was hoping there would be some latitude here... To me, if you have to supply the input dir, that goes against the point of a rerun. What if I forget the input dir?

But standards are standards. We've talked before about how to incorporate other "utility" features into a snakebids app. Perhaps it's time to revisit this?

@tkkuehn
Contributor

tkkuehn commented Jul 25, 2022

I'm of two minds on this proposal. It seems a little beyond the scope of what Snakebids is supposed to do (i.e. I think this kind of use of provenance information is better suited to something like DataLad), and potentially brittle (if after a long time the input dataset is no longer in its original location Snakebids has no way of knowing where it might be). I also worry about encouraging users to run the same workflow with different settings to a pre-populated output directory. I think Snakemake will mostly replace the correct (i.e. outdated) files, but if the workflow or the settings have changed significantly, you could easily end up with an unclear mix of irrelevant old output files and relevant new output files (it is possible that there are Snakemake features that handle this kind of thing and I'm not aware of them, though).

That said, the requisite provenance information is there, and it would be nice to make it easily usable. I think something like the suggested "utility command that extracts and prints previous run settings," maybe as part of the snakebids interface, might by itself be enough to be useful without running into the API or other issues.

I think if running your app with the BIDS app API always does the expected thing, an interface like Peter proposed isn't necessarily prohibited. However, an (awkward) alternative might be to allow something like run.py - {output_dir} -.
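
To make the placeholder alternative concrete, here is a rough sketch of how `-` in the positional slots could be filled from saved settings. It assumes the previous values are available as a dict keyed by the three BIDS app positionals. The function name `fill_placeholders` and the dict layout are hypothetical:

```python
from typing import List, Mapping, Sequence

PLACEHOLDER = "-"
POSITIONALS = ("bids_dir", "output_dir", "analysis_level")


def fill_placeholders(argv: Sequence[str], saved: Mapping[str, str]) -> List[str]:
    """Replace "-" in the three positional slots with previously saved values.

    Any arguments after the three positionals are passed through unchanged.
    """
    if len(argv) < len(POSITIONALS):
        raise ValueError("expected bids_dir, output_dir and analysis_level")
    resolved = []
    for name, value in zip(POSITIONALS, argv):
        if value == PLACEHOLDER:
            if name not in saved:
                raise KeyError(f"no saved value for {name!r}")
            value = saved[name]
        resolved.append(value)
    return resolved + list(argv[len(POSITIONALS):])


# Hypothetical values recorded by an earlier run.
saved = {"bids_dir": "/data/bids", "analysis_level": "participant"}
print(fill_placeholders(["-", "/data/out", "-", "--cores", "2"], saved))
```

In a real app the saved values would have to be read from the supplied output directory before the placeholders can be resolved, which is why the output dir is the one slot that cannot be a placeholder.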

@pvandyken
Contributor Author

Just an idea for another way to approach this: we could do something like

snakebids rerun /path/to/output [arg overrides]

So in this world, we'll assume snakebids is available in the current python environment, or that it's pipx installed. It will use the local snakemake to rerun the app using the same specs found in the config file in the output. If we save pipeline and snakemake versions (as proposed in #205), we could throw errors if any versions have changed (using flags to override).

By default, this would be primarily for devs (who have snakebids installed). But we can make an API so that it can be incorporated into apps using whatever CLI is desired. So it wouldn't necessarily be baked into every bids app (and wouldn't violate any specs), but would be possible for devs and app makers who want to expose the functionality.
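
For illustration, the command line could be parsed roughly as follows. This is only a sketch: the `rerun` subcommand does not exist in Snakebids, and the `--ignore-versions` flag name is a hypothetical stand-in for "flags to override". Unrecognized arguments are collected as the overrides:

```python
import argparse
from pathlib import Path
from typing import List, Sequence, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical `snakebids rerun` subcommand."""
    parser = argparse.ArgumentParser(prog="snakebids")
    subparsers = parser.add_subparsers(dest="command", required=True)
    rerun = subparsers.add_parser(
        "rerun", help="rerun an app from the settings saved in its output dir"
    )
    rerun.add_argument("output_dir", type=Path)
    # Hypothetical flag: skip the check that versions are unchanged.
    rerun.add_argument("--ignore-versions", action="store_true")
    return parser


def parse_rerun(argv: Sequence[str]) -> Tuple[argparse.Namespace, List[str]]:
    """Split argv into known rerun options and pass-through overrides."""
    args, overrides = build_parser().parse_known_args(argv)
    return args, overrides


args, overrides = parse_rerun(["rerun", "/path/to/output", "--cores", "8"])
print(args.output_dir, overrides)
```

Using `parse_known_args` keeps the rerun command ignorant of each app's own options, so the same entry point could be reused by any app that chooses to expose it.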

@akhanf
Member

akhanf commented Dec 7, 2022

I like that approach, would fit nicely with the way I am running apps right now.

To be clear, it could be used to rerun an app on a dataset that has since had subjects added, right? (I.e., it will generate the inputs again?)

@pvandyken
Contributor Author

pvandyken commented Dec 7, 2022

Yeah, so it would do the following:

  1. Check the pipeline version and snakemake version previously used and compare with the present. Error out if they differ, unless the user specifies to ignore versions (via some flag)
  2. Re-create the CLI call the user made
  3. Apply as patches any new args the user provides. Some rational method will be needed to both add and remove previous args. The simplest would just be a --snakemake-args ... argument similar to --pip-args in pipx, and it would do a clean override.
  4. Call the reconstructed command.
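
The four steps above could be sketched as follows. This assumes a saved record with `versions`, `app_argv`, and `snakemake_args` entries. That layout, the function names, and `VersionMismatchError` are all hypothetical, since nothing in this thread fixes how the previous call is stored:

```python
import subprocess
from typing import Any, Callable, List, Mapping, Optional, Sequence


class VersionMismatchError(RuntimeError):
    """Raised when recorded tool versions differ from the current ones."""


def check_versions(
    saved: Mapping[str, str], current: Mapping[str, str], ignore: bool = False
) -> None:
    """Step 1: error out if any recorded version changed, unless ignored."""
    changed = {
        name: (old, current.get(name))
        for name, old in saved.items()
        if current.get(name) != old
    }
    if changed and not ignore:
        details = ", ".join(
            f"{name}: {old} -> {new}" for name, (old, new) in sorted(changed.items())
        )
        raise VersionMismatchError(f"versions changed since the last run ({details})")


def build_command(
    record: Mapping[str, Any], snakemake_args: Optional[Sequence[str]] = None
) -> List[str]:
    """Steps 2 and 3: recreate the saved call, then apply a clean override.

    If new snakemake args are given, they replace the saved ones entirely.
    """
    extra = record["snakemake_args"] if snakemake_args is None else snakemake_args
    return [*record["app_argv"], *extra]


def rerun(
    record: Mapping[str, Any],
    current_versions: Mapping[str, str],
    snakemake_args: Optional[Sequence[str]] = None,
    ignore_versions: bool = False,
    runner: Callable[..., Any] = subprocess.run,
) -> List[str]:
    """Run all four steps and return the command that was executed."""
    check_versions(record["versions"], current_versions, ignore=ignore_versions)
    command = build_command(record, snakemake_args)
    runner(command, check=True)  # Step 4: call the reconstructed command.
    return command
```

The `runner` parameter exists only so the final call can be swapped out in tests or dry runs. By default it is `subprocess.run`.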
