Reproducible workflow when arguments are objects #193

Open
timcdlucas opened this issue Oct 10, 2015 · 3 comments

Comments

@timcdlucas
Contributor

Related to #192

It seems reasonable that people might use objects to define arguments. However, for the workflow object to be reproducible, we would want to save the call with each object's value substituted in, as if the value had been typed directly.

k = 2

work2 <- workflow(occurrence = UKAnophelesPlumbeus,
                  covariate  = UKAir,
                  process    = BackgroundAndCrossvalid(k = k),
                  model      = LogisticRegression,
                  output     = PerformanceMeasures)


RerunWorkflow(work2)

Caught errors:
Error in 1:k: NA/NaN argument

...

===================

Call: workflow(occurrence = UKAnophelesPlumbeus, covariate = UKAir, process = BackgroundAndCrossvalid(k = k), model = LogisticRegression, output = PerformanceMeasures, forceReproducible = FALSE) 

We would want the call to be saved as k=2.
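A minimal sketch of why the saved call breaks, assuming the call is stored unevaluated (for example via match.call(); whether workflow does exactly this is an assumption). record_call is a hypothetical stand-in for workflow, and seq_len stands in for the process module so the example runs without the package:

```r
# Hypothetical stand-in for workflow(): stores the call without evaluating it.
record_call <- function(process) match.call()

k <- 2
saved <- record_call(process = seq_len(k))
print(saved[["process"]])   # the stored argument still refers to the symbol k

# Once k is gone from the calling environment, the stored argument
# can no longer be re-evaluated.
rm(k)
result <- tryCatch(eval(saved[["process"]]),
                   error = function(e) conditionMessage(e))
print(result)               # an error message rather than a sequence
```

The stored call only keeps the name k, so rerunning depends on whatever k happens to be (or not be) at rerun time.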

@goldingn
Member

Seems sensible for this use case. We could handle this by checking whether each argument is an object in the calling environment and, if so, writing its dput() output into the saved call.

That approach would be awful for large objects (e.g. rasters passed to PredictNewAreaMap #145) though.
An alternative would be to store all the objects used in the workflow object.
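The first option (inlining values into the saved call) could be sketched roughly as below. inline_values is a hypothetical helper, not part of the package, and it uses base R's substitute rather than literally calling dput; the effect on a small value like k is the same:

```r
# Hypothetical helper: replace symbols in a captured call with the values
# they have in the calling environment.
inline_values <- function(call, env = parent.frame()) {
  vars <- all.vars(call)    # variable names used in the call (not functions)
  found <- vapply(vars, exists, logical(1), envir = env, inherits = FALSE)
  vals <- mget(vars[found], envir = env)
  # substitute() swaps each matching symbol for its value
  do.call("substitute", list(call, vals))
}

k <- 2
saved <- inline_values(quote(BackgroundAndCrossvalid(k = k)))
print(saved)
```

For a scalar this gives a call that reads k = 2. For a large object such as a raster, the whole object would be embedded in the call, which is the drawback noted above.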

@timcdlucas
Contributor Author

Yes, I think the latter is the better general approach. RerunWorkflow then needs to know where to find those objects, possibly by writing them from the workflow object to the global environment before rerunning.

@goldingn
Member

Rather than writing to global, we could:

  1. define a new environment obj_env in workflow
  2. copy the named objects from global into obj_env
  3. set that obj_env as the place to look for named objects
  4. return obj_env in the workflow object

Then RerunWorkflow could simply fetch objects from the obj_env stored in the workflow object.
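The four steps above might look something like this in base R. capture_workflow and rerun are hypothetical stand-ins for workflow and RerunWorkflow, and seq_len stands in for a real module call so the sketch runs on its own:

```r
# Hypothetical stand-in for workflow(): keeps the call together with an
# environment holding copies of the variables the call refers to.
capture_workflow <- function(call, from = parent.frame()) {
  obj_env <- new.env(parent = from)                 # step 1: new environment
  for (nm in all.vars(call)) {                      # step 2: copy named objects
    if (exists(nm, envir = from, inherits = FALSE)) {
      assign(nm, get(nm, envir = from), envir = obj_env)
    }
  }
  list(call = call, obj_env = obj_env)              # step 4: return obj_env too
}

# Hypothetical stand-in for RerunWorkflow(): evaluates the saved call with
# obj_env as the place to look up named objects (step 3).
rerun <- function(w) eval(w$call, envir = w$obj_env)

k <- 2
w <- capture_workflow(quote(seq_len(k)))
rm(k)             # the original object is gone from the calling environment
print(rerun(w))   # still works, because k was copied into obj_env
```

Nothing is written to the global environment, and the saved call stays short even when the copied objects are large.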
