Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch NHD tasks to use new pattern #188

Open
jordansread opened this issue Nov 22, 2020 · 4 comments
Open

Switch NHD tasks to use new pattern #188

jordansread opened this issue Nov 22, 2020 · 4 comments
Labels
wontfix This will not be worked on

Comments

@jordansread
Copy link

Right now the NHD features and GNIS names (two different targets) come out of combiners in the 1_crosswalk_fetch_nhd_tasks.yml file.

Since the plan and the yaml are both targets in 1_crosswalk, we get unintended builds whenever someone has a different version of scipiper. Since this is such an early target with so many downstream implications, this fix is probably more important than some of the other conversions we've ignored elsewhere.

The 1_crosswalk_fetch/out/canonical_lakes_sf.rds.ind and 2_crosswalk_munge/out/gnisname_nhdhr_xwalk.rds.ind are the two targets that come out of this workflow. The two target pluck pattern in pipelines III is a good fit here: https://github.com/limnoliver/ds-pipelines-3/blob/master/remake.yml#L56

@jordansread
Copy link
Author

Lindsay, I'm not sure that the pluck pattern will work for this after all, so heads up on that. The pluck assumes the return is an object target, but these are files.

I think part 1 of this issue is figuring out if an overhaul to the new pattern is pressing (i.e., the current pattern creates unnecessary builds, which is bad). We may be fine here if everyone is on a fairly recent version of scipiper where the package version is no longer written into the remakefile that is built for tasks.

Part 2 of this issue (after concluding the re-work is necessary) is to create a contained task table and two combiner target build. As mentioned above, I think you may not be able to directly use the pipelines III solution to the two-target return on a task table, since those were objects. But perhaps you could return a data.frame that includes data_file, and hash for the two and create your own function where the pluck would've been used (I'm calling this data.frame "nhd_task_combiners" below). In this case, perhaps this function does a little more than pluck, because it would actually call gd_put inside the function?

So maybe on the parent remake (1_crosswalk), you'd have

  1_crosswalk_fetch/out/canonical_lakes_sf.rds.ind:
    command: gd_put_plucked(nhd_task_combiners, target_name)

and gd_put_plucked would have something that filters the nhd_task_combiners row to the appropriate file using as_data_file(target_name) and runs gd_put(target_name, data_file)?

@lindsayplatt
Copy link
Contributor

I do not think this is an issue if you are using scipiper v0.0.22 or greater.

To test, I did the following:

  1. Checked scipiper version which was 0.0.22.
  2. Ran scmake("1_crosswalk_fetch/out/canonical_lakes_sf.rds.ind") which built 1_crosswalk_fetch_nhd_tasks.yml but did not cause anything else to rebuild, including 1_crosswalk_fetch/out/canonical_lakes_sf.rds.ind.
  3. Then, to test if a different version of scipiper would cause a rebuild, I upgraded to v0.0.23 by running remotes::install_github('USGS-R/scipiper').
  4. I then deleted 1_crosswalk_fetch_nhd_tasks.yml.
  5. I, once again, ran scmake("1_crosswalk_fetch/out/canonical_lakes_sf.rds.ind") which built 1_crosswalk_fetch_nhd_tasks.yml but did not cause anything else to rebuild.

These steps seem like they would have caused an annoying rebuild in the past due to different versions, but now they do not.

@jordansread
Copy link
Author

Ok, good deal. I'd suggest we keep this issue open but treat it as lower priority (i.e., a wontfix right now). Agreed?

@lindsayplatt
Copy link
Contributor

Yes, sounds good to me!

@lindsayplatt lindsayplatt added the wontfix This will not be worked on label Dec 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants