Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workshop feedback #22

Open
multimeric opened this issue Aug 1, 2023 · 5 comments
Open

Workshop feedback #22

multimeric opened this issue Aug 1, 2023 · 5 comments

Comments

@multimeric
Copy link
Collaborator

multimeric commented Aug 1, 2023

Just a list of thoughts I had from running the workshop last week. Overall the workshop was excellent and very clear, so don't think this is any serious criticism, it's just hard to give feedback on things that went well!

Oh, and also these are very low priority, the workshop doesn't need any of these changed urgently or at all, I'm just writing everything down while I remember.

  • I forgot to update the list of required packages with the move to crew (whoops!):
    install.packages(
    c(
    "conflicted",
    "future.callr",
    "future",
    "palmerpenguins",
    "quarto",
    "tarchetypes",
    "targets",
    "tidyverse",
    "visNetwork"
    )
    )
  • Why do we tell the users to install crew at this point (previously batchtools) if we do so in setup.md?
    ### Install required packages
    You will need to install several packages to use the `crew` backend:
  • Wasn't sure what this meant:
    - **Batch creation** of workflow steps
  • It was a touch confusing to go from looking at _targets.R to jumping backwards into non-targets code:
    ### Background: non-`targets` version
    We will use this template to start building our analysis of bill shape in penguins.
    First though, to get familiar with the functions and packages we'll use, let's run the code like you would in a "normal" R script without using `targets`.
    . However I'm not sure how to improve this.
  • The non-targets code can be all run in a terminal, without using files, although this could be made clearer:
    ```{r}
    #| label: normal-r-path
    library(palmerpenguins)
    # Get path to CSV file
    penguins_csv_file <- path_to_file("penguins_raw.csv")
    penguins_csv_file
    ```
  • This is probably a controversial take but tar_load and tar_read are so similar in function, it hardly seems worth teaching both:
    `tar_read()` is similar to `tar_load()` in that it is used to retrieve objects built by the workflow, but unlike `tar_load()`, it returns them directly as output.
  • I wasn't so sure about _targets/user. Although it's mentioned in the docs somewhere, I don't fully understand why we would put files there instead of just in our project root. I haven't personally ever used this directory:
    The one exception to this rule is a special subfolder called `_targets/user`.
    This folder does not exist by default.
    You can create it if you want, and put whatever you want inside.
  • An example way of modifying the workflow without breaking it would have helped here. I just added an empty line with just 1 on it, but that was just off the top of my head:
    Modify the workflow in `_targets.R`, then run `tar_visnetwork()` again **without** running `tar_make()`.
  • Ideally I think we could have illustrated this point using the workflow we just ran. For example could we modify the file path function but make it return the same string to prove that the subsequent tasks don't have to re-run?
    ## 'Outdated' does not always mean 'will be run'
  • I found tar_outdated and tar_progress a bit superflous since tar_visnetwork and the regular tar_make reporting mostly cover this info:
    ## Other ways to check workflow status
  • An example output from tar_make(some_target) would be useful. I did this when I ran the workshop but it wasn't in the material:
    To do this, type the name of the target you wish to build after `tar_make()` (note that any targets required by the one you specify will also be built).
    For example, `tar_make(penguins_data_raw)` would **only** build `penguins_data_raw`, not `penguins_data`.
  • I didn't find tar_plan so much simpler that it was worth confusing learners with two options:
    ## A simpler way to write workflow plans
  • Another Hot Take, but I wonder if we needed to teach so many methods of loading packages? I like the idea of not teaching library, just because it's almost strictly worse than tar_options(packages=). The other methods have reasonable tradeoffs that I can see the value in teaching though:
    ### Method 1: `library()` {#method-1}
  • The conflicted section didn't seem entirely necessary, because although it's a useful package to know about, it's not an inherent problem with Targets:
    ## Resolving namespace conflicts
  • tar_file_read didn't (doesn't) seem massively important to me, because it's basically as concise if we just write the two targets ourselves. Plus I imagine the !!x syntax is pretty confusing to the uninitiated:
    ## A shortcut (or, About target factories)
  • I was asked a question about why we do models[[1]], which lead to an interesting diversion into the iteration argument to tar_target(). I wonder if this would be worth talking about specifically. On the other hand, I understand why you didn't use iteration="list" here, because you wanted to retain the names:
    glance(models[[1]]),
  • Not sure why R universe is needed here?
    install.packages("nanonext", repos = "https://shikokuchuo.r-universe.dev")
    install.packages("mirai", repos = "https://shikokuchuo.r-universe.dev")
    install.packages("crew", type = "source")
@joelnitta
Copy link
Collaborator

Thanks so much for this great feedback! It will take me some time, but I will try to address these. Some may be split off into their own issues.

Many of these come back to a common problem - how much information is the right amount to present during the workshop. It is tempting to try and be comprehensive, but that can make learning harder if there is too much information and not all of it is necessary to grasp the concepts.

@multimeric
Copy link
Collaborator Author

Hi @joelnitta, did you get a chance to think about any of these? I'm still in favour of removing tar_plan and tar_file for example, and I'm happy to submit a PR.

@joelnitta
Copy link
Collaborator

joelnitta commented Jul 10, 2024

Thanks again for the feedback!

Here is a summary of my responses. I've split out most of these into separate issues.

  • I forgot to update the list of required packages with the move to crew

  • Why do we tell the users to install crew at this point (previously batchtools) if we do so in setup.md?

  • Wasn't sure what this meant

    • I was thinking of branching - you can give targets a pattern, and it will define the workflow steps (targets) for you.
  • It was a touch confusing to go from looking at _targets.R to jumping backwards into non-targets code

  • The non-targets code can be all run in a terminal, without using files, although this could be made clearer

    • Not quite sure what you mean by "all run in a terminal"
  • This is probably a controversial take but tar_load and tar_read are so similar in function, it hardly seems worth teaching both

  • I wasn't so sure about _targets/user. Although it's mentioned in the docs somewhere, I don't fully understand why we would put files there instead of just in our project root. I haven't personally ever used this directory

  • An example way of modifying the workflow without breaking it would have helped here. I just added an empty line with just 1 on it, but that was just off the top of my head

  • Ideally I think we could have illustrated this point using the workflow we just ran. For example could we modify the file path function but make it return the same string to prove that the subsequent tasks don't have to re-run?

  • I found tar_outdated and tar_progress a bit superflous since tar_visnetwork and the regular tar_make reporting mostly cover this info

  • An example output from tar_make(some_target) would be useful. I did this when I ran the workshop but it wasn't in the material

  • I didn't find tar_plan so much simpler that it was worth confusing learners with two options

  • Another Hot Take, but I wonder if we needed to teach so many methods of loading packages? I like the idea of not teaching library, just because it's almost strictly worse than tar_options(packages=). The other methods have reasonable tradeoffs that I can see the value in teaching though

    • If we chose one, I'd lean toward library() actually, for two reasons: 1) it's what people know already, 2) it is easier to use with renv (beyond the scope of this lesson, but a logical next step on learners' reproducibility journeys)
  • The conflicted section didn't seem entirely necessary, because although it's a useful package to know about, it's not an inherent problem with Targets

  • tar_file_read didn't (doesn't) seem massively important to me, because it's basically as concise if we just write the two targets ourselves. Plus I imagine the !!x syntax is pretty confusing to the uninitiated

  • I was asked a question about why we do models[[1]], which lead to an interesting diversion into the iteration argument to tar_target(). I wonder if this would be worth talking about specifically. On the other hand, I understand why you didn't use iteration="list" here, because you wanted to retain the names

  • Not sure why R universe is needed here?

@multimeric
Copy link
Collaborator Author

I just ran this workshop again. I want to re-iterate all the previous points I made about cutting out some of the alternatives we discuss, like all the package loading strategies, as I ran out of time.

The named list strategy (combined_model = lm(bill_depth_mm ~ bill_length_mm, data = penguins_data)) caused confusion again. Could we maybe use a data frame here, because then I think we can get "names" as well as avoiding the iteration argument.

One new point was that I don't have a good explanation for why tar_file works for both input and output files. This isn't really explained in the material either.

@joelnitta
Copy link
Collaborator

@multimeric Thanks!

I am (finally!) going through your comments more carefully as you can see above and opening issues as needed. I will reply to your comments about named list and tar_file as I get to them.

I am also teaching this again soon, so I really appreciate the additional feedback 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants