Workshop feedback #22

multimeric · 2023-08-01T11:56:59Z

Just a list of thoughts I had from running the workshop last week. Overall the workshop was excellent and very clear, so don't think this is any serious criticism, it's just hard to give feedback on things that went well!

Oh, and also these are very low priority, the workshop doesn't need any of these changed urgently or at all, I'm just writing everything down while I remember.

I forgot to update the list of required packages with the move to crew (whoops!):

targets-workshop/learners/setup.md

Lines 15 to 27 in 4376011

    
           install.packages( 
        
             c( 
        
               "conflicted", 
        
               "future.callr", 
        
               "future", 
        
               "palmerpenguins", 
        
               "quarto", 
        
               "tarchetypes", 
        
               "targets", 
        
               "tidyverse", 
        
               "visNetwork" 
        
             ) 
        
           )

Why do we tell the users to install crew at this point (previously batchtools) if we do so in setup.md?

targets-workshop/episodes/parallel.Rmd

Lines 61 to 63 in 4376011

    
           ### Install required packages 
        
           You will need to install several packages to use the `crew` backend:

Wasn't sure what this meant:

targets-workshop/episodes/introduction.Rmd

Line 47 in 4376011

- **Batch creation** of workflow steps

It was a touch confusing to go from looking at _targets.R to jumping backwards into non-targets code:

targets-workshop/episodes/basic-targets.Rmd

Lines 119 to 122 in 4376011

    
           ### Background: non-`targets` version 
        
           We will use this template to start building our analysis of bill shape in penguins. 
        
           First though, to get familiar with the functions and packages we'll use, let's run the code like you would in a "normal" R script without using `targets`.

. However I'm not sure how to improve this.

The non-targets code can be all run in a terminal, without using files, although this could be made clearer:

targets-workshop/episodes/basic-targets.Rmd

Lines 130 to 138 in 4376011

    
           ```{r} 
        
           #| label: normal-r-path 
        
           library(palmerpenguins) 
        
           # Get path to CSV file 
        
           penguins_csv_file <- path_to_file("penguins_raw.csv") 
        
           penguins_csv_file 
        
           ```

This is probably a controversial take but tar_load and tar_read are so similar in function, it hardly seems worth teaching both:

targets-workshop/episodes/cache.Rmd

Line 97 in 4376011

    
           `tar_read()` is similar to `tar_load()` in that it is used to retrieve objects built by the workflow, but unlike `tar_load()`, it returns them directly as output.

I wasn't so sure about _targets/user. Although it's mentioned in the docs somewhere, I don't fully understand why we would put files there instead of just in our project root. I haven't personally ever used this directory:

targets-workshop/episodes/cache.Rmd

Lines 140 to 142 in 4376011

    
           The one exception to this rule is a special subfolder called `_targets/user`. 
        
           This folder does not exist by default. 
        
           You can create it if you want, and put whatever you want inside.

An example way of modifying the workflow without breaking it would have helped here. I just added an empty line with just 1 on it, but that was just off the top of my head:

targets-workshop/episodes/lifecycle.Rmd

Line 214 in 4376011

Modify the workflow in `_targets.R`, then run `tar_visnetwork()` again **without** running `tar_make()`.
Ideally I think we could have illustrated this point using the workflow we just ran. For example could we modify the file path function but make it return the same string to prove that the subsequent tasks don't have to re-run?

targets-workshop/episodes/lifecycle.Rmd

Line 229 in 4376011

## 'Outdated' does not always mean 'will be run'
I found tar_outdated and tar_progress a bit superflous since tar_visnetwork and the regular tar_make reporting mostly cover this info:

targets-workshop/episodes/lifecycle.Rmd

Line 245 in 4376011

## Other ways to check workflow status

An example output from tar_make(some_target) would be useful. I did this when I ran the workshop but it wasn't in the material:

targets-workshop/episodes/lifecycle.Rmd

Lines 300 to 301 in 4376011

    
           To do this, type the name of the target you wish to build after `tar_make()` (note that any targets required by the one you specify will also be built). 
        
           For example, `tar_make(penguins_data_raw)` would **only** build `penguins_data_raw`, not `penguins_data`.

I didn't find tar_plan so much simpler that it was worth confusing learners with two options:

targets-workshop/episodes/organization.Rmd

Line 37 in 4376011

## A simpler way to write workflow plans
Another Hot Take, but I wonder if we needed to teach so many methods of loading packages? I like the idea of not teaching library, just because it's almost strictly worse than tar_options(packages=). The other methods have reasonable tradeoffs that I can see the value in teaching though:

targets-workshop/episodes/packages.Rmd

Line 41 in 4376011

### Method 1: `library()` {#method-1}
The conflicted section didn't seem entirely necessary, because although it's a useful package to know about, it's not an inherent problem with Targets:

targets-workshop/episodes/packages.Rmd

Line 229 in 4376011

## Resolving namespace conflicts
tar_file_read didn't (doesn't) seem massively important to me, because it's basically as concise if we just write the two targets ourselves. Plus I imagine the !!x syntax is pretty confusing to the uninitiated:

targets-workshop/episodes/files.Rmd

Line 154 in 4376011

## A shortcut (or, About target factories)
I was asked a question about why we do models[[1]], which lead to an interesting diversion into the iteration argument to tar_target(). I wonder if this would be worth talking about specifically. On the other hand, I understand why you didn't use iteration="list" here, because you wanted to retain the names:

targets-workshop/episodes/branch.Rmd

Line 294 in 4376011

glance(models[[1]]),

Not sure why R universe is needed here?

targets-workshop/episodes/parallel.Rmd

Lines 68 to 70 in 4376011

    
           install.packages("nanonext", repos = "https://shikokuchuo.r-universe.dev") 
        
           install.packages("mirai", repos = "https://shikokuchuo.r-universe.dev") 
        
           install.packages("crew", type = "source")

The text was updated successfully, but these errors were encountered:

joelnitta · 2023-08-06T23:00:47Z

Thanks so much for this great feedback! It will take me some time, but I will try to address these. Some may be split off into their own issues.

Many of these come back to a common problem - how much information is the right amount to present during the workshop. It is tempting to try and be comprehensive, but that can make learning harder if there is too much information and not all of it is necessary to grasp the concepts.

multimeric · 2024-07-10T05:53:45Z

Hi @joelnitta, did you get a chance to think about any of these? I'm still in favour of removing tar_plan and tar_file for example, and I'm happy to submit a PR.

joelnitta · 2024-07-10T08:02:35Z

Thanks again for the feedback!

Here is a summary of my responses. I've split out most of these into separate issues.

I forgot to update the list of required packages with the move to crew
- Fixed by Joelnitta/issue31 #32
Why do we tell the users to install crew at this point (previously batchtools) if we do so in setup.md?
- Fixed by Joelnitta/issue31 #32
Wasn't sure what this meant
- I was thinking of branching - you can give targets a pattern, and it will define the workflow steps (targets) for you.
It was a touch confusing to go from looking at _targets.R to jumping backwards into non-targets code
- Opened Order of non-targets example and targets example is confusing #33 for this.
The non-targets code can be all run in a terminal, without using files, although this could be made clearer
- Not quite sure what you mean by "all run in a terminal"
This is probably a controversial take but tar_load and tar_read are so similar in function, it hardly seems worth teaching both
- Opened tar_read() is not needed #34 for this
I wasn't so sure about _targets/user. Although it's mentioned in the docs somewhere, I don't fully understand why we would put files there instead of just in our project root. I haven't personally ever used this directory
- Opened Should not mention _targets/user #35 for this
An example way of modifying the workflow without breaking it would have helped here. I just added an empty line with just 1 on it, but that was just off the top of my head
- Opened Provide a method to modify the workflow in Workflow Lifecycle Challenge 2 #36 for this
Ideally I think we could have illustrated this point using the workflow we just ran. For example could we modify the file path function but make it return the same string to prove that the subsequent tasks don't have to re-run?
- Opened Missing an actual code example for the concept that "'Outdated' does not always mean 'will be run'" #37 for this
I found tar_outdated and tar_progress a bit superflous since tar_visnetwork and the regular tar_make reporting mostly cover this info
- Opened tar_outdated() and tar_progress() not needed #38 for this
An example output from tar_make(some_target) would be useful. I did this when I ran the workshop but it wasn't in the material
- Opened Missing output for code example of tar_make(some_specific_target) #39 for this
I didn't find tar_plan so much simpler that it was worth confusing learners with two options
- Opened tar_plan() not needed #40 for this
Another Hot Take, but I wonder if we needed to teach so many methods of loading packages? I like the idea of not teaching library, just because it's almost strictly worse than tar_options(packages=). The other methods have reasonable tradeoffs that I can see the value in teaching though
- If we chose one, I'd lean toward library() actually, for two reasons: 1) it's what people know already, 2) it is easier to use with renv (beyond the scope of this lesson, but a logical next step on learners' reproducibility journeys)
The conflicted section didn't seem entirely necessary, because although it's a useful package to know about, it's not an inherent problem with Targets
- Opened {conflicted} not needed #41 for this
tar_file_read didn't (doesn't) seem massively important to me, because it's basically as concise if we just write the two targets ourselves. Plus I imagine the !!x syntax is pretty confusing to the uninitiated
- Opened tar_file_read() not needed #42 for this
I was asked a question about why we do models[[1]], which lead to an interesting diversion into the iteration argument to tar_target(). I wonder if this would be worth talking about specifically. On the other hand, I understand why you didn't use iteration="list" here, because you wanted to retain the names
- Opened Figure out a way to avoid models[[1]] #43 for this
Not sure why R universe is needed here?
- Fixed by Joelnitta/issue31 #32

multimeric · 2024-07-11T05:30:33Z

I just ran this workshop again. I want to re-iterate all the previous points I made about cutting out some of the alternatives we discuss, like all the package loading strategies, as I ran out of time.

The named list strategy (combined_model = lm(bill_depth_mm ~ bill_length_mm, data = penguins_data)) caused confusion again. Could we maybe use a data frame here, because then I think we can get "names" as well as avoiding the iteration argument.

One new point was that I don't have a good explanation for why tar_file works for both input and output files. This isn't really explained in the material either.

joelnitta · 2024-07-11T08:18:48Z

@multimeric Thanks!

I am (finally!) going through your comments more carefully as you can see above and opening issues as needed. I will reply to your comments about named list and tar_file as I get to them.

I am also teaching this again soon, so I really appreciate the additional feedback 🙏

multimeric mentioned this issue Jul 10, 2024

Use format = "file" instead of tar_file #30

Open

amart90 mentioned this issue Jul 29, 2024

renv can parse tar_option_set() now #46

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Workshop feedback #22

Workshop feedback #22

multimeric commented Aug 1, 2023 •

edited

Loading

joelnitta commented Aug 6, 2023

multimeric commented Jul 10, 2024

joelnitta commented Jul 10, 2024 •

edited

Loading

multimeric commented Jul 11, 2024

joelnitta commented Jul 11, 2024

Workshop feedback #22

Workshop feedback #22

Comments

multimeric commented Aug 1, 2023 • edited Loading

joelnitta commented Aug 6, 2023

multimeric commented Jul 10, 2024

joelnitta commented Jul 10, 2024 • edited Loading

multimeric commented Jul 11, 2024

joelnitta commented Jul 11, 2024

multimeric commented Aug 1, 2023 •

edited

Loading

joelnitta commented Jul 10, 2024 •

edited

Loading