
Feedback on chapter 13 #296

Open · 7 tasks
rpruim opened this issue Nov 30, 2024 · 3 comments

rpruim commented Nov 30, 2024

  • park_extra_magic_morning = c(rep(1, 5000), rep(0, 5000)) -> rep(1:0, each = 5000) is shorter and clearer. rep(0:1, length.out = 10000) is perhaps safer (but the order of the 0s and 1s will be different).
  • I would re-order the workflow in 13.3: 1) refine the question; 2) wrangle the data (since it comes first in the workflow, you can use the wrangled data to test subsequent steps along the way, not just at the very end); 3) simulate the population for the left-most variables; 4) simulate the process (perhaps with a better/less vague name); 5) compute the statistics.
  • Should pivot_longer(names_to = "term", values_to = "estimate", cols = everything()) go inside compute_stats()?
  • In fit_models() from 13.3, fit_wait_minutes_posted is never needed or used since we set the values of that variable based on our contrast.
  • It would perhaps be good to add a bootstrap confidence interval at the end of 13.3 (see the sketch after the code below).
  • The modular functions that return lists (of models, of data plus a contrast, etc.) seem a little heavy and unnatural. It might be nicer if the functions returned more natural objects, like a simple data frame: sim_population() could produce a data frame from simulation parameters, and compute_stats() could produce a tidy model data frame from a data set. In any case, I would remove the pluck() from compute_stats().
  • In compute_stats(), exposure_val and control_val are never used, and the values 30 and 60 are hard-coded into the names of the returned object. I'd suggest something like the code below. (Alternatively, one could use lm() |> tidy().)
# sim_obj is a list created by our simulate_process() function
compute_stats <- function(sim_obj) {
  sim_obj |>
    pluck("df_outcome") |> # pluck() can be avoided if the input is a data frame
    group_by(wait_minutes_posted_avg) |>
    summarize(avg_wait_actual = mean(wait_minutes_actual_avg)) |>
    pivot_wider(
      names_from = wait_minutes_posted_avg,
      values_from = avg_wait_actual,
      names_prefix = "X_"
    ) |>
    mutate(effect = diff(c_across(1:2)))
}
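
To make the bootstrap suggestion above concrete, here is a minimal sketch of a percentile bootstrap interval, assuming the wrangled wait_times data from the chapter. estimate_effect() below is a hypothetical stand-in; in the chapter it would rerun the full fit/simulate/compute pipeline on each resample.

library(dplyr)
library(purrr)

# Hypothetical stand-in estimator for illustration only; replace with a
# function that reruns the chapter's pipeline and returns the effect estimate.
estimate_effect <- function(data) {
  fit <- lm(wait_minutes_actual_avg ~ wait_minutes_posted_avg, data = data)
  coef(fit)[["wait_minutes_posted_avg"]]
}

set.seed(1234)
boot_effects <- map_dbl(1:1000, \(i) {
  wait_times |>
    slice_sample(prop = 1, replace = TRUE) |> # resample rows with replacement
    estimate_effect()
})

quantile(boot_effects, c(0.025, 0.975)) # 95% percentile interval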
rpruim commented Nov 30, 2024

Here is an alternative way to do the g-computation in 13.3:

simulate_population2 <- function(orig_data, contrast = c(30, 60), size = 10000) {
  orig_data |>
    select(park_ticket_season, park_close, park_temperature_high) |>
    # resample the confounders from the original data
    slice_sample(n = size, replace = TRUE) |>
    # set the exposure to the contrast values
    mutate(wait_minutes_posted_avg = rep(contrast, length.out = size)) |>
    # simulate park_extra_magic_morning from its model fit on the original data
    augment(
      newdata = _,
      glm(park_extra_magic_morning ~
            park_ticket_season + park_close + park_temperature_high,
          data = orig_data, family = "binomial"),
      type.predict = "response"
    ) |>
    mutate(park_extra_magic_morning = rbinom(size, 1, .fitted)) |>
    # simulate the outcome from the outcome model fit on the original data
    augment(
      newdata = _,
      lm(wait_minutes_actual_avg ~
           splines::ns(wait_minutes_posted_avg, df = 3) + park_extra_magic_morning +
           park_ticket_season + park_close + park_temperature_high,
         data = orig_data)
    ) |>
    rename(wait_minutes_actual_avg = .fitted)
}

compute_stats2 <- function(population) {
  population |>
    lm(wait_minutes_actual_avg ~ factor(wait_minutes_posted_avg), data = _) |>
    tidy()
}

set.seed(8675309)
wait_times |>
  simulate_population2() |>
  compute_stats2() 
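
Note that this version relies on broom (augment() and tidy()) and on the base pipe placeholder _ as a named argument, which requires R 4.2 or later.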

rpruim commented Dec 7, 2024

Wrong milestone?

malcolmbarrett (Collaborator) commented
No, I'm going to rework this chapter to show a simpler approach you can use when it's a pre-post analysis (cloning and standardizing). We're going to use the currently described approach in the section on time-varying exposures, where you need this way of simulating the data.
