marginaleffects examples #240

Open
vincentarelbundock opened this issue Jun 4, 2024 · 3 comments

@vincentarelbundock

@malcolmbarrett requested an issue instead of a PR, so here is a copy of examples from the G-computation chapter I posted here: #239

library(marginaleffects)

# A linear model for wait_minutes_posted_avg
fit_wait_minutes <- lm(
  wait_minutes_posted_avg ~ 
    park_extra_magic_morning + park_ticket_season + park_close +
    park_temperature_high,
  data = seven_dwarfs_9
)

avg_predictions(
  fit_wait_minutes, 
  variables = "park_extra_magic_morning")

 park_extra_magic_morning Estimate Std. Error    z Pr(>|z|)     S 2.5 % 97.5 %
                        0     68.1      0.915 74.4   <0.001   Inf  66.3   69.9
                        1     74.2      2.052 36.2   <0.001 949.9  70.2   78.3

avg_predictions(
  fit_wait_minutes, 
  hypothesis = "b2 - b1 = 0",
  variables = "park_extra_magic_morning")

    Term Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
 b2-b1=0     6.16       2.26 2.73  0.00636 7.3  1.74   10.6

library(splines) # provides ns()

fit_wait_minutes_actual <- lm(
  wait_minutes_actual_avg ~ 
    ns(wait_minutes_posted_avg, df = 3) + 
    park_extra_magic_morning +
    park_ticket_season + park_close +
    park_temperature_high,
  data = wait_times
)

avg_predictions(fit_wait_minutes_actual, 
  variables = list(wait_minutes_posted_avg = c(60, 30))
)

 wait_minutes_posted_avg Estimate Std. Error     z Pr(>|z|)    S 2.5 % 97.5 %
                      60     29.8       7.05  4.23   <0.001 15.4  16.0   43.7
                      30     40.7       3.84 10.60   <0.001 84.8  33.2   48.2

avg_comparisons(fit_wait_minutes_actual, 
  variables = list(wait_minutes_posted_avg = c(60, 30))
)

                    Term            Contrast Estimate Std. Error     z Pr(>|z|)   S 2.5 % 97.5 %
 wait_minutes_posted_avg mean(60) - mean(30)    -10.8       8.13 -1.33    0.182 2.5 -26.8   5.09

@malcolmbarrett
Collaborator

@vincentarelbundock can you tell me a little more about the SEs here? I know it's the delta method, and I vaguely remember that it's exact and so the nominal coverage should be right. Is that true?

My instinct is that this only works for this sort of two-time-point point estimate and not for longitudinal g-computation, particularly with dynamic exposures. Curious whether you disagree or have additional thoughts on that.

@vincentarelbundock
Author

@malcolmbarrett, I don't have a great intuition for the case you describe. Do you have a function we could use to simulate data and run some tests?

Note that there are several options for SEs:

  • Delta method
    • Classical (IID): default
    • Robust: vcov="HC3" or any of the sandwich package covariance estimators, such as heteroskedasticity- or autocorrelation-consistent ones.
    • Clustered: vcov=~groupid
  • Bootstrap
    • avg_comparisons(model) |> inferences(method = "rsample")
  • Simulation-based inference
    • avg_comparisons(model) |> inferences(method = "simulation")

If the classical standard errors don't work in your case, one of the alternatives might. It's just hard for me to say without getting more specific about the DGP and running some Monte Carlo simulations (assuming we can't rely on known theory).
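
For reference, here is a minimal sketch of how the options listed above could be compared side by side. It is illustrative only: it uses the built-in mtcars data rather than the wait-times data, the model and the object names (fit, cmp_iid, cmp_hc3, cmp_cl, cmp_sim) are made up for the example, and cyl is used as a cluster id purely to show the syntax. The robust and clustered calls assume the sandwich package is installed.

```r
library(marginaleffects)

# Illustrative model on a built-in dataset (an assumption, not the wait-times data)
fit <- lm(mpg ~ am + hp + wt, data = mtcars)

# Delta method with classical (IID) standard errors: the default
cmp_iid <- avg_comparisons(fit, variables = "am")

# Delta method with heteroskedasticity-robust standard errors (uses sandwich)
cmp_hc3 <- avg_comparisons(fit, variables = "am", vcov = "HC3")

# Delta method with clustered standard errors; `cyl` is only a stand-in cluster id
cmp_cl <- avg_comparisons(fit, variables = "am", vcov = ~cyl)

# Simulation-based inference instead of the delta method
cmp_sim <- avg_comparisons(fit, variables = "am") |>
  inferences(method = "simulation")

# The point estimate is the same in the three delta-method calls;
# only the standard error changes with the choice of vcov
data.frame(
  method    = c("classical", "HC3", "clustered"),
  estimate  = c(cmp_iid$estimate, cmp_hc3$estimate, cmp_cl$estimate),
  std.error = c(cmp_iid$std.error, cmp_hc3$std.error, cmp_cl$std.error)
)
```

Swapping method = "simulation" for method = "rsample" would give the bootstrap variant mentioned above, assuming the rsample package is available. Whether any of these has the right coverage in the longitudinal setting would still need to be checked by simulation.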

@malcolmbarrett
Collaborator

Ok I'll come back with something more concrete when I turn my attention fully to this chapter. Thanks!
