Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential Error in pred_interval_esmeans() #46

Open
zacrobinson5 opened this issue Aug 4, 2023 · 21 comments
Open

Potential Error in pred_interval_esmeans() #46

zacrobinson5 opened this issue Aug 4, 2023 · 21 comments

Comments

@zacrobinson5
Copy link

zacrobinson5 commented Aug 4, 2023

Describe the bug
In comparing the prediction intervals on some random intercept vs random slope models, I think there may be an error in the pred_interval_esmeans() function.

In the code of the function, the following is listed:

if(length(model$tau2) <= 1){ #including gamma2

I believe this ifelse() statement is supposed to determine the way to calculate the prediction intervals based on the ~inner | outer structure of the random effects in rma.mv. Because tau2 will always return a number rather than NULL, the result of length(model$tau2) will always be <=1, which sends it down the first "path" of the ifelse() statement which only takes into account the sigma(s):

sigmas <- sum(model$sigma2)
PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas)

The comment next to (i.e., "including gamma2) it is what makes me think this isn't the desired function, as if multiple ~ inner arguments are included, the length of tau2 and gamma2 arguments together would be >1, which would send it down a different "path" that includes tau and gamma in the calculations which are otherwise ignored.

To Reproduce
Steps to reproduce the behavior:

library(metafor)
library(orchaRd)
library(tidyverse)

data("dat.konstantopoulos2011")
df=dat.konstantopoulos2011%>%
  as.data.frame()

#fit random intercept model
m1=rma.mv(yi ~ year, vi, random = list(~1|study, ~1|district, ~1|school), data = df, method = "REML")
e1=emmeans(emmprep(m1),~year)
p1=pred_interval_esmeans(m1,e1,mod = "year")%>%
  as.data.frame()%>%
  mutate(model="rma.mv (Intercept)")

#fit single random slope model
m2=rma.mv(yi ~ year, vi, random = list(~year|study, ~1|district, ~1|school), data = df, method = "REML")
e2=emmeans(emmprep(m2),~year)
p2=pred_interval_esmeans(m2,e2,mod = "year")%>%
  as.data.frame()%>%
  mutate(model="rma.mv (Slope x 1)")

#fit double random slope model
m3=rma.mv(yi ~ year, vi, random = list(~year|study, ~year|district, ~1|school), data = df, method = "REML")
e3=emmeans(emmprep(m3),~year)
p3=pred_interval_esmeans(m3,e3,mod = "year")%>%
  as.data.frame()%>%
  mutate(model="rma.mv (Slope x 2)")

#combine dfs
prediction.intervals=rbind(p1,p2,p3)

#proofs
tmp <- summary(e1)
tmp <- tmp[ , ]
test.stat <- stats::qt(0.975, tmp$df[[1]])

sigmas <- sum(m1$sigma2)
PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas)
tmp$lower.PI <- tmp$emmean - PI
tmp$upper.PI <- tmp$emmean + PI
p1b=tmp%>%
  as.data.frame()%>%
  mutate(model="proof (Intercept)")

tmp <- summary(e2)
tmp <- tmp[ , ]
test.stat <- stats::qt(0.975, tmp$df[[1]])

sigmas <- sum(m2$sigma2)
PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas)
tmp$lower.PI <- tmp$emmean - PI
tmp$upper.PI <- tmp$emmean + PI
p2b=tmp%>%
  as.data.frame()%>%
  mutate(model="proof (Slope x 1)")

tmp <- summary(e3)
tmp <- tmp[ , ]
test.stat <- stats::qt(0.975, tmp$df[[1]])

sigmas <- sum(m3$sigma2)
PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas)
tmp$lower.PI <- tmp$emmean - PI
tmp$upper.PI <- tmp$emmean + PI
p3b=tmp%>%
  as.data.frame()%>%
  mutate(model="proof (Slope x 2)")

#compare proofs

final.df=rbind(p1,p1b,p2,p2b,p3,p3b)
view(final.df) # all PIs are the same

Expected behavior
I'd expect the tau and gamma values to be taken into account for models with more complex ~inner | outer structures

Desktop (please complete the following information):

  • OS: IOS
  • Rstudio 2023.6.1.524
@daniel1noble
Copy link
Owner

Thanks @zacrobinson5 for exploring this in more detail. Yes, looks like a bug. I'll have a closer look at this and write a few more tests to catch this as soon as possible. Appreciate you exploring things in detail.

@zacrobinson5
Copy link
Author

Hi @daniel1noble , I hope all is well! I am sure you're extremely busy so no worries if you haven't gotten a chance to look into this further but any updates here? Thank you as always for your work!

@daniel1noble
Copy link
Owner

daniel1noble commented Aug 30, 2023 via email

@daniel1noble
Copy link
Owner

@zacrobinson5 Sorry it's taken me a while together to this. Can you clarify where the problem arises because I'm seeing the exact same values for your "proof" and "ram.mv". We also don't yet take two slopes in orchaRd either, so I should probably have a warning for that. Here's the output I get from your code (thanks for that!). Without diving too deep yet, can you tell me what you expect the values should be? I assumed that the ram.mv and the proofs would all be the same based on your comments "# all PIs are the same", but I'm not getting that from your code:

final.df
year emmean SE df asymp.LCL asymp.UCL lower.PI
1 1989.554 0.1727144 0.09060712 Inf -0.004872251 0.3503011 -0.48412520
2 1989.554 0.1727144 0.09060712 Inf -0.004872251 0.3503011 -0.48412520
3 1989.554 0.1727144 0.09060711 Inf -0.004872236 0.3503011 -0.38770558
4 1989.554 0.1727144 0.09060711 Inf -0.004872236 0.3503011 -0.38770558
5 1989.554 0.1691904 0.08426555 Inf 0.004033013 0.3343479 -0.02393547
6 1989.554 0.1691904 0.08426555 Inf 0.004033013 0.3343479 -0.02393547
upper.PI model
1 0.8295541 rma.mv (Intercept)
2 0.8295541 proof (Intercept)
3 0.7331344 rma.mv (Slope x 1)
4 0.7331344 proof (Slope x 1)
5 0.3623164 rma.mv (Slope x 2)
6 0.3623164 proof (Slope x 2)

@daniel1noble
Copy link
Owner

daniel1noble commented Sep 14, 2023

@zacrobinson5 Sorry, in dealing with #45 I found a bug, and you picked up on one of the main issues, but it was slightly different, and ended up being a little more complicated because of how the function works across different types of rma models. gamma slots are treated a little differently between models. I believe this fixes your issue, but please let me know if not. I still need to do some testing for double random slope models as I'm not sure it will perform as expected under all conditions. For now #46 and #45 should be sorted but again please do let me know if not and I'll dig deeper.

daniel1noble added a commit that referenced this issue Sep 14, 2023
…at. It leads to one additional issue when using rma.uni models becuase the gamma2 slot is stored as NULL values. So, we need to work around that.
@daniel1noble
Copy link
Owner

Going to close this issue for now. If you find more issues I can re-open

@zacrobinson5
Copy link
Author

zacrobinson5 commented Sep 14, 2023

@daniel1noble Thanks for the work here sir - I think there is still an issue here, but I could be mistaken. What I tried to do in the "proof" data frame is demonstrate that when you calculate the prediction interval with the formula that only includes sigma, it ends up matching the output from the function for each type of model. I don't think this is how it's supposed to work - i.e., tau2/gamma2 should be taken into account for the relevant models. This should result in the "proof" row and the output from pred_interval_esmeans() being different if I'm understanding everything under the hood correctly. Here is the formula I'm using for the proofs which ends up matching the output of the function in all cases, regardless of model specification:

tmp <- summary(e1)
tmp <- tmp[ , ]
test.stat <- stats::qt(0.975, tmp$df[[1]])

sigmas <- sum(m1$sigma2)
PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas)
tmp$lower.PI <- tmp$emmean - PI
tmp$upper.PI <- tmp$emmean + PI
p1b=tmp%>%
  as.data.frame()%>%
  mutate(model="proof (Intercept)")

@zacrobinson5
Copy link
Author

@daniel1noble played around with this a bit more. I realized in my original code I didn't specify the "struct" argument to which Wolfgang mentioned to me is necessary to get the equivalent of "random slopes" in these models. I've adapted this code a little bit and now am getting a different error when I go to use the function.

Here is the error:

Error in `$<-.data.frame`(`*tmp*`, "lower.PI", value = c(-0.272912654871058, : replacement has 2 rows, data has 1

library(metafor)
library(orchaRd)
library(tidyverse)

data("dat.konstantopoulos2011")
df=dat.konstantopoulos2011%>%
  as.data.frame()

#fit random intercept model
m1=rma.mv(yi ~ year, vi, random = list(~1|study, ~1|district, ~1|school), data = df, method = "REML",
          control = list(optimizer="optim"))
e1=emmeans(emmprep(m1),~year)
p1=pred_interval_esmeans(m1,e1,mod = "year")%>%
  as.data.frame()%>%
  mutate(model="rma.mv (Intercept)")

#fit single random slope model
m2=rma.mv(yi ~ year, vi, random = list(~year|study, ~1|district, ~1|school), data = df, method = "REML",
          struct = "GEN",
          control = list(optimizer="optim"))
e2=emmeans(emmprep(m2),~year)
p2=pred_interval_esmeans(m2,e2,mod = "year")%>% ### GET ERROR HERE
  as.data.frame()%>%
  mutate(model="rma.mv (Slope x 1)")

#fit double random slope model
m3=rma.mv(yi ~ year, vi, random = list(~year|study, ~year|district, ~1|school), data = df, method = "REML",
          struct = c("GEN","GEN"),
          control = list(optimizer="optim"))
e3=emmeans(emmprep(m3),~year)
p3=pred_interval_esmeans(m3,e3,mod = "year")%>% ### GET ERROR HERE
  as.data.frame()%>%
  mutate(model="rma.mv (Slope x 2)")

#combine dfs
prediction.intervals=rbind(p1,p2,p3)

#proofs
tmp <- summary(e1)
tmp <- tmp[ , ]
test.stat <- stats::qt(0.975, tmp$df[[1]])

sigmas <- sum(m1$sigma2)
PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas)
tmp$lower.PI <- tmp$emmean - PI
tmp$upper.PI <- tmp$emmean + PI
p1b=tmp%>%
  as.data.frame()%>%
  mutate(model="proof (Intercept)")

tmp <- summary(e2)
tmp <- tmp[ , ]
test.stat <- stats::qt(0.975, tmp$df[[1]])

sigmas <- sum(m2$sigma2)
PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas)
tmp$lower.PI <- tmp$emmean - PI
tmp$upper.PI <- tmp$emmean + PI
p2b=tmp%>%
  as.data.frame()%>%
  mutate(model="proof (Slope x 1)")

tmp <- summary(e3)
tmp <- tmp[ , ]
test.stat <- stats::qt(0.975, tmp$df[[1]])

sigmas <- sum(m3$sigma2)
PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas)
tmp$lower.PI <- tmp$emmean - PI
tmp$upper.PI <- tmp$emmean + PI
p3b=tmp%>%
  as.data.frame()%>%
  mutate(model="proof (Slope x 2)")

#compare proofs

final.df=rbind(p1,p1b,p2,p2b,p3,p3b)
view(final.df) 

@daniel1noble
Copy link
Owner

@zacrobinson5 Thanks for clarifying the your expectations. I wasn't clear on whether the proofs were what you expected or not. I usually work the other way and match values to correct values. When I tested yesterday, the proofs no longer matched what predict_interval_means was producing. And, in fact, simplified intercept only models produced the same answers to Wolfgang's predict PIs. Having said that, it looks like there is still something going on here so I'll need to explore this some more. It could be something different to what appeared to be fixed yesterday.

@daniel1noble daniel1noble reopened this Sep 14, 2023
@daniel1noble
Copy link
Owner

daniel1noble commented Sep 14, 2023

I'll add one more note here @zacrobinson5. I noticed that you're using the emmprep function. We are intending to incorporate that into mod_results, but it is not yet used because the object produced is quite different to what we are currently producing for orchard plots. Having said that, I can replicate your error using our existing function though when you add the struc argument to this specific model, but it seems to be specific to the models (and /or data) your using as the current functions run all tests and examples in vignette fine and produce the expected output. I'm going to add a warning for now for users to check results of random slope models carefully until we can sort this out. Thanks again for your efforts. I'll see if I can sort this out but I may need time as I've got quite a bit on the go and I'll be away for periods

@daniel1noble
Copy link
Owner

daniel1noble commented Sep 15, 2023

@zacrobinson5 Just a quick update on this. First, I think you need to stick with mod_results which is the key function in the orchaRd package rather than emmprep as this is not yet tested. Again, we plan on adopting it but doing so requires a careful look at all the downstream changes that are required.

In terms of the example you provide. Here's where I got to exploring things.

m(list = ls())
library(metafor)
library(tidyverse)
library(orchaRd)
library(emmeans)

data("dat.konstantopoulos2011")
df=dat.konstantopoulos2011%>%
  as.data.frame()

# Use orchards functions

#fit random intercept model
m1=rma.mv(yi ~ year, vi, random = list(~1|study, ~1|district, ~1|school), data = df, method = "REML")

# Check orchard gives roughly same results. Note that they won't be exactly the same because year is continuous and orchard treats the moderator differently than metafor, but they should be pretty close, which they are. 
orchard1 <- mod_results(m1, mod = "year", group = "study")$mod_table[1:2,] # works fine
   meta1 <- predict(m1, newmods= c(1976:2000), addx = TRUE)[1:2,]

#fit single random slope model
m2=rma.mv(yi ~ year, vi, random = list(~year|study, ~1|district, ~1|school), data = df, rho = 0, method = "REML")

# Check orchard gives roughly same results. 
orchard2 <- mod_results(m2, mod = "year", group = "study")$mod_table[1:2,] # works fine
   meta2 <- predict(m2, newmods= c(1976:2000), addx = TRUE)[1:2,]

#fit double random slope model
m3=rma.mv(yi ~ year, vi, random = list(~year|study, ~year|district, ~1|school), rho = 0, data = df, method = "REML")

# Check orchard gives roughly same results. 
orchard3 <- mod_results(m3, mod = "year", group = "study")$mod_table[1:2,] # works fine
   meta3 <- predict(m3, newmods= c(1976:2000), addx = TRUE)[1:2,]

####### Now add struc argument

#fit single random slope model
m2_1=rma.mv(yi ~ year, vi, random = list(~year|study, ~1|district, ~1|school), struct = "GEN", data = df, rho = 0, method = "REML")

# Check orchard gives roughly same results. Note that they won't be exactly the same beuse year is continuous and orchard treats the moderator diefferently than metafor, but they should be pretty close. 
orchard2_1 <- mod_results(m2_1, mod = "year", group = "study")$mod_table[1:2,] # works fine
   meta2_1 <- predict(m2_1, newmods= c(1976:2000), addx = TRUE)[1:2,] # Note that metafor doesn't produce PIs here

#fit double random slope model
m3_1=rma.mv(yi ~ year, vi, random = list(~year|study, ~year|district, ~1|school), struct = "GEN", rho = 0, phi = 0, data = df, method = "REML", control = list(optimizer="optim"))

# Check orchard gives roughly same results. Note that they won't be exactly the same beuse year is continuous and orchard treats the moderator diefferently than metafor, but they should be pretty close. 
orchard3_1 <- mod_results(m3_1, mod = "year", group = "study")$mod_table[1:2,] # works fine
   meta3_1 <- predict(m3_1, newmods= c(1976:2000), addx = TRUE)[1:2,] # Note that metafor doesn't produce PIs here - GEN?

I need to do some more testing on this issue, but part of the problem maybe that you're not using mod_results. When you get a chance, would you mind just running the above code and making sure you at least don't get errors anymore?

I've run a few other examples and did some checks (both manual and with predict) that the prediction intervals match and it looks to be good on my end for the models I've run. Of course, there may still be situations where things fail so I've now added a new warning for random slope models for users to check carefully!

@zacrobinson5
Copy link
Author

zacrobinson5 commented Sep 15, 2023

@daniel1noble Thank you so much for your continued work looking into this sir - I genuinely appreciate it!

Using mod_results seems to address the problem from my tests! I suppose one follow up I have from a practical perspective is that for the project I'm working on with this, I'm exploring some non-linear patterns and am fitting some splines which is how I ended up on using emmprep in concert with orchaRd - mod_results is giving me some funky results with those

It is recommended that you fit the model with an intercept. Unanticipated errors can occur otherwise.

emmprep() takes the params argument which it seems like mod_results() should as well but I've had some trouble specifying it - any advice there?

I'm going to go ahead and close the issue here as this seems to be fixed for the most part so long as you're using the functions within orchaRd's ecosystem - thanks again!

EDIT:

I can get the spline models to give me outputs that look correct - just not sure what the error message is getting at

@zacrobinson5
Copy link
Author

@daniel1noble sorry one other question - is this quote still accurate after the updates?

@zacrobinson5 We also don't yet take two slopes in orchaRd either, so I should probably have a warning for that.

And specifically do you mean:

~slope|study , slope|group, 1|es

or

~slope1+slope2|study, 1|group, 1|es

or

both?

@daniel1noble
Copy link
Owner

@zacrobinson5 Great that it seems to be working now.

The intercept message above is with respect to mod = ~1 + p1 + p2, where you fit explicitly an intercept. Often we fit. mod = -1 + p1 + p2. So the message is just suggesting to fit the former. The reason is the emmeans can calculate the marginalised means for each group anyway.

Ih regards to the second message. That is now null and void because, as you say, now that gamma is incorporated it can deal with two slopes. There still might be situations it fails, so it's still got a warning message for now until we have a diversity of testing situations.

Hope that all helps!

Cheers,
Dan

@zacrobinson5
Copy link
Author

zacrobinson5 commented Mar 18, 2024

Hey @daniel1noble I am revisiting this issue with a fresh pair of eyes and (I think) an improved understanding. I think I've figured out a few things in the function for pred_interval_esmeans() that don't allow it to play nice with emmprep()

Here is me explaining a few changes to the function that (might) be worth making, but please let me know if I am woefully off base. Was just poking around and figured I'd ask!

  1. Wrapping taus and gammas with sum()

Now, I could be misunderstanding some of what is going on here, but I think the final method of calculating PI may be missing a step

Current:

sigmas <- sum(model$sigma2)
    taus   <- model$tau2
    gammas <- model$gamma2

### skip some lines ###

 PI <- test.stat * sqrt(tmp$SE^2 + sigmas + taus + gammas) ## tau and gamma should be wrapped in sum() ?

The key here is that taus and gammas might need to be wrapped in sum() so that you don't receive the following error and consequently wacky prediction intervals:

Warning message: In tmp$SE^2 + sigmas + taus : longer object length is not a multiple of shorter object length

  1. I think there might be some errors with respect to the weighted gamma calculation
  • It seems to me that g.levels.k refers to the weights for tau rather than gamma and should be replaced with h.levels.k
  • I think gamma should be replaced with gammas in the calculation of the weighted gamma

Current:

sigmas <- sum(model$sigma2)
    taus   <- model$tau2
    gammas <- model$gamma2
    w_tau <- model$g.levels.k
    w_gamma <- model$g.levels.k ## h.levels.k instead?
    
    if(mod == "1"){
      tau <- weighted_var(taus, weights = w_tau)
      gamma <- weighted_var(gamma, weights = w_gamma) ## gamma should be gammas ?
      PI <- test.stat * sqrt(tmp$SE^2 + sigmas + tau + gamma)

Potentially better version:

pred_interval_esmeans <- function(model, mm, mod, ...){
  
  tmp <- summary(mm)
  tmp <- tmp[ , ]
  test.stat <- stats::qt(0.975, tmp$df[[1]])
  
  if(length(model$tau2) <= 1 | length(model$gamma2) <= 1){ # Note this should fix #46 but code is repetitive and needs to be cleaned up. Other issue is how this plays with different rma. objects. uni models will treat slots for gamma NULL and we need to deal with this. 
    sigmas <- sum(model$sigma2)
    taus   <- model$tau2
    gamma2 <- ifelse(is.null(model$gamma2), 0, model$gamma2)
    PI <- test.stat * base::sqrt(tmp$SE^2 + sigmas + sum(taus) + sum(gamma2))
  } else {
    sigmas <- sum(model$sigma2)
    taus   <- model$tau2
    gammas <- model$gamma2
    w_tau <- model$g.levels.k
    w_gamma <- model$h.levels.k
    
    if(mod == "1"){
      tau <- weighted_var(taus, weights = w_tau)
      gamma <- weighted_var(gammas, weights = w_gamma)
      PI <- test.stat * sqrt(tmp$SE^2 + sigmas + tau + gamma)
      
    } else {
      PI <- test.stat * sqrt(tmp$SE^2 + sigmas + sum(taus) + sum(gammas))
    }
  }
  
  tmp$lower.PI <- tmp$emmean - PI
  tmp$upper.PI <- tmp$emmean + PI
  
  # renaming "overall" to ""
  if(tmp[1,1] == "overall"){tmp[,1] <- "intrcpt"}
  
  return(tmp)
}

EDIT 3/19/24

  1. I am not entirely sure about this one, but after playing around with my suggested changes - some situations still don't seem to make sense. Particularly, this is the case when a bunch of fixed effects are added, and the random effects variance components shrink towards zero. After taking a look at the Johnson equation - should we be accounting for the fixed effect variance in any way? I'm not sure if the current code is already doing this with the tmp$SE^2 but the following change seems to pass the face validity test (to me):
PI <- test.stat * sqrt(tmp$SE^2 + sigmas + sum(taus) + sum(gammas) + sum(diag(vcov(model, type = "fixed"))))

or maybe better yet

PI <- test.stat * sqrt(tmp$SE^2 + sigmas + sum(taus) + sum(gammas) + var(model$X%*%model$b))

This seems closer to the way metafor computes pi's with all the similar components, although I'm not exactly sure what newvi is referring to

pi.lb <- pred - crit * sqrt(vpred + sum(x$sigma2) + x$tau2 + x$gamma2 + newvi)
pi.ub <- pred + crit * sqrt(vpred + sum(x$sigma2) + x$tau2 + x$gamma2 + newvi)

@zacrobinson5 zacrobinson5 reopened this Mar 18, 2024
@daniel1noble
Copy link
Owner

@zacrobinson5 Thanks for all this. Give me a few weeks and I will look into it. Sorry. It's very busy at the moment and I have some urgent things that need to be dealt with, but this all sounds sensible to me.

@zacrobinson5
Copy link
Author

@daniel1noble sounds good and no worries! I added one other thing after I played around with my suggested tweaks a bit more

@zacrobinson5
Copy link
Author

zacrobinson5 commented Mar 21, 2024

Been working on this some more - here is where I'm at after some great feedback from the r-sig-meta mailing list:

I think I still need covariances between multiple random slopes

get.pi.emmeans.rma.mv <- function(model, mm){
  
  tmp <- summary(mm)
  tmp <- tmp[ , ]
  test.stat <- stats::qt(mm@misc$level+(1-mm@misc$level)/2, tmp$df[[1]])
  
  n_g <- colnames(model$G)
  n_g2 <- if(is.null(n_g)){0} else{n_g[!n_g %in% c("intrcpt", "outer")]}
  l_g <- if(is.null(n_g)){0} else{mm@linfct[,colnames(mm@linfct) %in% n_g2]}
  v_g <- if(is.null(n_g)){0} else{diag(model$G)%>%as.data.frame()%>%.[n_g2,]}
  cv_g <- if(is.null(n_g)){0} else{model$G[upper.tri(model$G)]}
  
  n_h <- colnames(model$H)
  n_h2 <- if(is.null(n_h)){0} else{n_h[!n_h %in% c("intrcpt", "outer")]}
  l_h <- if(is.null(n_h)){0} else{mm@linfct[,colnames(mm@linfct) %in% n_h2]}
  v_h <- if(is.null(n_h)){0} else{diag(model$H)%>%as.data.frame()%>%.[n_h2,]}
  cv_h <- if(is.null(n_h)){0} else{model$H[upper.tri(model$H)]}
  
  V_b <- tmp$SE^2
  V_ri <- sum(model$sigma2 + model$tau2[1] + model$gamma2[1])
  
  V_g <- if(is.null(n_g)){0} else{((l_g^2)*v_g)%>%as.data.frame()%>%rowSums()}
  V_g_cv <- if(is.null(n_g)){0} 
  else{((2*l_g)*cv_g[seq(1,ncol(l_g))])%>%as.data.frame()%>%rowSums()}
    
  V_h <- if(is.null(n_h)){0} else{((l_h^2)*v_h)%>%as.data.frame()%>%rowSums()}
  V_h_cv <- if(is.null(n_h)){0} 
  else{((2*l_h)*cv_h[seq(1,ncol(l_h))])%>%as.data.frame()%>%rowSums()}
  
  PI <- test.stat * sqrt(V_b + V_ri + V_g + V_g_cv + V_h + V_h_cv)
  
  tmp$lower.PI <- tmp$emmean - PI
  tmp$upper.PI <- tmp$emmean + PI
  
  return(tmp)
}

@zacrobinson5
Copy link
Author

zacrobinson5 commented Mar 25, 2024

Sorry for the multiple messages, I (think) this is the final version with all covariances integrated:

get.pi.emmeans.rma.mv <- function(mm,model){
  
  tmp <- summary(mm)
  tmp <- tmp[ , ]
  test.stat <- stats::qt(mm@misc$level+(1-mm@misc$level)/2, df=tmp$df[[1]])
  
  V_se <- tmp$SE^2
  V_ri <- sum(model$sigma2)
  
  n_g <- colnames(model$G)
  colnames(mm@linfct)[1] <- if(colnames(mm@linfct)[1]=="(Intercept)"){"intrcpt"} else{colnames(mm@linfct)[1]}
  g_g <- model$X[,colnames(mm@linfct) %in% n_g]
  V_g <- if(is.null(n_g)){0} else{sum(diag(crossprod((g_g %*% model$G),g_g)))/nobs.rma(model)}
  
  n_h <- colnames(model$H)
  colnames(mm@linfct)[1] <- if(colnames(mm@linfct)[1]=="(Intercept)"){"intrcpt"} else{colnames(mm@linfct)[1]}
  g_h <- model$X[,colnames(mm@linfct) %in% n_h]
  V_h <- if(is.null(n_h)){0} else{sum(diag(crossprod((g_h %*% model$H),g_h)))/nobs.rma(model)}
  
  V=cbind(V_ri,V_g,V_h)%>%
    as.data.frame()%>%
    rowSums()
  
  PI <- test.stat * sqrt(V_se + V)
  
  tmp$lower.PL <- tmp$emmean - PI
  tmp$upper.PL <- tmp$emmean + PI
  
  return(tmp)
}

@daniel1noble
Copy link
Owner

@zacrobinson5 Really sorry I haven't had a chance to deep dive into this yet Zac. Shinichi and I are just swamped at the moment. Any chance you could fork the repo, edit the relevant code, run the tests devtools::test() and see where we are at. Then you can create a pull request and I can at least be sure things are not breaking anywhere. As with any big changes these take time to properly integrate as it's not always clear what will be the downstream consequences. This is why we have tests but there still maybe situations we have not yet thought about. Plus, given the work you have done, I thin you should get the credit you deserve to make it clear you did this bit of coding!

@zacrobinson5
Copy link
Author

@daniel1noble No worries at all Dan! I will give this a shot when able, pretty new to Git so might take me a bit to figure it out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants