group-level parameters #112
Comments
I like grouplevel.. or groupwise? |
I like grouplevel too but I do recall @bwiernik made a distinction between group-level and group-specific |
After quite some tinkering I think the most intuitive and straightforward design is
library(modelbased)
library(lme4)
library(see)
data <- lme4::sleepstudy
data <- rbind(data, data)
data$Newfactor <- rep(c("A", "B", "C", "D"))
model <- lmer(Reaction ~ Days + (1 + Days|Subject) + (1|Newfactor), data = data)
random <- estimate_random(model)
head(random)
#> Group | Level | Parameter | Coefficient | SE | 95% CI
#> ----------------------------------------------------------------------
#> Newfactor | A | (Intercept) | 0.58 | 2.04 | [ -3.42, 4.58]
#> Newfactor | B | (Intercept) | 2.32 | 2.05 | [ -1.69, 6.34]
#> Newfactor | C | (Intercept) | -2.04 | 2.04 | [ -6.04, 1.97]
#> Newfactor | D | (Intercept) | -0.87 | 2.05 | [ -4.88, 3.15]
#> Subject | 308 | (Intercept) | -3.28 | 9.24 | [-21.39, 14.82]
#> Subject | 308 | Days | 10.37 | 1.73 | [ 6.98, 13.75]
plot(random)
# Rematch to original random data
x <- reshape_random(random, indices = c("Coefficient", "SE"))
head(x)
#> Subject Newfactor Newfactor_Coefficient_Intercept Newfactor_SE_Intercept
#> 1 308 A 0.5818450 2.042339
#> 2 308 B 2.3221207 2.049567
#> 3 308 C -2.0368499 2.042339
#> 4 308 D -0.8671158 2.049567
#> 5 308 A 0.5818450 2.042339
#> 6 308 B 2.3221207 2.049567
#> Subject_Coefficient_Intercept Subject_SE_Intercept Subject_Coefficient_Days
#> 1 -3.283123 9.238225 10.3656
#> 2 -3.283123 9.238225 10.3656
#> 3 -3.283123 9.238225 10.3656
#> 4 -3.283123 9.238225 10.3656
#> 5 -3.283123 9.238225 10.3656
#> 6 -3.283123 9.238225 10.3656
#> Subject_SE_Days
#> 1 1.728378
#> 2 1.728378
#> 3 1.728378
#> 4 1.728378
#> 5 1.728378
#> 6 1.728378
Created on 2021-06-02 by the reprex package (v1.0.0) |
that looks nice! |
These are just the estimated random effects (returned with "estimate_group_specific" should return the random effects + fixed effects (what is returned with |
What I did is I added an argument to The description is as follows: modelbased/R/estimate_random.R Line 6 in 66a45db
I also mention that in the new vignette I started: feel free to correct and improve! |
We aren't currently reporting SEs or intervals for the |
From my current understanding the output of coef (https://github.com/lme4/lme4/blob/9c673edb76ae19165ffe0a45b375737bb02f1fc3/R/lmer.R#L657) is literally just the sum of ranef and fixef (i.e., the point-estimate of the coefficients). As such, I don't see the difference between adding the fixef point-estimate to the random coefficient and adding the fixef point-estimate to, say, the random CI bounds. With the same logic, the SE stays the same (since it doesn't correct for the variability of the fixed effect anyway). So I don't think it's inherently wrong to return the same SE and the summed CI, given the current state of things (again, based on how coef currently works). |
If this is ranef+fixef, I think |
I don't see it that way, for me it's still the random effects (or rather group-level effects), it's just that they are expressed in the same "unit" as the fixed effect (rather than being expressed in relation to it), which often makes it easier to interpret but doesn't inherently change the nature of the coefficients. It's a bit like doing Afaik it's mostly a "rescaling" of the values for more interpretability rather than a fundamental alteration of the meaning... |
To me those aren't the same thing... :/ |
It actually matters quite a bit.....
library(lme4)
#> Loading required package: Matrix
m1 <- lmer(mpg ~ 1 + (1|gear), data = mtcars)
m2 <- lmer(mpg ~ 0 + (1|gear), data = mtcars)
# Same
coef(m1)
#> $gear
#> (Intercept)
#> 3 16.46041
#> 4 24.15615
#> 5 21.22401
#>
#> attr(,"class")
#> [1] "coef.mer"
coef(m2)
#> $gear
#> (Intercept)
#> 3 16.05235
#> 4 24.43000
#> 5 21.16515
#>
#> attr(,"class")
#> [1] "coef.mer"
# Very diff
ranef(m1)
#> $gear
#> (Intercept)
#> 3 -4.153112
#> 4 3.542629
#> 5 0.610483
#>
#> with conditional variances for "gear"
ranef(m2)
#> $gear
#> (Intercept)
#> 3 16.05235
#> 4 24.43000
#> 5 21.16515
#>
#> with conditional variances for "gear"
Created on 2021-06-03 by the reprex package (v2.0.0) |
when I said it doesn't matter I didn't mean that the values are the same ^^ naturally they change with respect to their fixed effect. But yeah in your |
It's not really the predicted effect as much as simply the random parameter expressed in "absolute" (i.e., not "relative" to its fixed effect) |
The random part is the predicted deviation of the effect, so when adding the fixed effect it is the predicted effect. AFAIK this is the terminology used for the output of |
ah ok I see what you mean by predicted here ^^ (and thank god none of us is from ML) |
Let's change the name to Explanation: The more I think about it, the more I really don't like calling the output of coef() "random". It's not. It's the sum of the fixed and random components. I think this is going to encourage confusion on users' parts. See https://wviechtb.github.io/metafor/reference/blup.html and glmmTMB/glmmTMB#691 for some discussion. The SEs are absolutely not the same. In most cases, they are not even close. The SE for the output of coef() is sqrt(SE^2_fixed + SE^2_random + 2*cov_fixedrandom). The covariance term tends to be very strong and negative. Because of some of the linear algebra tricks lme4 uses (and Doug Bates' philosophical objections), it's not possible to estimate that covariance with lme4. Ben Bolker has said he is open to glmmTMB reporting these SEs, but there are some implementation details to work out. glmmTMB/glmmTMB#691 |
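For reference, the SE expression quoted in the comment above is the standard variance-of-a-sum identity. A sketch in notation that is introduced here for illustration only (not taken from the thread or from lme4): write $\hat\beta$ for the fixed-effect estimate and $\hat u_j$ for the predicted deviation of group $j$, so that the value reported by coef() is $\hat\beta + \hat u_j$.

$$
\mathrm{SE}(\hat\beta + \hat u_j) = \sqrt{\mathrm{SE}(\hat\beta)^2 + \mathrm{SE}(\hat u_j)^2 + 2\,\mathrm{Cov}(\hat\beta, \hat u_j)}
$$

When the covariance term is negative, the SE of the sum is smaller than $\sqrt{\mathrm{SE}(\hat\beta)^2 + \mathrm{SE}(\hat u_j)^2}$, and in general it differs from $\mathrm{SE}(\hat u_j)$ alone, which is why reusing the random-effect SE for the summed coefficient is not justified.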
I agree with all of this, except for |
Yeah, I think estimate_random is alright and understandable as a general name what about I'll remove the SE |
I am really uncomfortable calling the function If these are going to go into one function, then I think that "blup" is the accurate label. These are the BLUPs or eBLUPs of either the conditional modes of the random effect distributions or the conditional modes of the total coefficient distributions. BLUP is quite widespread terminology, "best linear unbiased prediction (BLUP)" is a term used widely in mixed effects model literature and in most mixed effects model textbooks and course materials. We can try to come up with another label, but if we can't and "blup" isn't acceptable, then I think separate functions are needed, such as estimate_random and estimate_coefficient. One other reason I like blup here is that it can then also be extended in the future. For example, we could add |
Agreed :/ Though I still think |
I am okay with |
"conditional" could be another option, but that's probably too confusing given that glmmTMB also uses "conditional" to refer to conditional on the zero-inflated model as well as conditional on group membership. |
I'm worried that |
How about "what" or "estimate" or "estimate.what"/"estimate_what"? |
what about one of: |
I like it. Let's go with "coefficient" |
should |
actually "coefficient" might not be the best, because it's confusing with the |
Why does it matter if the argument value might conflict with another argument's value? |
Hmmm I think should also be changed. |
is arguably not the most elegant and clear ^^ |
I don't think that clash is that big a deal. It's not as unclear when the arguments are named.
x %>%
estimate_grouplevel(type = "coefficient") %>%
reshape_grouplevel(indices = "Coefficient")
"total" should be okay--I'm not thinking of a place where this would be inaccurate. Another option might be "effect" but that's probably not as clear. "blup" could refer to either random or total. The best prediction of what? |
Made the fixes, I think we're good for submission for this round, next update will focus on visualisation (#114 which needs lightbeam geoms in see etc.). Will pass it to the winbuilder first and then submit |
Follow up of easystats/parameters#486
What's the most catchy & appropriate name: estimate_individual, estimate_groupspecific, estimate_grouplevel, estimate_random, estimate_individualeffects, estimate_scores, estimate_individualparameters, ...