[paper] writing #332
Conversation
@strengejacke I think I'll need your help for the marginal section, the different types of marginalization etc.; it's quite blurry in my head to be honest ^^
@easystats/core-team remaining bits: …
@DominiqueMakowski where do you want my input, and what should it include? I must admit I have 0 familiarity with …
Your torment is over - you can make the switch now 😛. Mostly, if you could take a look at the technical details section that talks about the backends.
`modelbased` currently covers classical …

The modelbased package simplifies the extraction of these effects, providing a clear interface to understand how different predictors interact with outcomes in various scenarios.

[details about types of marginalization]
@strengejacke I think here we could mention our 2 main types of marginalization (essentially copypasta our nice docs on that)
I can start working/helping on this paper by mid-February, or earlier if it's just minor stuff.
paper/paper.md
The term "Least-Squares Means" was somewhat misleading, as it suggested a method specific to least-squares estimation; hence the package's renaming to `emmeans` in 2016, clarifying its scope for a wider range of models, including generalized linear models, mixed models, and Bayesian models.
- `marginaleffects` (REF) was more recently introduced and also employs the delta method to approximate variance estimation. It is compatible with a wider range of models and allows for more flexibility in the specification of the marginal effects to be estimated.

[What's the difference / benefits / drawbacks of using one or the other?]
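Since the draft above mentions the delta method, here is a minimal hand-rolled sketch of the idea (my own illustration, not text from the draft): the standard error of a nonlinear function of the coefficients is approximated from its gradient and the coefficient variance-covariance matrix.

```r
# Delta-method sketch (illustrative only):
# SE of g(beta) is approximately sqrt( grad(g)' %*% vcov(beta) %*% grad(g) )
mod <- glm(am ~ hp, family = binomial(), data = mtcars)
b <- coef(mod)
V <- vcov(mod)

# quantity of interest: predicted probability of am = 1 at hp = 150
g <- function(b) unname(plogis(b[1] + b[2] * 150))

# numerical gradient of g at the estimated coefficients
eps <- 1e-6
grad <- sapply(seq_along(b), function(i) {
  bi <- b
  bi[i] <- bi[i] + eps
  (g(bi) - g(b)) / eps
})

c(estimate = g(b), se = drop(sqrt(t(grad) %*% V %*% grad)))
```

A call such as `marginaleffects::predictions(mod, newdata = datagrid(hp = 150))` should report a very similar standard error (up to the prediction type and the numerical differentiation scheme used).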
@mattansb any other interesting facts to mention here?
Reference for marginaleffects would be https://www.jstatsoft.org/article/view/v111i09
Arel-Bundock, V., Greifer, N., & Heiss, A. (2024). How to Interpret Statistical Models Using marginaleffects for R and Python. Journal of Statistical Software, 111(9), 1–32. https://doi.org/10.18637/jss.v111.i09
Oh, there's a lot that I can add.
I think the biggest difference would be:
- `emmeans` uses a reference grid (which by default is a "complete" combination of all predictors) to generate expected values, by default conditioning on the mean of numeric predictors and marginalizing over categorical/binary variables, using a linear function of the model's coefficients (and its variance-covariance matrix to get SEs) to give "predictions at the mean" (predictions for an average observation).
- `marginaleffects` uses unit-level predictions to generate two counterfactual values, the difference of which is then taken (with SEs computed using the delta method) and averaged across all units. By default, the original model frame is used.
Of course, `emmeans` can also use the delta method and can build complex reference grids (that aren't actually "grid"-like), and `marginaleffects` can also generate linear predictions at the mean.
Using the delta method is more computationally costly than using a linear combination (though `marginaleffects` is very efficient). Using linear combinations with orthogonal "grids" also often means that results from `emmeans` directly correspond to a model's coefficients (which is a benefit for those who are used to looking at regression tables to understand their models - this can be shown with an example, as sketched below).
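A minimal sketch of the kind of example hinted at above (the model and variables are my own illustration, not from the draft): with an additive model and a regular reference grid, the `emmeans` contrast reproduces the corresponding regression coefficient.

```r
library(emmeans)

mtcars$vs <- factor(mtcars$vs)
mod <- lm(mpg ~ vs + hp, data = mtcars)

# treatment-coded coefficient for vs (vs1 - vs0, holding hp fixed)
coef(mod)["vs1"]

# emmeans: expected values at the mean of hp for each level of vs, then contrasted
contrast(emmeans(mod, ~vs), method = "revpairwise")
```

Because the model is additive, the `vs1 - vs0` contrast (and its SE) matches the `vs1` row of `summary(mod)`; with interactions the two would no longer coincide.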
...
Would you like me to add all of this in? Just some of this?
> Would you like me to add all of this in? Just some of this?
Anything you think is relevant - but I think it'll be good to be quite thorough and detailed here, as the inner workings of these marginal quantities are not often clearly explained, so having details is good!
Alright - I'll give it some time tomorrow.
I've added the expanded "technical details" section. Some general comments:

**Statement of need:** I think the "statement of need" needs some restructuring, because the actual need it addresses is that emmeans and marginaleffects are sometimes hard to use - "the …"

**Predictions:** "For generalized linear models (GLMs), while the distinction between prediction and confidence intervals do not apply" - why not? I teach prediction intervals for non-gaussian models (a quick sketch follows below).

**Marginal effects:** The different "marginals" need to be stated better here - the "marginal" in "marginal means" is not the same "marginal" that is in "marginal effects".
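To make the prediction-interval point concrete, a hand-rolled sketch (my own illustration; for brevity it ignores the uncertainty in the coefficients and only reflects the outcome distribution):

```r
# Prediction interval for a non-gaussian (Poisson) model
mod <- glm(count ~ spray, family = poisson(), data = InsectSprays)

set.seed(1)
mu <- predict(mod, type = "response")            # expected counts per observation
sims <- replicate(1000, rpois(length(mu), mu))   # simulate new observations
# 95% interval for a *new* observation at each row of the data
head(t(apply(sims, 1, quantile, probs = c(0.025, 0.975))))
```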
Since I updated my mixed models slides for teaching after working on modelbased, maybe we can use some of the "definitions" I use on my slides (hopefully, these are accurate): After fitting a model, it is useful to generate model-based estimates. Usually, there are three quantities of interest: …

The key difference: …

Briefly: …
What would be the difference between these differences?
```r
library(marginaleffects)

mtcars$vs <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)

mod <- glm(am ~ vs * hp * gear,
           family = binomial(),
           data = mtcars)

# unit level differences
comparisons(mod, variables = "vs",
            type = "response")

# average differences (mean unit level differences)
avg_comparisons(mod, variables = "vs",
                type = "response")
#>
#>  Estimate Std. Error      z Pr(>|z|)   S  2.5 % 97.5 %
#>    -0.166      0.195 -0.851    0.395 1.3 -0.547  0.216
#>
#> Term: vs
#> Type: response
#> Comparison: 1 - 0
#>

# marginal means over a regular grid
avg_predictions(mod, variables = "vs",
                # a regular grid
                newdata = datagrid(hp = mean, gear = levels),
                type = "response")

# marginal differences (differences between marginal means)
avg_predictions(mod, variables = "vs",
                # a regular grid
                newdata = datagrid(hp = mean, gear = levels),
                type = "response",
                hypothesis = ~ reference)
#>
#>  Hypothesis Estimate Std. Error     z Pr(>|z|)   S  2.5 % 97.5 %
#>       1 - 0   -0.302      0.239 -1.26    0.207 2.3 -0.772  0.167
#>
#> Type: response
#>
```
I think we should avoid using the ambiguous term "marginal" as much as possible with regard to effects/slopes/differences (probably still okay with regard to "marginal means", since that can only be the "average" kind, not the "conditional" kind).
Unless you deal with mixed models, where you can have "marginal" and "conditional" effects/predictions/means (or whatever you'd like to call them).
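A small sketch of that mixed-model distinction (lme4 syntax; my own illustration, not from the thread):

```r
library(lme4)

mod <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# "conditional" predictions: include the subject-specific random effects
head(predict(mod, re.form = NULL))

# "marginal" (population-level) predictions: fixed effects only
# (for a linear mixed model these are also the population-average predictions;
#  for GLMMs the two notions can diverge)
head(predict(mod, re.form = NA))
```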
Ok, I see, I think your distinction refers to the …
Here are the `modelbased` equivalents:

```r
library(modelbased)
library(marginaleffects)

mtcars$vs <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)

mod <- glm(am ~ vs * hp * gear, family = binomial(), data = mtcars)

# average differences (mean unit level differences)
avg_comparisons(mod, variables = "vs")
estimate_contrasts(mod, "vs", marginalize = "population")

# marginal means over a regular grid
avg_predictions(mod, variables = "vs", newdata = datagrid(vs = levels, hp = mean, gear = levels))
estimate_means(mod, "vs", predict = "response")

# note the corrected prediction type
avg_predictions(mod, variables = "vs",
                newdata = datagrid(vs = levels, hp = mean, gear = levels),
                type = "invlink(link)")
estimate_means(mod, "vs")

# marginal differences (differences between marginal means)
avg_predictions(mod, variables = "vs",
                newdata = datagrid(hp = mean, gear = levels),
                type = "response",
                hypothesis = ~ reference)
estimate_contrasts(mod, "vs", comparison = ~ reference)
```
That's fine - but since there are these two meanings, I think it is best we steer clear of this term whenever possible. I would even try to find a better argument name for …
Any suggestions?
The default in `marginaleffects` is to use the model frame and produce unit-level predictions/effects for each observation, and potentially average across some/all groups of observations to get average / conditional effects. It sounds like the default in `modelbased` is to make regular grids (like `emmeans` does by default). So that can also be a talking point in the paper. Doesn't …
Not fully. That's why Dom and I think it's clearer to avoid a rather "technical" approach to what's going on, and instead take a more "practical", or "what's the question I'd like to answer?", approach.
In all three cases you get an estimated average value for your response, however for different "persons"/"subjects"/"groups", because in all three cases the non-focal terms are treated differently (according to how they're "averaged", or "marginalized"). That's the idea behind this argument name: to avoid a "technical" way of thinking about marginal means, and instead look at predictions in terms of "what is the sample/entity/person we're making inferences about?"
## Predictions

At a fundamental level, `modelbased` and similar packages leverage model *predictions*.
These predictions can be of different types, depending on the model and the question at hand.
For instance, for linear regressions, predictions can be associated with **confidence intervals** (`predict="expectation"`) or **prediction intervals** (`predict="prediction"`).
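A brief sketch of those two interval types (assuming the `predict` options above map onto the dedicated `modelbased` helpers below; the model is illustrative):

```r
library(modelbased)

mod <- lm(mpg ~ hp + wt, data = mtcars)

estimate_expectation(mod)  # expected values with confidence intervals
estimate_prediction(mod)   # predicted observations with prediction intervals
```

Prediction intervals are wider because they also reflect the residual variation of individual observations, not only the uncertainty in the expected value.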
Not only linear regression? Remove that subordinate clause?
Yes
It provides a unified interface to extract marginal means, marginal effects, and model predictions from a wide range of statistical models.
It is built on top of the `emmeans` (REF) and `marginaleffects` (REF) packages, which are the two most popular packages for extracting these quantities of interest from statistical models.
In line with the `easystats` *raison d'être*, the `modelbased` package is designed to be user-friendly, with a focus on simplicity and flexibility.
Probably the two most popular R packages for extracting these quantities of interest from statistical models are `emmeans` (REF) and `marginaleffects` (REF). These packages are enormously feature-rich and (almost) cover all imaginable needs for post-hoc analysis of statistical models. However, they are not always easy to use, especially for users who are not familiar with the underlying statistical concepts. The `modelbased` package aims to unlock this currently underused potential by providing a unified interface to extract marginal means, marginal effects, contrasts, comparisons, and model predictions from a wide range of statistical models. It is built on top of the two aforementioned `emmeans` and `marginaleffects` packages. In line with the `easystats` *raison d'être*, the `modelbased` package is designed to be user-friendly, with a focus on simplicity and flexibility.
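A minimal sketch of what that unified interface looks like in practice (the model is illustrative, and argument spellings may differ slightly across package versions):

```r
library(modelbased)

mtcars$cyl <- factor(mtcars$cyl)
mod <- lm(mpg ~ cyl + hp, data = mtcars)

estimate_means(mod, "cyl")          # marginal means
estimate_contrasts(mod, "cyl")      # contrasts / comparisons between them
estimate_slopes(mod, trend = "hp")  # marginal effects (slopes)
estimate_expectation(mod)           # model predictions
```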
# Key concepts
We might add a short sentence clarifying the conceptual differences between predictions and marginal means?
Mention types of predictions (expectations, predictions, response vs. the link scale)
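For the response vs. link scale point, a small base-R sketch of the distinction (illustrative model, using `predict()` rather than the `modelbased` helpers):

```r
mod <- glm(am ~ hp, family = binomial(), data = mtcars)

head(predict(mod, type = "link"))      # linear predictor: log-odds scale
head(predict(mod, type = "response"))  # back-transformed: probability scale
```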
## Marginal means and effects
I think somewhere we can introduce the concept of focal and non-focal terms. For marginalization, we can refer to the wording of "non-focal terms", which is probably better than "non-focal predictors" or "other predictors", because sometimes, we have interaction terms.
Agreed
Hmmm, to me the word "marginalize" is very technical. Since the options are "average over regular grid" or "average over the model frame", and the word "marginalize" has two meanings, I would have: …
(Or, to be even more clear, I would have …)
Sounds good, we should think about this - @DominiqueMakowski WDYT?
There are two decisions to be made, I think: …
I also prefer …
(and …)
I'm confused by … Edit: Oh, you mean like "an average person"? (Something that is weird to think about when averaging over a grid of marginal means.) How about: …
What does …?
Why? How would you interpret the different types of averaging? I.e., to which subjects do your results refer? (It's not helpful to interpret results in technical terms, I think, when a general audience should understand them.)
Oh, now I get it - this isn't really about averaging; averaging is sometimes used, but it's about the units for which the estimates are generated (and then possibly averaged across). So how about: …
Or …
Yes, that sounds appropriate, too! 🙂
Let's get this bad boi out.
@mattansb @bwiernik what would you write about the two backends, emmeans and marginaleffects, their main differences etc.
Do they both rely on the delta method?