R-squared for Dirichlet regression (r2) #683

Open
MarcRieraDominguez opened this issue Feb 16, 2024 · 1 comment
Labels
3 investigators ❔❓ Need to look further into this issue Enhancement 💥 Implemented features can be improved or revised

Comments

@MarcRieraDominguez

Hi! First of all, thank you for creating and maintaining this package!

I have come across unexpected behaviour when applying r2() to a Dirichlet regression fitted with the DirichletReg package. In short, Dirichlet regression extends beta regression to C categories: the responses are proportions in (0, 1) spread across more than 2 categories. The regression comes in two parametrizations: common (a separate model is fitted to each of the C categories) and alternative (a separate model is fitted to C-1 categories, and precision is modelled separately). Each model can use a different set of explanatory variables, separated by pipes (|).
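As a minimal sketch of the two parametrizations, assuming the ArcticLake data that ships with DirichletReg (this is not the reproducible example from the linked issue, and the object names `lake`, `fit_common` and `fit_alt` are only illustrative):

```r
library(DirichletReg)

# ArcticLake ships with DirichletReg: sand, silt and clay proportions plus depth
lake <- ArcticLake
lake$Y <- DR_data(lake[, c("sand", "silt", "clay")])  # compositional response

# Common parametrization: one linear predictor per category
fit_common <- DirichReg(Y ~ depth, data = lake)

# Alternative parametrization: predictors for the means before the pipe,
# predictors for the precision after it (here an intercept-only precision)
fit_alt <- DirichReg(Y ~ depth | 1, data = lake, model = "alternative")

summary(fit_common)
summary(fit_alt)
```

DR_data() may warn that it normalized rows that do not sum exactly to 1; that is expected for this data set.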

r2() appears to return Nagelkerke's R2, but the value is very high for models with the alternative parametrization: for instance, a value close to 0.9 when the squared correlation between fitted and observed values is no higher than 0.75 for any category. The value for a model with the common parametrization is more sensible (i.e. in line with the correlations between fitted and observed values). Based on comparisons with MuMIn::r.squaredLR(), I suspect this has to do with how the null model is declared. A reproducible example is available in an issue over at the DirichletReg package.

maiermarco/DirichletReg#12
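To make the comparison concrete, here is a rough base-R sketch of the two quantities being compared: a likelihood-ratio (Cox-Snell type) R2 computed against an explicit null model, and the per-category squared correlation between observed and fitted values. The helper names `r2_lr` and `r2_cor` are hypothetical, and the log-likelihoods in the usage lines are made-up numbers for illustration only. This is not necessarily what r2() does internally.

```r
# Likelihood-ratio (Cox-Snell type) R2 from two log-likelihoods and sample size n.
# The result depends directly on which null model supplies ll_null.
r2_lr <- function(ll_model, ll_null, n) {
  1 - exp(-2 / n * (ll_model - ll_null))
}

# Squared Pearson correlation between observed and fitted values, per category
r2_cor <- function(observed, fitted) {
  observed <- as.matrix(observed)
  fitted <- as.matrix(fitted)
  vapply(
    seq_len(ncol(observed)),
    function(j) cor(observed[, j], fitted[, j])^2,
    numeric(1)
  )
}

# Made-up log-likelihoods (illustration only): same fitted model, two candidate nulls.
# A null model with a lower log-likelihood yields a higher R2 for the same fit.
r2_lr(ll_model = 120, ll_null = 80, n = 39)
r2_lr(ll_model = 120, ll_null = 60, n = 39)
```

For DirichReg fits, the inputs would presumably come from logLik() on the fitted model and on an explicitly refitted intercept-only model (for example `Y ~ 1 | 1` in the alternative parametrization), and from fitted() for the correlations; which null model r2() actually uses is the open question here.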

I am not an expert, so perhaps the r2 values actually make sense. The analysis of proportions across categories is quite interesting, and given a recent review (https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13234) its popularity might increase in ecology and evolution. If performance could support such models, it would be a very useful extension!

Thank you!

@strengejacke strengejacke added Enhancement 💥 Implemented features can be improved or revised 3 investigators ❔❓ Need to look further into this issue labels Mar 18, 2024
@roaldarbol

roaldarbol commented Aug 12, 2024

I agree this will be a super interesting area to follow in the coming years. Worth noting that it's also possible to fit these models with brms using the dirichlet family. Would that still necessitate a new r2 method?

Using brms also allows specifying random effects (as far as I can tell). That would allow testing e.g. consistency of time budgets for individuals if icc() or variance_ratio() can handle those cases too. I haven't tested, so I'm not sure whether this might already be the case?
