added mallows cp #42

timbook · 2019-10-25T18:10:01Z

Added Mallows's Cp metric. Still missing unit tests, although I don't know how useful they would be. Also needs a formula added to the README. I'd be happy to do this, although I don't know how.

mfrasco

Thanks so much for your interest in Metrics and for making this contribution! I really appreciate it. I've left a few comments in the PR that I would love your thoughts on. In addition, there are a few outstanding things to complete before the PR can be accepted:

- Add documentation files. (You can generate these with devtools::document().)
- Add tests. (You can delay this until we agree on the appropriate function signature.)

mfrasco · 2019-10-28T04:39:03Z

R/regression.R

+#' 
+#' \deqn{MSE_\text{sub} + 2p_\text{sub}MSE_\text{full}}
+#' 
+#' While the two definitions don't give the same answer, a model with the lowest


Since this is true, why does the package need to provide both definitions of the metric?
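
For context on the question above, here is a short derivation of why two common forms of Mallows's Cp rank candidate models identically. It assumes the textbook forms written below, which may not match the PR's formulas term for term (only part of the roxygen block is visible in this hunk). The symbols are assumptions for illustration: n is the number of observations, p_sub the number of parameters in the candidate model, SSE_sub its residual sum of squares, and \hat\sigma^2 the full model's MSE.

```latex
C_p = \frac{SSE_\text{sub}}{\hat\sigma^2} - n + 2 p_\text{sub},
\qquad
C_p^\text{alt} = \frac{1}{n}\left( SSE_\text{sub} + 2 p_\text{sub} \hat\sigma^2 \right)

C_p^\text{alt}
  = \frac{\hat\sigma^2}{n}\left( \frac{SSE_\text{sub}}{\hat\sigma^2} + 2 p_\text{sub} \right)
  = \frac{\hat\sigma^2}{n}\left( C_p + n \right)
```

Because \hat\sigma^2 / n is positive and is the same for every candidate model, the alternate form is an increasing affine function of the standard one, so both pick the same minimizer.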

mfrasco · 2019-10-28T04:47:18Z

R/regression.R

+#' @param data The data (only necessary when supplying formulae)
+#' @param alt_definition Whether or not to use the alternate definition for Mallows's Cp
+#' @export
+mallowsCp <- function(full_model, sub_model, data = NULL, alt_definition = FALSE) {


I wonder if the arguments to this function should be actual, predicted_full, predicted_sub, p_full, and p_sub, instead of requiring the user to pass in model objects or formulae?

The advantages of this approach are:

- It stays consistent with the rest of the package.
- It allows the user to fit the regression with whatever software they want, not necessarily lm.

The disadvantages of this approach are:

- An uglier function signature with more arguments.

What are your thoughts on this decision? Is there a particular reason that you designed the function the way you did?
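
To make the proposal above concrete, here is a minimal sketch of what a vector-based signature might look like. It is not the PR's implementation: the function name mallows_cp_sketch is hypothetical, it assumes the standard definition SSE_sub / MSE_full - n + 2 * p_sub, and it assumes p_full and p_sub count every estimated coefficient, including the intercept.

```r
# Hypothetical sketch of the vector-based signature discussed above.
# Assumes Cp = SSE_sub / MSE_full - n + 2 * p_sub, with p_* counting
# all estimated coefficients (intercept included).
mallows_cp_sketch <- function(actual, predicted_full, predicted_sub, p_full, p_sub) {
  n <- length(actual)
  stopifnot(
    length(predicted_full) == n,
    length(predicted_sub) == n,
    n > p_full
  )
  sse_full <- sum((actual - predicted_full)^2)
  sse_sub <- sum((actual - predicted_sub)^2)
  mse_full <- sse_full / (n - p_full)  # error variance estimate from the full model
  sse_sub / mse_full - n + 2 * p_sub
}

# Usage with lm, although any fitting routine that yields predictions would do.
fit_full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
fit_sub <- lm(mpg ~ wt, data = mtcars)
cp_sub <- mallows_cp_sketch(
  mtcars$mpg, fitted(fit_full), fitted(fit_sub),
  p_full = 4, p_sub = 2
)
```

One sanity check on this form: passing the full model as its own candidate returns p_full, since SSE_full / MSE_full equals n - p_full.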

mfrasco · 2019-10-28T04:49:55Z

R/regression.R

+#' Cp will have the lowest Cp using both definitions.
+#'
+#' @param full_model Either a \code{formula} or \code{lm} for the OLS model using all predictors.
+#' @param sub_model Either a \code{formula} or \code{lm} for the OLS model with a subset of predictors


Did you consider any variable names other than sub? I'm afraid it won't be obvious to users that sub means a subset of columns. What about candidate or subset?

added mallows cp

80fb510

mfrasco requested changes Oct 28, 2019
