Thanks for inviting me to review a chapter of Telling Stories with Data. I really admire the scope and ambition of your project here, and hope my feedback is able to contribute in some small way. Reading through the current draft of Chapter 14, on linear models, I frequently found myself nodding along in appreciation. There is a lot of information that is both interesting and immediately useful for applied modeling, as well as high-level discussion about modeling and the field of statistics. My primary suggestions have to do with the organization of the chapter, and potentially narrowing the focus to allow for a greater sense of coherence. My apologies for the rambling nature of the subsequent review; I'm a bit over-caffeinated at the moment -- I'm happy to clarify any unclear comments!
Section 14.1 introduces least squares and regression, through the lens of parametric error minimization. It also discusses the history of least squares estimation, as well as providing some wisdom about models as limited tools for understanding the world. Section 14.1 also introduces the idea that software is important to the practice of modeling, and that good inference requires good programming. I find all of these ideas to be valuable contributions. On a first read, however, I did feel a little disoriented. In particular, I felt like the chapter really wanted to discuss the practice (and history) of applied statistical modeling on the whole, and it was unclear to me if I should be focusing on big picture abstractions about statistical modeling, or on the technical details involved in regression.
Section 14.2 started to discuss technical elements of regression in more detail, first describing estimation for a univariate normal model. It was unclear to me if estimation is discussed earlier in the book, or if readers will have a good grasp of samples vs population by this point. In this section, I also wondered about the conceptual jump from univariate normal distributions to regression models -- in particular, there is a jump from marginal reasoning to conditional reasoning, and I suspect some additional commentary about this difference might help readers at this point. Readers might get confused by the comment that y does not have to be normal; clarifying conditionals vs marginals should help with this. A focus on conditional mean modeling may also help smooth the conceptual transition from OLS (introduced in terms of parametric residuals) to GLMs (where residuals don't have densities) later in the chapter.
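The marginal-vs-conditional point is easy to make concrete with a few lines of simulation. Here is a minimal sketch (in Python rather than the book's R, purely for illustration; all numbers are invented) in which y | x is exactly normal while the marginal of y is bimodal:

```python
import numpy as np

rng = np.random.default_rng(0)

# x drawn from a bimodal mixture, so the marginal of y inherits two modes
x = np.concatenate([rng.normal(-3, 0.5, 5000), rng.normal(3, 0.5, 5000)])

# y | x ~ N(1 + 2x, 1): the conditional distribution is exactly normal
y = 1.0 + 2.0 * x + rng.normal(0, 1, x.size)

# Conditional residuals look standard normal (sd close to 1)...
resid = y - (1.0 + 2.0 * x)
print(resid.std())

# ...but the marginal of y is bimodal, with modes near 1 - 6 and 1 + 6,
# even though its overall mean is still close to 1
print(y.mean())
```

The regression assumptions are satisfied here even though a histogram of y alone would look nothing like a normal distribution, which is exactly the confusion the "y does not have to be normal" comment invites.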
I appreciated that Section 14.2 was grounded with a concrete running example, and found Figure 14.2 to be a useful aid. I think the example stands reasonably well on its own, but runners may observe that speeds sustainable for a 5K are not sustainable for a marathon, such that a linear relationship between 5K time and marathon time is not strictly realistic. (Aside: I got curious about this a while back, and it turns out that the relationship is modeled fairly accurately by Riegel's formula. Apologies for the self-promotion, but I wrote up a short blog post about this.) I appreciated the careful attention paid to interpreting regression coefficients in Section 14.2, and almost felt like there was a nice semi-parametric framing hidden just out of sight. By which I mean, an introduction to regression that focuses not on parameters, but rather on defining non-parametric functionals of interest, and then estimating these functionals under parametric models. Certainly the focus on marginal effects later in the chapter is in line with this vision. It may be worthwhile discussing this targeted, functional-estimation centric philosophy explicitly.
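To illustrate the nonlinearity: Riegel's formula predicts a time at a new distance by scaling a known time by the distance ratio raised to a fatigue exponent of roughly 1.06. A quick sketch (the 20-minute 5K below is a made-up input, not from the chapter's data):

```python
# Riegel's formula: T2 = T1 * (D2 / D1) ** 1.06, where the exponent above 1
# captures the fact that pace degrades as distance grows.
def riegel(t1_minutes, d1_km, d2_km, exponent=1.06):
    """Predict a race time at distance d2_km from a time at d1_km."""
    return t1_minutes * (d2_km / d1_km) ** exponent

# Naive linear scaling of a 20-minute 5K to a marathon (42.195 km)...
naive = 20 * (42.195 / 5)

# ...versus Riegel's prediction, which is materially slower
riegel_pred = riegel(20, 5, 42.195)
print(naive, riegel_pred)
```

The gap between the two predictions (roughly twenty minutes here) is exactly the departure from linearity that fast 5K runners would notice in the example.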
In Section 14.2, and perhaps also more broadly, I did take issue with some characterizations of uncertainty in estimates. For example, consider this discussion of the uncertainty in a regression coefficient.
"The standard error of betahat does an awful lot of work here in accounting for a variety of factors, only some of which it can actually account for, as does our choice as to what it would take to convince us."
As a student, I would find the phrase "only some of which it can actually account for" concerning. It suggests a sort of hopelessness, that uncertainty is fundamentally too hard. Now, I think you do want to have some of this attitude around, but I think it is helpful to characterize when you should be optimistic or pessimistic about modeling. As a field, this is something we tend to do poorly: we highlight ways things can be "incorrect," without ever teaching a mental model of correctness. I have the feeling you have a vision of correctness lurking somewhere in the sub-text of this chapter, and think that it would be an immense act of pedagogical service to bring that vision more to the surface.
At the same time, it's probably worthwhile thinking about where in Telling Stories to include such big picture wisdom about modeling. Towards the end of Section 14.2, for example, there is a general discussion of p-values and the Duhem-Quine problem. This is super valuable, but organizationally it probably makes sense to separate this material from technical details about regression.
Section 14.3 begins a move away from general commentary about modeling to model specific details about regression, and extensions to OLS in particular. Here I appreciated the focus on code examples. I was a little surprised by the use of testthat -- in particular, it was unclear to me what kinds of errors the tests were designed to catch. Personally, it seemed like the tests were in a data-validation vein, and therefore might fit better in other chapters of the book discussing data cleaning (i.e. it would be natural to mention these tests alongside the pointblank package).
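To clarify the distinction I'm drawing about the testthat usage: the chapter's tests read like checks on the data, whereas a reader might expect tests of the model's behavior. A hypothetical sketch of the two styles (column names, bounds, and numbers all invented for illustration):

```python
import numpy as np

# Invented example data: marathon times in minutes for three runners
marathon_minutes = np.array([195.0, 240.5, 310.2])
five_k_minutes = np.array([20.0, 25.0, 32.0])

# Data-validation style (what the chapter's testthat checks resemble,
# and what pointblank is designed for): properties of the raw data.
assert np.all(marathon_minutes > 0), "times must be positive"
assert marathon_minutes.shape == five_k_minutes.shape, "columns must align"

# Model-behavior style (what "tests" might otherwise suggest): properties
# of the fitted model, e.g. that the estimated slope has the expected sign.
slope = np.polyfit(five_k_minutes, marathon_minutes, 1)[0]
assert slope > 0, "marathon time should increase with 5K time"
print("all checks passed")
```

Being explicit about which of these two kinds of errors the tests are meant to catch would, I think, resolve the surprise.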
Section 14.4 pivots back to big picture discussion, describing the difference between inference and prediction, while simultaneously introducing the frequentist-Bayesian divide (aside: it may be worthwhile to discuss priors in slightly more detail). I'm really glad to see this material in here, as this cultural divide was a huge sticking point when I was an undergraduate myself. That said, I didn't feel like I left these sections with a clear understanding of the difference between inference tasks and prediction tasks. I think this is partially because it is difficult to illustrate inference vs prediction exclusively within the context of linear models. To me, prediction is about using arbitrary, non-linear estimators, and evaluating model performance using resampling tools like cross-validation and hold-out test sets. The key ideas in prediction land are overfitting, hyperparameter tuning, cross-validation, and plugging in your favorite estimator. A prediction model is successful when it performs well empirically, so assessing empirical performance and getting predictions are the crucial bits. This is in contrast to inference, where there is typically a research question, and we use parametric models for the sake of interpretability. I think this larger context is somewhat absent in Section 14.4, which favors code examples, but that it would be valuable to emphasize broad context here rather than code. It also might make sense to have the discussion about prediction vs inference in a modeling chapter separate from the regression chapter.
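The prediction workflow I have in mind is short enough to sketch. Here is a minimal hold-out evaluation on simulated data (all numbers invented; in prediction land, the test-set error is the success criterion, with no reference to whether the coefficients mean anything):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: linear signal plus noise with sd 0.5
x = rng.uniform(-2, 2, 200)
y = 0.5 + 1.5 * x + rng.normal(0, 0.5, 200)

# Split into a training set and a held-out test set
train, test = slice(0, 150), slice(150, 200)

# Least-squares fit on the training split only
X = np.column_stack([np.ones(150), x[train]])
beta = np.linalg.lstsq(X, y[train], rcond=None)[0]

# The prediction success criterion: empirical error on unseen data.
# For a well-specified model this sits near the irreducible noise
# variance (0.25 here); a badly overfit model would do worse.
pred = beta[0] + beta[1] * x[test]
test_mse = np.mean((y[test] - pred) ** 2)
print(test_mse)
```

The inference framing would instead ask whether beta[1] is a credible estimate of the covariate's effect and how uncertain it is; the contrast between those two questions is what I felt Section 14.4 could foreground.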
Section 14.5 pivots back to a regression focus, and discusses logistic regression, diving into marginal effects fairly quickly. I think it would be worthwhile to formally define marginal effects, so that students can understand the estimand at hand, and possibly also to explain how inferential results for parameters theta can be used to obtain inferential results for functions q(theta). This is a place I notice a lot of people getting confused (see, for example, this explainer thread). After this introduction to logistic regression, the focus shifts back to code for a while, and then on to risk estimation in the political support example. I found this transition somewhat challenging to follow: it led me to believe that the predictions from a logistic regression were related to the quality of the marginal effect estimates, but there was no explicit connection (or differentiation) made between these modes of modeling. More broadly, I think the code examples in this section are useful, but that the modeling choices are not motivated explicitly. In this section I would also refer to male and female as sex rather than gender, and would also take a moment to consider avoiding sex-differences examples, or to find a dataset with more inclusive sex/gender definitions.
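Concretely, the formal definition I'd suggest: for a logistic model with p(x) = sigmoid(b0 + b1*x), the marginal effect of x at a point is dp/dx = b1 * p(x) * (1 - p(x)), and the average marginal effect (AME) is its sample mean -- a function q(theta) of the parameters, whose standard errors follow by the delta method. A sketch with invented coefficients and covariate values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical fitted logistic coefficients and covariate values;
# in practice these would come from the fitted model and the data.
b0, b1 = -0.5, 0.8
x = np.linspace(-3, 3, 101)

# Fitted probabilities under the model
p = sigmoid(b0 + b1 * x)

# Average marginal effect: mean of dp/dx = b1 * p * (1 - p) over the
# sample. This is the estimand q(theta), distinct from the raw
# coefficient b1, which lives on the log-odds scale.
ame = np.mean(b1 * p * (1 - p))
print(ame)
```

Defining the estimand this explicitly would also make clear how it differs from the prediction task (reporting p(x) for new units) that the section moves to next.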
Sections 14.6 and 14.7 discuss Poisson and negative binomial regression. Personally I think this is too much material to cover in one chapter. I thought the data example with e/E in this section was well-motivated, but I felt like model checking was appearing as a new topic in the chapter for the first time. I would move this material to another chapter. I feel similarly about the Plumber examples in Section 14.8.
To summarize: there is a lot of really cool stuff going on in this chapter. There are solid introductions to estimation, modeling as an abstract activity, and also modeling as an applied activity. Each of these topics deserves time and attention, and ideally they should probably be treated as separate topics. I would strongly consider splitting Chapter 14 into three separate chapters. I think the resulting chapters would feel more coherent, and be easier to digest.