Commit

Merge pull request #65 from mattblackwell/design
Design-based chapter and general edits
mattblackwell authored Jun 9, 2024
2 parents d98372e + d69cba4 commit 309d1ff
Showing 109 changed files with 5,627 additions and 3,044 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -3,7 +3,8 @@
/_book/
.auto/
*_files/

site_libs
*.rel
.Rhistory
.DS_Store
/index.aux
15 changes: 0 additions & 15 deletions 01_intro.qmd

This file was deleted.

435 changes: 0 additions & 435 deletions 02_estimation.qmd

This file was deleted.

434 changes: 0 additions & 434 deletions 04_hypothesis_tests.qmd

This file was deleted.

4 changes: 2 additions & 2 deletions _freeze/02_estimation/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/02_estimation/execute-results/tex.json

Large diffs are not rendered by default.

Binary file modified _freeze/02_estimation/figure-pdf/mse-1.pdf
Binary file not shown.
4 changes: 2 additions & 2 deletions _freeze/03_asymptotics/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/03_asymptotics/execute-results/tex.json

Large diffs are not rendered by default.

Binary file modified _freeze/03_asymptotics/figure-html/fig-lln-sim-1.png
Binary file added _freeze/03_asymptotics/figure-html/indist-1.png
Binary file added _freeze/03_asymptotics/figure-html/sequence-1.png
Binary file modified _freeze/03_asymptotics/figure-pdf/fig-ci-sim-1.pdf
Binary file not shown.
Binary file modified _freeze/03_asymptotics/figure-pdf/fig-clt-1.pdf
Binary file not shown.
Binary file modified _freeze/03_asymptotics/figure-pdf/fig-delta-1.pdf
Binary file not shown.
Binary file modified _freeze/03_asymptotics/figure-pdf/fig-lln-sim-1.pdf
Binary file not shown.
Binary file modified _freeze/03_asymptotics/figure-pdf/fig-std-normal-1.pdf
Binary file not shown.
Binary file added _freeze/03_asymptotics/figure-pdf/indist-1.pdf
Binary file not shown.
Binary file added _freeze/03_asymptotics/figure-pdf/sequence-1.pdf
Binary file not shown.
4 changes: 2 additions & 2 deletions _freeze/04_hypothesis_tests/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/04_hypothesis_tests/execute-results/tex.json

Large diffs are not rendered by default.

Binary file modified _freeze/04_hypothesis_tests/figure-pdf/fig-shape-of-t-1.pdf
Binary file not shown.
Binary file modified _freeze/04_hypothesis_tests/figure-pdf/fig-size-power-1.pdf
Binary file not shown.
Binary file modified _freeze/04_hypothesis_tests/figure-pdf/fig-two-sided-1.pdf
Binary file not shown.
4 changes: 2 additions & 2 deletions _freeze/06_linear_model/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/06_linear_model/execute-results/tex.json

Large diffs are not rendered by default.

Binary file modified _freeze/06_linear_model/figure-pdf/fig-blp-limits-1.pdf
Binary file not shown.
Binary file modified _freeze/06_linear_model/figure-pdf/fig-cef-binned-1.pdf
Binary file not shown.
Binary file modified _freeze/06_linear_model/figure-pdf/fig-cef-blp-1.pdf
Binary file not shown.
4 changes: 2 additions & 2 deletions _freeze/07_least_squares/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/07_least_squares/execute-results/tex.json

Large diffs are not rendered by default.

Binary file modified _freeze/07_least_squares/figure-pdf/fig-ajr-scatter-1.pdf
Binary file not shown.
Binary file modified _freeze/07_least_squares/figure-pdf/fig-influence-1.pdf
Binary file not shown.
Binary file modified _freeze/07_least_squares/figure-pdf/fig-outlier-1.pdf
Binary file not shown.
Binary file modified _freeze/07_least_squares/figure-pdf/fig-ssr-comp-1.pdf
Binary file not shown.
Binary file modified _freeze/07_least_squares/figure-pdf/fig-ssr-vs-tss-1.pdf
Binary file not shown.
4 changes: 2 additions & 2 deletions _freeze/08_ols_properties/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/08_ols_properties/execute-results/tex.json

Large diffs are not rendered by default.

Binary file modified _freeze/08_ols_properties/figure-pdf/fig-wald-1.pdf
Binary file not shown.
17 changes: 17 additions & 0 deletions _freeze/asymptotics/execute-results/html.json

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions _freeze/asymptotics/execute-results/tex.json

Large diffs are not rendered by default.

Binary file added _freeze/asymptotics/figure-html/fig-ci-sim-1.png
Binary file added _freeze/asymptotics/figure-html/fig-clt-1.png
Binary file added _freeze/asymptotics/figure-html/fig-delta-1.png
Binary file added _freeze/asymptotics/figure-html/fig-lln-sim-1.png
Binary file added _freeze/asymptotics/figure-html/indist-1.png
Binary file added _freeze/asymptotics/figure-html/sequence-1.png
Binary file added _freeze/asymptotics/figure-pdf/fig-ci-sim-1.pdf
Binary file not shown.
Binary file added _freeze/asymptotics/figure-pdf/fig-clt-1.pdf
Binary file not shown.
Binary file added _freeze/asymptotics/figure-pdf/fig-delta-1.pdf
Binary file not shown.
Binary file added _freeze/asymptotics/figure-pdf/fig-lln-sim-1.pdf
Binary file not shown.
Binary file not shown.
Binary file added _freeze/asymptotics/figure-pdf/indist-1.pdf
Binary file not shown.
Binary file added _freeze/asymptotics/figure-pdf/sequence-1.pdf
Binary file not shown.
17 changes: 17 additions & 0 deletions _freeze/design/execute-results/html.json

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions _freeze/design/execute-results/tex.json

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions _freeze/estimation/execute-results/html.json

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions _freeze/estimation/execute-results/tex.json

Large diffs are not rendered by default.

Binary file added _freeze/estimation/figure-html/mse-1.png
Binary file added _freeze/estimation/figure-pdf/mse-1.pdf
Binary file not shown.
17 changes: 17 additions & 0 deletions _freeze/hypothesis_tests/execute-results/html.json

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions _freeze/hypothesis_tests/execute-results/tex.json

Large diffs are not rendered by default.

Binary file not shown.
Binary file not shown.
Binary file not shown.
17 changes: 17 additions & 0 deletions _freeze/least_squares/execute-results/html.json

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions _freeze/least_squares/execute-results/tex.json

Large diffs are not rendered by default.

Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
17 changes: 17 additions & 0 deletions _freeze/linear_model/execute-results/html.json

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions _freeze/linear_model/execute-results/tex.json

Large diffs are not rendered by default.

Binary file not shown.
Binary file not shown.
Binary file added _freeze/linear_model/figure-pdf/fig-cef-blp-1.pdf
Binary file not shown.
17 changes: 17 additions & 0 deletions _freeze/ols_properties/execute-results/html.json

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions _freeze/ols_properties/execute-results/tex.json

Large diffs are not rendered by default.

Binary file added _freeze/ols_properties/figure-pdf/fig-wald-1.pdf
Binary file not shown.
21 changes: 13 additions & 8 deletions _quarto.yml
@@ -16,17 +16,17 @@ book:
downloads: [pdf]
chapters:
- index.qmd
- 01_intro.qmd
- part: "Statistical Inference"
chapters:
- 02_estimation.qmd
- 03_asymptotics.qmd
- 04_hypothesis_tests.qmd
- design.qmd
- estimation.qmd
- asymptotics.qmd
- hypothesis_tests.qmd
- part: "Regression"
chapters:
- 06_linear_model.qmd
- 07_least_squares.qmd
- 08_ols_properties.qmd
- linear_model.qmd
- least_squares.qmd
- ols_properties.qmd
- references.qmd

bibliography: references.bib
@@ -45,8 +45,13 @@ format:

pdf:
documentclass: scrreprt
fontfamily: cochineal
keep-tex: true
fig-pos: th
fig-width: 10
fig-height: 6.18
fig-align: center
fontfamily: cochineal
fontsize: 13pt
include-in-header:
- _bold.tex
- _macros_pdf.tex
221 changes: 174 additions & 47 deletions 03_asymptotics.qmd → asymptotics.qmd

Large diffs are not rendered by default.

File renamed without changes.
512 changes: 512 additions & 0 deletions design.qmd

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions estimands.qmd
@@ -0,0 +1,17 @@
# What to estimate?



## Three goals


## Descriptive estimands

The simplest

## Causal estimands


## Predictive estimands

Basic idea: find a function of $X$ that minimizes some loss function.
475 changes: 475 additions & 0 deletions estimation.qmd

Large diffs are not rendered by default.

453 changes: 453 additions & 0 deletions hypothesis_tests.qmd

Large diffs are not rendered by default.

52 changes: 43 additions & 9 deletions index.qmd
@@ -2,20 +2,54 @@

# Preface {.unnumbered}

The goal of this text is to provide a rigorous yet accessible introduction to the foundational topics in statistical inference with a special application to linear regression, a workhorse tool in the social sciences. The material is intended for first-year PhD students in political science, but it may be of interest more broadly. Much of the material has been adopted from various sources (far too many to recount now), but this book is especially indebted to the following texts:

- Hansen, Bruce. [*Probability & Statistics for Economists*](https://www.amazon.com/Probability-Statistics-Economists-Bruce-Hansen/dp/0691235945/). Princeton University Press.
- Hansen, Bruce. [*Econometrics*](https://www.amazon.com/Econometrics-Bruce-Hansen/dp/0691235899/). Princeton University Press.
- Wasserman, Larry. [*All of Statistics: A Concise Course in Statistical Inference*](https://link.springer.com/book/10.1007/978-0-387-21736-9). Springer.
- Wooldridge, Jeffrey. [*Econometric Analysis of Cross Section and Panel Data*](https://mitpress.mit.edu/9780262232586/econometric-analysis-of-cross-section-and-panel-data/). The MIT Press.
This book, like many before it, will try to teach you statistics. The field of statistics describes how we learn about the world using quantitative data. In the social sciences, an increasing share of empirical studies use statistical methods to provide evidence for or against conceptual arguments. And, while it is possible to conduct quantitative research without understanding statistics at an intuitive level, it is not a good idea. Quantitative research involves a host of *choices* about the model to use, variables to include, tuning parameters to set, assumptions to make, and so on. Without a deep understanding of statistics, you may find these choices bewildering and confusing, and you may simply (and possibly erroneously) yield to the default settings of your statistical software.

The goal of this book is to give you the foundation to make methodological choices for your specific application with knowledge and with confidence. The material is intended for first-year PhD students in political science, but it may be of interest more broadly.

We will focus on two key goals:


1. **Understand the basic ways to assess estimators** With quantitative data, we often want to make statistical inferences about some unknown feature of the world. We use estimators (which are just ways of summarizing our data) to estimate these features. This book will introduce the basics of this task at a general enough level to be applicable to almost any estimator that you are likely to encounter in empirical research in the social sciences. We will also cover major concepts such as bias, sampling variance, consistency, and asymptotic normality, which are so common to such a large swath of (frequentist) inference that understanding them at a deep level will yield an enormous return on your time investment. Once you understand these core ideas, you will have a language to analyze any fancy new estimator that pops up in the next few decades.

2. **Apply these ideas to the estimation of regression models** This book will apply these ideas to one particular social science workhorse: regression. Many methods either use regression estimators like ordinary least squares or extend them in some way. Understanding how these estimators work is vital for conducting research, for reading and reviewing contemporary scholarship, and, frankly, for being a good and valuable colleague in seminars and workshops. Regression and regression estimators also provide an entry point for discussing parametric models as approximations, rather than as rigid assumptions about the truth of a given specification.


Why write a book on statistics and regression when so many already exist? While some texts at this level exist in the fields of statistics and economics, they tend to focus on applications and models less relevant to other social sciences. This book attempts to correct this. The book also seeks to introduce a fairly high level of mathematical sophistication that will challenge and push you to develop stronger foundations in the material.


## Roadmap

This book has two major parts. Part I introduces the basics of statistical inference.

We start in @sec-design-based by demonstrating basic concepts of estimation and inference from the design-based perspective in which we sample from a fixed, finite population, and all uncertainty comes from randomness over who is and is not included in the sample. This framework for inference has deep roots in the statistical literature and provides a great deal of intuition for how estimation and uncertainty work in simple settings. We will discuss how to use design-based inference to estimate features of the population from samples when the analyst knows the exact sampling design. Unfortunately, researchers often lack this knowledge about how their data came to be, limiting the usefulness of this approach.

@sec-model-based introduces a more flexible approach to estimation: model-based inference. With this approach, the researcher posits a probability model for how the data came to be. This book focuses on models that posit “independent and identically distributed” data for this model. The chapter describes how estimation and inference proceed under these models and also introduces a broad class of estimators based on the plug-in principle.

These two chapters focus on finite sample properties of different estimation techniques, but we can say more about an estimator if we consider how it behaves on larger and larger samples. @sec-asymptotics introduces this type of asymptotic analysis. It covers the core results of asymptotic theory, such as the law of large numbers, the central limit theorem, and the delta method, but also shows why these results are important for statistical inference. In particular, the chapter shows how these results enable the creation of asymptotically valid confidence intervals.

@sec-hypothesis-tests wraps up Part I of the book by introducing statistical inference with hypothesis testing. This chapter shows how to build hypothesis tests and provides intuition for all their aspects. We also cover power analyses for planning studies and the connection between confidence intervals and hypothesis tests.

Part II of the book focuses on one particular estimator of great importance to quantitative social sciences: the least squares estimator.

@sec-regression begins by describing exactly what quantity of interest we are targeting when we discuss “linear models.” In particular, we discuss how a population best linear predictor exists even if the relationship between two variables is nonlinear. This provides a coherent basis for linear regression estimation as a linear approximation to a potentially nonlinear function. The chapter also shows how to interpret the coefficients in these linear regression models.

@sec-ols-mechanics introduces the more mechanical properties of the least squares estimator: how the estimator is constructed, its geometrical interpretation, and how influential observations may affect the estimates it returns. This chapter introduces the least squares estimator in matrix form and provides key intuition for understanding this compact notation.

Finally, @sec-ols-statistics describes the statistical properties of the least squares estimator. The chapter shows how modeling assumptions affect the kinds of properties we can obtain. The weakest modeling assumptions allow us to derive the surprisingly strong asymptotic properties of least squares that we depend on in most settings. The chapter then shows how stronger assumptions such as linearity and normally distributed errors can provide even stronger results but that they do so at the expense of potential model misspecification.


## Acknowledgements

Much of how I approach this material comes from Adam Glynn, for whom I was a teaching fellow during graduate school. Thanks to the students of Gov 2000 and Gov 2002 over the years for helping me refine the material in this book. Also very special thanks to those who have provided valuable feedback, including Zeki Akyol, Noah Dasanaike, Maya Sen, and Jarell Cheong Tze Wen.

## Colophon

You can find the source for this book at <https://github.com/mattblackwell/gov2002-book>. Any typos or errors can be reported at <https://github.com/mattblackwell/gov2002-book/issues>. Thanks for reading.


This is a Quarto book. To learn more about Quarto books visit <https://quarto.org/docs/books>.

$\,$

# Acknowledgements

Much of how I approach this material comes from Adam Glynn, for whom I was a teaching fellow during graduate school. Thanks to the students of Gov 2000 and Gov 2002 over the years for helping me refine the material in this book. Also very special thanks to those who have provided valuable feedback, including Zeki Akyol, Noah Dasanaike, and Jarell Cheong Tze Wen.
$\,$
$\,$