-
Notifications
You must be signed in to change notification settings - Fork 8
/
Copy pathclustered-standard-errors.domd
47 lines (37 loc) · 1.96 KB
/
clustered-standard-errors.domd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
---
layout: post
title: xtreg fe - Clustered Standard Errors - Use Caution
---
I have been implementing a fixed-effects estimator in Python so I can
work with data that is too large to hold in memory. To make sure I
was calculating my coefficients and standard errors correctly I have
been comparing the calculations of my Python code to results from
Stata. This lead me to find a surprising inconsistency in Stata's
calculation of standard errors. I illustrate the issue by comparing
standard errors computed by Stata's `xtreg fe` command to those
computed by the standard `regress` command. To start, I use the first
hundred observations of the `nlswork` dataset:
webuse nlswork, clear
keep if _n <= 100
I then estimate a fixed-effects model using regress:
regress ln_w age tenure i.idcode, vce(cluster idcode)
For comparison, I estimate the same model using `xtreg fe`:
xtset idcode
xtreg ln_w age tenure, fe vce(cluster idcode)
Notice that the coefficients on age and tenure match perfectly across
the two regressions, but the standard errors calculated by `xtreg fe`
are smaller. This is due to a difference in the degrees of freedom
adjustment used. `regress` counts the fixed effects as coefficients
estimated, while `xtreg fe` by default does not. As Kevin Goulding
explains
[here](http://thetarzan.wordpress.com/2011/06/11/clustered-standard-errors-in-r/),
clustered standard errors are generally computed by multiplying the
estimated asymptotic variance by (M / (M - 1)) ((N - 1) / (N - K)). M
is the number of individuals, N is the number of observations, and K
is the number of parameters estimated. The standard `regress` command
correctly sets K = 12, `xtreg fe` sets K = 3. Making the asymptotic
variance (99 - 12) / (99 - 3) = 0.90625 times the correct value.
To get the correct standard errors from `xtreg fe` use the `dfadj`
option:
xtreg ln_w age tenure, fe vce(cluster idcode) dfadj
This seems like an option that could be missed easily.