Correlation causation #148
base: main
* On gravity
* On athletic/IQ scores
…esampling-with into correlation_causation
* Need to iterate on this, not add to it.
* Lightly edit the rest.
source/correlation_causation.Rmd
Outdated
(column) vector. Note that we want to find a solution for any such system - there are no conditions other than that the
number of rows of $\mathbf{x}$ must be the same as the number of columns of $A$, and the number of rows of $\mathbf{y}$
must be the same as the number of rows of $A$. Note in particular that $m$ need not equal $n$, and if $m=n$ we don't
require that the determinant of $A$ be non-zero. Let's give examples of the typical situations that one encounters in general,
the determinant is sure to confuse
1. The first example represents all systems of equations where $m=n$ with non-zero determinant. In all these cases the equation is *solvable*, for example by using
Gaussian elimination with partial pivoting. (Not Cramer's rule!)
Those new to linear algebra won't know about Gaussian elimination or pivoting
$$
This equation represents all systems with more equations than unknowns, i.e. all *overdetermined* systems.
Since this system cannot be solved exactly, we look
instead for a solution that best fits the system in a sense that we'll explain later. Please take our word for it, for now,
"take our word" is my least favorite expression! Perhaps we can give a quick intuitive version of the answer, e.g., we have to pick a value, and it looks like that value will have to be somewhere between 1 and 1.2. A best guess turns out to be , and we'll soon learn why.
that the solution is given by the *normal equations*,
$$
A^T A \mathbf{x} = A^T \mathbf{y}.
$$
Here $A^T$ is the transpose of $A$, $A^TA$ is an $n\times n$ square, symmetric matrix and $A^T \mathbf{y}$ is an $n\times 1$ (column)
vector. Moreover, if the columns of $A$ are linearly independent, which is almost always the case,
it can be shown that $A^TA$ has an inverse.
I'm not sure this is helpful until the reader can comprehend what it says.
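One possible way to make the normal equations concrete for a reader is a tiny worked example. The following is a minimal sketch, not the PR author's method: the three-equation, two-unknown system, and the helper names `transpose`, `matmul` and `solve`, are assumptions made up for illustration. It forms $A^TA$ and $A^T\mathbf{y}$, solves the resulting square system, and computes the leftover errors.

```python
# Hypothetical overdetermined system (3 equations, 2 unknowns), chosen so
# that the arithmetic comes out in whole numbers.
A = [[1, 0],
     [1, 1],
     [1, 2]]
y = [6, 0, 0]


def transpose(M):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


def solve(M, b):
    """Solve the square system M x = b by Gaussian elimination with
    partial pivoting. Assumes M is invertible (no zero-pivot check)."""
    n = len(M)
    aug = [[float(v) for v in M[i]] + [float(b[i])] for i in range(n)]
    for k in range(n):
        # Move the row with the largest entry in column k into position k.
        pivot = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[pivot] = aug[pivot], aug[k]
        for r in range(k + 1, n):
            factor = aug[r][k] / aug[k][k]
            for c in range(k, n + 1):
                aug[r][c] -= factor * aug[k][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for k in reversed(range(n)):
        known = sum(aug[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (aug[k][n] - known) / aug[k][k]
    return x


At = transpose(A)
AtA = matmul(At, A)                                   # n x n, symmetric
Aty = [row[0] for row in matmul(At, [[v] for v in y])]  # n x 1
x = solve(AtA, Aty)

# Errors e = y - A x for the best-fit x; they are not all zero,
# because the original system has no exact solution.
errors = [yi - sum(a * xj for a, xj in zip(row, x)) for row, yi in zip(A, y)]

print("A^T A =", AtA)
print("A^T y =", Aty)
print("x     =", x)
print("e     =", errors)
```

In this example $A^TA$ is a $2\times 2$ symmetric matrix even though $A$ has three rows, which is the point the passage makes about the shapes involved.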
to identify a natural solution among the infinity of available solutions. To find this solution one has to calculate
the generalized inverse, which would take us too far from our core focus. But it turns out that it can again be cast as an
Again, I think simply mentioning these terms will lead to confusion. We'd need to find an accessible way to explain that a solution is possible, but that you need to go about it carefully.
It should be obvious that this is indeed a solution of the equation. What is more, it is the solution
that is the closest to the origin, i.e. out of the infinite number of solutions, this is the one with the
shortest length.
This type of language, e.g., is good for beginner level.
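The "closest to the origin" idea can be checked numerically without any matrix machinery when there is only one equation. The sketch below is an illustration under assumptions: the equation $x_1 + 2x_2 = 5$ and the function names are hypothetical, not taken from the chapter. For a single equation $a_1 x_1 + \cdots + a_n x_n = y$, the shortest solution is the coefficient vector scaled by $y / (a_1^2 + \cdots + a_n^2)$.

```python
import math


def min_norm_solution(a, y):
    """Shortest x with a[0]*x[0] + ... + a[n-1]*x[n-1] == y.

    Assumes at least one coefficient in `a` is non-zero.
    """
    scale = y / sum(ai * ai for ai in a)
    return [scale * ai for ai in a]


def length(v):
    """Euclidean length of a vector, i.e. its distance from the origin."""
    return math.sqrt(sum(vi * vi for vi in v))


# Hypothetical underdetermined system: one equation, two unknowns.
a, y = [1.0, 2.0], 5.0

x_min = min_norm_solution(a, y)
x_other = [5.0, 0.0]  # another exact solution of x1 + 2*x2 = 5

for name, x in [("shortest", x_min), ("other", x_other)]:
    lhs = sum(ai * xi for ai, xi in zip(a, x))
    print(name, x, "left-hand side:", lhs, "length:", length(x))
```

Both vectors satisfy the equation, but the first one is shorter, which is what distinguishes it among the infinitely many solutions.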
Returning to our question above, we want to identify that value of $\mathbf{x}$ that will minimize the errors,
$e_1, \ldots, e_m$. We are back at the question, minimize in what sense? A commonly used measure of the error is
$$
\mathbf{e}^T\mathbf{e} = e_1^2 + \cdots + e_m^2,
I'd swap these around, since the student is more likely to understand the latter.
Armed with the normal equations we can explain the linear correlation between variables.

:::{.callout-note}
I like this callout a lot.
\mathbf{y}^T \mathbf{y}.
$$
In order to find the values of $\mathbf{x}$ that will minimize the sum of the squares of the errors, we need to set the
partial derivatives of this expression with respect to all the components, $x_1, \ldots, x_n$, to zero. The detailed calculations are messy
Is it also messy to derive it from $e_1^2 + e_2^2 + \cdots$?
I.e., can we get a sense of least squares without matrix formulation, and in the end just state that the solution can also be written as ... using matrices?
Here's the slope with an assumed intercept of 0: https://lisds.github.io/textbook/mean-slopes/mean_and_slopes.html
Source, including datasets: https://github.com/lisds/textbook/
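For the single-unknown case, the derivation from $e_1^2 + e_2^2 + \cdots$ is short. The following is a sketch under assumptions: it takes one unknown $x$, coefficients $a_1, \ldots, a_m$ (so $A$ is a single column), and the sign convention $e_i = y_i - a_i x$, none of which is fixed by the text quoted above.

$$
S(x) = e_1^2 + \cdots + e_m^2 = \sum_{i=1}^{m} (y_i - a_i x)^2
$$

$$
\frac{dS}{dx} = -2 \sum_{i=1}^{m} a_i (y_i - a_i x) = 0
\quad\Longrightarrow\quad
x \sum_{i=1}^{m} a_i^2 = \sum_{i=1}^{m} a_i y_i
\quad\Longrightarrow\quad
x = \frac{\sum_{i=1}^{m} a_i y_i}{\sum_{i=1}^{m} a_i^2}
$$

With $A$ the $m \times 1$ column of the $a_i$, the middle equation is exactly $A^T A \, x = A^T \mathbf{y}$, so the matrix form could be stated at the end as a compact way of writing the same result. This is the slope of a line through the origin, i.e. the intercept-0 case.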
Incomplete