Commit

Merge pull request #34 from milo-trujillo/master
Fix minor documentation typos
ryanjgallagher committed Jun 20, 2022
2 parents 61f81ef + 9c6d267 commit d12379f
Showing 3 changed files with 3 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/cookbook/frequency_shifts.rst
@@ -47,7 +47,7 @@ Proportion shifts are easy to interpret, but they are simplistic and have a diff
H(P) = \sum_i p_i \log \frac{1}{p_i},
-where the factor :math:`-\log p_i` is the *surprisal* of a word. The less often a word appears in a text, the mor surprising that it is. The Shannon entropy can be interpreted as the average surprisal of a text. We can compare two texts by taking the difference between their entropies, :math:`H(P^{(2)}) - H(P^{(1)})`. When we do this, we can get the contribution :math:`\delta H_i` of each word to that difference:
+where the factor :math:`-\log p_i` is the *surprisal* of a word. The less often a word appears in a text, the more surprising that it is. The Shannon entropy can be interpreted as the average surprisal of a text. We can compare two texts by taking the difference between their entropies, :math:`H(P^{(2)}) - H(P^{(1)})`. When we do this, we can get the contribution :math:`\delta H_i` of each word to that difference:

.. math::
2 changes: 1 addition & 1 deletion docs/cookbook/getting_started.rst
@@ -21,4 +21,4 @@ Throughout this tutorial, we compare the speeches of two U.S. presidents, Lyndon

Necessary Data Structures
-------------------------
-We load the parsed text into two dictionaries: :code:`type2freq_1` (for Lyndon B. Johnson) and :code:`type2freq_2` (for George W. Bush). These are dictionaries where keys are word types and valules are their frequencies in each text. For many word shifts, this is the only input that is required.
+We load the parsed text into two dictionaries: :code:`type2freq_1` (for Lyndon B. Johnson) and :code:`type2freq_2` (for George W. Bush). These are dictionaries where keys are word types and values are their frequencies in each text. For many word shifts, this is the only input that is required.
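
The input described in this hunk, together with the entropy formula from the ``frequency_shifts.rst`` hunk above, can be sketched as follows. This is a minimal illustration under stated assumptions: the word counts are toy values, the helper names ``relative_freqs`` and ``shannon_entropy`` are hypothetical and not taken from the documented library, and the base-2 logarithm is an arbitrary choice.

```python
import math

# Toy word counts standing in for the parsed speeches (illustrative values only).
type2freq_1 = {"peace": 4, "war": 2, "people": 2}
type2freq_2 = {"peace": 1, "war": 1, "terror": 1, "people": 1}


def relative_freqs(type2freq):
    """Convert raw counts into relative frequencies p_i that sum to 1."""
    total = sum(type2freq.values())
    return {word: count / total for word, count in type2freq.items()}


def shannon_entropy(type2freq):
    """Average surprisal H(P) = sum_i p_i * log2(1 / p_i) (base 2 is an assumption)."""
    probs = relative_freqs(type2freq)
    return sum(p * math.log2(1 / p) for p in probs.values() if p > 0)


h1 = shannon_entropy(type2freq_1)
h2 = shannon_entropy(type2freq_2)
entropy_diff = h2 - h1  # H(P^(2)) - H(P^(1))
print(h1, h2, entropy_diff)
```

The second toy text spreads its counts evenly over more word types, so its average surprisal is higher and the difference is positive.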
2 changes: 1 addition & 1 deletion docs/cookbook/weighted_avg_shifts.rst
@@ -27,7 +27,7 @@ First consider the case where the word scores do not depend on the texts, i.e. :
where :math:`\Phi^{(ref)}` is a *reference score*.

-We use reference scores to distinguish between different regimes of interest in word scores. For example, in the case of sentiment analysis, we not ony know each word's score, we also know qualitatively whether that word is more or less positive. We know that "sunshine" is a relatively happy word and that "terror" is a relatively negative word. We can encode that qualitative knowledge into our word shift scores using the reference value. If our dictionary scores range from 1 to 9, we may set the reference value to :math:`\Phi^{(ref)} = 5`, the center of our scale. Or, we may take the average sentiment of our first text and set :math:`\Phi^{(ref)} = \Phi^{(1)}`, and so a word is relatively positive if it is even more positive than the overall sentiment of the first text.
+We use reference scores to distinguish between different regimes of interest in word scores. For example, in the case of sentiment analysis, we not only know each word's score, we also know qualitatively whether that word is more or less positive. We know that "sunshine" is a relatively happy word and that "terror" is a relatively negative word. We can encode that qualitative knowledge into our word shift scores using the reference value. If our dictionary scores range from 1 to 9, we may set the reference value to :math:`\Phi^{(ref)} = 5`, the center of our scale. Or, we may take the average sentiment of our first text and set :math:`\Phi^{(ref)} = \Phi^{(1)}`, and so a word is relatively positive if it is even more positive than the overall sentiment of the first text.

.. note::
The reference score can be set to any value that distinguishes between different score regimes of interest. It does not change the overall weighted average of a text.
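
The note's claim can be checked numerically with a small sketch. The per-word formula is elided from this hunk, so the form used here, ``(score - ref) * (p_2 - p_1)``, is an assumption, as are the toy scores, the frequencies, and the helper names ``weighted_avg`` and ``shift_contributions``.

```python
# Hypothetical word scores on a 1-9 scale and relative frequencies for two texts.
word2score = {"sunshine": 8.0, "terror": 2.0, "table": 5.5}
p_1 = {"sunshine": 0.5, "terror": 0.25, "table": 0.25}
p_2 = {"sunshine": 0.25, "terror": 0.5, "table": 0.25}


def weighted_avg(probs, scores):
    """Weighted average score of a text: sum_i p_i * phi_i."""
    return sum(probs[w] * scores[w] for w in probs)


def shift_contributions(p_a, p_b, scores, ref):
    """Assumed per-word contribution: (phi_i - ref) * (p_i^(2) - p_i^(1))."""
    return {w: (scores[w] - ref) * (p_b[w] - p_a[w]) for w in scores}


avg_1 = weighted_avg(p_1, word2score)
avg_2 = weighted_avg(p_2, word2score)

# Two choices of reference: the scale center, and the first text's average.
total_ref_center = sum(shift_contributions(p_1, p_2, word2score, 5.0).values())
total_ref_avg1 = sum(shift_contributions(p_1, p_2, word2score, avg_1).values())

print(avg_2 - avg_1, total_ref_center, total_ref_avg1)
```

Under that assumed form, the individual contributions change with the reference value, but their sum equals the difference in weighted averages for both choices, because the relative frequencies of each text sum to 1.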