updated readme, setup.py, and docs in anticipation of 1.1.1 release
rishi-kulkarni committed Jun 11, 2021
1 parent cab235d commit 1b41aa6
Showing 6 changed files with 41 additions and 27 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -2,7 +2,7 @@

## A Hierarchical Resampling Package for Python

Version 1.1
Version 1.1.1

hierarch is a package for hierarchical resampling (bootstrapping, permutation) of datasets in Python. Because for loops are ultimately intrinsic to cluster-aware resampling, hierarch uses Numba to accelerate many of its key functions.
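The cluster-aware resampling that hierarch accelerates with Numba can be illustrated in pure Python. This is a minimal two-level sketch using only the standard library; the function name and the nested-list data layout are illustrative, not hierarch's API:

```python
import random

def cluster_bootstrap(data, rng):
    """Two-level cluster bootstrap: resample clusters with replacement,
    then resample observations within each chosen cluster."""
    clusters = rng.choices(data, k=len(data))
    return [rng.choices(obs, k=len(obs)) for obs in clusters]

rng = random.Random(0)
# Three clusters (e.g. biological replicates) with technical replicates inside.
data = [[1.0, 1.2, 0.9], [2.1, 1.9], [0.5, 0.7, 0.6, 0.4]]
sample = cluster_bootstrap(data, rng)
assert len(sample) == len(data)
```

Resampling at every level of the hierarchy, rather than pooling all observations, is what keeps the bootstrap faithful to the clustered error structure.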

17 changes: 10 additions & 7 deletions docs/user/confidence.rst
@@ -65,7 +65,7 @@ confidence interval. ::

from hierarch.stats import confidence_interval

ha.stats.confidence_interval(
confidence_interval(
data,
treatment_col=0,
compare='means',
@@ -84,7 +84,7 @@ Because ha.stats.confidence_interval is based on a hypothesis test, it requires
the same input parameters as hypothesis_test. However,
the new **interval** parameter determines the width of the interval. ::

ha.stats.confidence_interval(
confidence_interval(
data,
treatment_col=0,
compare='means',
@@ -96,7 +96,7 @@ the new **interval** parameter determines the width of the interval. ::

(-0.9086402840632387, 0.25123067872990457)

ha.stats.confidence_interval(
confidence_interval(
data,
treatment_col=0,
compare='means',
@@ -141,7 +141,8 @@ this value. You can test this with the following code. ::

for i in range(loops):
data = sim.generate()
lower, upper = ha.stats.confidence_interval(data, 0, interval=95, bootstraps=100, permutations='all')
lower, upper = confidence_interval(data, 0, interval=95,
bootstraps=100, permutations='all')
if lower <= true_difference <= upper:
coverage += 1
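The same coverage-checking idea can be shown self-contained, using a plain normal-theory interval for an ordinary sample mean in place of hierarch's bootstrapped interval (a simplifying stand-in, stdlib only):

```python
import math
import random
import statistics

def mean_ci(xs, z=1.96):
    """Plain normal-theory 95% confidence interval for a sample mean."""
    m = statistics.fmean(xs)
    se = statistics.stdev(xs) / math.sqrt(len(xs))
    return m - z * se, m + z * se

rng = random.Random(1)
true_mean, loops, coverage = 0.0, 500, 0
for _ in range(loops):
    xs = [rng.gauss(true_mean, 1.0) for _ in range(30)]
    lower, upper = mean_ci(xs)
    if lower <= true_mean <= upper:
        coverage += 1
print(coverage / loops)  # close to the nominal 0.95
```

The logic is identical to the loop above: simulate under a known truth, then count how often the interval captures it.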

@@ -223,7 +224,7 @@ for **compare** when computing a confidence interval. ::

from hierarch.stats import confidence_interval

ha.stats.confidence_interval(
confidence_interval(
data,
treatment_col=0,
compare='corr',
@@ -260,7 +261,8 @@ set up a simulation as above to check the coverage of the 95% confidence interval

for i in range(loops):
data = datagen.generate()
lower, upper = ha.stats.confidence_interval(data, 0, interval=95, bootstraps=100, permutations='all')
lower, upper = confidence_interval(data, 0, interval=95,
bootstraps=100, permutations='all')
if lower <= true_difference <= upper:
coverage += 1

@@ -279,7 +281,8 @@ interest. ::

for i in range(loops):
data = datagen.generate()
lower, upper = ha.stats.confidence_interval(data, 0, interval=99, bootstraps=100, permutations='all')
lower, upper = confidence_interval(data, 0, interval=99,
bootstraps=100, permutations='all')
if lower <= true_difference <= upper:
coverage += 1

29 changes: 18 additions & 11 deletions docs/user/hypothesis.rst
@@ -74,7 +74,9 @@ column - in this case, "Condition." Indexing starts at 0, so you input
treatment_col=0. In this case, there are only 6c3 = 20 ways to permute the
treatment labels, so you should specify that "all" permutations be used. ::

p_val = ha.stats.hypothesis_test(data, treatment_col=0, compare='means',
from hierarch.stats import hypothesis_test

p_val = hypothesis_test(data, treatment_col=0, compare='means',
bootstraps=500, permutations='all',
random_state=1)
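The exact enumeration the text describes (6 choose 3 = 20 label assignments) can be sketched for an ordinary two-sample comparison in pure Python. This is only a conceptual illustration; hierarch's implementation additionally respects the hierarchical structure and is Numba-accelerated:

```python
from itertools import combinations
from statistics import fmean

def exact_permutation_test(group_a, group_b):
    """Enumerate every way to reassign the pooled values to two groups
    and count how often the mean difference is at least as extreme."""
    pooled = group_a + group_b
    observed = abs(fmean(group_a) - fmean(group_b))
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(group_a)):
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        if abs(fmean(a) - fmean(b)) >= observed:
            extreme += 1
    return extreme / total

# Two groups of three: 6 choose 3 = 20 assignments to enumerate.
p = exact_permutation_test([1.2, 3.1, 2.5], [4.2, 5.0, 4.8])
print(p)  # 0.1
```

With so few possible relabelings, the smallest achievable p-value is 1/20, which is why exact enumeration is only sensible for small designs.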

@@ -84,20 +86,23 @@ treatment labels, so you should specify "all" permutations be used. ::

There are a number of parameters that can be used to modify hypothesis_test. ::

ha.stats.hypothesis_test(data_array,
treatment_col,
compare="means",
skip=None,
bootstraps=100,
permutations=1000,
kind='weights',
return_null=False,
random_state=None)
hypothesis_test(data_array,
treatment_col,
compare="means",
skip=None,
bootstraps=100,
permutations=1000,
kind='weights',
return_null=False,
random_state=None)

**compare**: The default "means" assumes that you are testing for a difference in means, so it uses the Welch t-statistic.
"corr" uses a studentized covariance based test statistic which gives the same result as the Welch t-statistic for two-sample
datasets, but can be used on datasets with any number of related treatment groups. For flexibility, hypothesis_test can
also take a test statistic function as an argument.
also take a test statistic function as an argument.

**alternative** : "two-sided" or "less" or "greater" specifies the alternative hypothesis. "two-sided" conducts
a two-tailed test, while "less" or "greater" conduct the appropriate one-tailed test.

**skip**: indicates the indices of columns that should be skipped in the bootstrapping procedure.

@@ -228,6 +233,8 @@ treatment 2 represents a slight difference and treatment 4 represents a large difference
There are six total comparisons that can be made, which can be performed automatically
using multi_sample_test as follows. ::

from hierarch.stats import multi_sample_test

multi_sample_test(data, treatment_col=0, hypotheses="all",
correction=None, bootstraps=1000,
permutations="all", random_state=111)
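The **correction** argument refers to a multiple-comparison adjustment applied across the six hypotheses. As a generic illustration only (Bonferroni shown here; which corrections hierarch itself offers is a detail of its API, not asserted by this sketch):

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: scale each p-value by the number of tests,
    capping the result at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

raw = [0.01, 0.02, 0.2, 0.5]
adjusted = bonferroni(raw)  # approximately [0.04, 0.08, 0.8, 1.0]
```

Any such correction trades power for control of the family-wise error rate when several comparisons are reported together.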
6 changes: 4 additions & 2 deletions docs/user/overview.rst
@@ -51,5 +51,7 @@ Here is the sort of data that hierarch is designed to perform hypothesis tests on

The code to perform a hierarchical permutation t-test on this dataset looks like::

hierarch.stats.hypothesis_test(data, treatment_col=0,
bootstraps=1000, permutations='all')
from hierarch.stats import hypothesis_test

hypothesis_test(data, treatment_col=0,
bootstraps=1000, permutations='all')
12 changes: 7 additions & 5 deletions docs/user/power.rst
@@ -91,11 +91,13 @@ permutations (though this is overkill in the 2, 3, 3 case) on each
of 100 simulated datasets and prints the fraction of them that return
a significant result, assuming a p-value cutoff of 0.05. ::

from hierarch.stats import hypothesis_test

pvalues = []
loops = 100
for i in range(loops):
data = sim.generate()
pvalues.append(ha.stats.hypothesis_test(data, 0, bootstraps=500, permutations=100))
pvalues.append(hypothesis_test(data, 0, bootstraps=500, permutations=100))
print(np.less(pvalues, 0.05).sum() / loops)
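The power loop above depends on hierarch's simulation utilities. The same idea in a self-contained form, with a plain Welch t-statistic and a fixed critical value standing in for the full hierarchical test (both are simplifying assumptions for illustration):

```python
import math
import random
import statistics

def welch_t(a, b):
    """Welch t-statistic for two independent samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va / len(a) + vb / len(b))

rng = random.Random(42)
loops, hits = 200, 0
for _ in range(loops):
    control = [rng.gauss(0.0, 1.0) for _ in range(15)]
    treated = [rng.gauss(1.0, 1.0) for _ in range(15)]
    if abs(welch_t(control, treated)) > 2.05:  # ~0.05 two-sided cutoff near 28 df
        hits += 1
print(hits / loops)  # estimated power
```

The fraction of simulated datasets that clear the cutoff estimates power under the assumed effect size, exactly as in the hierarch-based loop.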

@@ -111,7 +113,7 @@ you determine the column 1 sample size that achieves at least 80% power. ::
loops = 100
for i in range(loops):
data = sim.generate()
pvalues.append(ha.stats.hypothesis_test(data, 0, bootstraps=500, permutations=100))
pvalues.append(hypothesis_test(data, 0, bootstraps=500, permutations=100))
print(np.less(pvalues, 0.05).sum() / loops)

@@ -134,7 +136,7 @@ achieved with an experimental design that makes more column 2 measurements. ::
loops = 100
for i in range(loops):
data = sim.generate()
pvalues.append(ha.stats.hypothesis_test(data, 0, bootstraps=500, permutations=100))
pvalues.append(hypothesis_test(data, 0, bootstraps=500, permutations=100))
print(np.less(pvalues, 0.05).sum() / loops)

@@ -154,7 +156,7 @@ only 30 column 2 samples. ::
loops = 100
for i in range(loops):
data = sim.generate()
pvalues.append(ha.stats.hypothesis_test(data, 0, bootstraps=500, permutations=100))
pvalues.append(hypothesis_test(data, 0, bootstraps=500, permutations=100))
print(np.less(pvalues, 0.05).sum() / loops)
@@ -180,7 +182,7 @@ the error for an event that happens with 5% probability is +/- 2%, but at
loops = 1000
for i in range(loops):
data = sim.generate()
pvalues.append(ha.stats.hypothesis_test(data, 0, bootstraps=500, permutations=100))
pvalues.append(hypothesis_test(data, 0, bootstraps=500, permutations=100))
print(np.less(pvalues, 0.05).sum() / loops)
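The +/- 2% figure follows from the binomial standard error of a simulated proportion; a quick stdlib check:

```python
import math

def simulation_se(p, loops):
    """Binomial standard error of a proportion estimated from `loops` trials."""
    return math.sqrt(p * (1 - p) / loops)

# A 5% event estimated over 100 loops is uncertain by roughly +/- 2%...
print(round(simulation_se(0.05, 100), 3))
# ...and 1000 loops shrinks that error by a factor of sqrt(10).
print(round(simulation_se(0.05, 1000), 3))
```

This is why the text increases the number of simulation loops when a more precise power estimate is needed.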

2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@

setuptools.setup(
name="hierarch",
version="1.1.0",
version="1.1.1",
author="Rishi Kulkarni",
author_email="[email protected]",
description="Hierarchical hypothesis testing library",
