Confidence Intervals for Difference of Binomial Proportions

Computation of confidence intervals for binomial proportions and for difference of binomial proportions.

[GitHub Pages] [Read the Docs]

🚀 NEW 🚀 Streamlit support! See here for an app deployed on Streamlit Community Cloud.

Installation

Run

python -m pip install diff-binom-confint

or install the latest version from GitHub using

python -m pip install git+https://github.com/DeepPSP/DBCI.git

or git clone this repository and install locally via

cd DBCI
python -m pip install .

Numba-accelerated version

Install using

python -m pip install "diff-binom-confint[acc]"

Usage examples

from diff_binom_confint import compute_difference_confidence_interval

n_positive, n_total = 84, 101
ref_positive, ref_total = 89, 105

confint = compute_difference_confidence_interval(
    n_positive,
    n_total,
    ref_positive,
    ref_total,
    conf_level=0.95,
    method="wilson",
)
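
To give a feel for what a score-based interval for a difference of proportions looks like, the sketch below implements Newcombe's hybrid score construction (Reference 7) using only the standard library. It is an illustration, not this package's code: the helper names `wilson_interval` and `newcombe_difference_interval` are hypothetical, and it is an assumption that `method="wilson"` corresponds to this construction.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(n_positive, n_total, conf_level=0.95):
    """Wilson score interval for a single binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p = n_positive / n_total
    denom = 1 + z**2 / n_total
    center = (p + z**2 / (2 * n_total)) / denom
    half_width = z * sqrt(p * (1 - p) / n_total + z**2 / (4 * n_total**2)) / denom
    return center - half_width, center + half_width


def newcombe_difference_interval(
    n_positive, n_total, ref_positive, ref_total, conf_level=0.95
):
    """Hybrid score interval for p - p_ref, built from two Wilson intervals."""
    p, p_ref = n_positive / n_total, ref_positive / ref_total
    lower, upper = wilson_interval(n_positive, n_total, conf_level)
    ref_lower, ref_upper = wilson_interval(ref_positive, ref_total, conf_level)
    delta = p - p_ref
    return (
        delta - sqrt((p - lower) ** 2 + (ref_upper - p_ref) ** 2),
        delta + sqrt((upper - p) ** 2 + (p_ref - ref_lower) ** 2),
    )


# Same counts as in the usage example above.
lower, upper = newcombe_difference_interval(84, 101, 89, 105)
print(f"difference: {84 / 101 - 89 / 105:.4f}, 95% CI: ({lower:.4f}, {upper:.4f})")
```

Swapping the two groups negates the interval, which is a quick sanity check when comparing hand-rolled results against the package output.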

Implemented methods

Confidence intervals for binomial proportions

| Method (type) | Implemented |
| --- | --- |
| wilson | ✔️ |
| wilson-cc | ✔️ |
| wald | ✔️ |
| wald-cc | ✔️ |
| agresti-coull | ✔️ |
| jeffreys | ✔️ |
| clopper-pearson | ✔️ |
| arcsine | ✔️ |
| logit | ✔️ |
| pratt | ✔️ |
| witting | ✔️ |
| mid-p | ✔️ |
| lik | ✔️ |
| blaker | ✔️ |
| modified-wilson | ✔️ |
| modified-jeffreys | ✔️ |

Confidence intervals for difference of binomial proportions

| Method (type) | Implemented |
| --- | --- |
| wilson | ✔️ |
| wilson-cc | ✔️ |
| wald | ✔️ |
| wald-cc | ✔️ |
| haldane | ✔️ |
| jeffreys-perks | ✔️ |
| mee | ✔️ |
| miettinen-nurminen | ✔️ |
| true-profile | ✔️ |
| hauck-anderson | ✔️ |
| agresti-caffo | ✔️ |
| carlin-louis | ✔️ |
| brown-li | ✔️ |
| brown-li-jeffrey | ✔️ |
| miettinen-nurminen-brown-li | ✔️ |
| exact | ❌ |
| mid-p | ❌ |
| santner-snell | ❌ |
| chan-zhang | ❌ |
| agresti-min | ❌ |
| wang | ❌ |
| pradhan-banerjee | ❌ |

Creating report

One can use the make_risk_report function to create a report of the confidence intervals for difference of binomial proportions.

from diff_binom_confint import make_risk_report

# df_train and df_test are pandas.DataFrame objects providing the data
table = make_risk_report((df_train, df_test), target="binary_target")
# or, if df_data is a pandas.DataFrame containing both training and testing data
table = make_risk_report(df_data, target="binary_target")

For more details, see the corresponding documentation. The produced table is similar to the following:

[Image: risk report]

References

  1. SAS
  2. PASS
  3. statsmodels.stats.proportion
  4. scipy.stats._binomtest
  5. corplingstats
  6. DescTools.StatsAndCIs
  7. Newcombe

NOTE

Reference 1 has errors in its description of the Wilson CC, Mee, and Miettinen-Nurminen methods. The correct computation of Wilson CC is given in Reference 5. The correct computations of Mee and Miettinen-Nurminen are given in the code blocks in Reference 1.

Test data

Test data are

  1. taken from Reference 1, with slight modifications (e.g. the upper_bound of the miettinen-nurminen-brown-li method in the edge case file), and used to automatically test the correctness of the implemented algorithms.

  2. generated using DescTools.StatsAndCIs via

    library("DescTools")
    library("data.table")
    
    results = data.table()
    for (m in c("wilson", "wald", "waldcc", "agresti-coull", "jeffreys",
                    "modified wilson", "wilsoncc", "modified jeffreys",
                    "clopper-pearson", "arcsine", "logit", "witting", "pratt",
                    "midp", "lik", "blaker")){
        ci = BinomCI(84,101,method = m)
        new_row = data.table("method" = m, "ratio"=ci[1], "lower_bound" = ci[2], "upper_bound" = ci[3])
        results = rbindlist(list(results, new_row))
    }
    fwrite(results, "./test/test-data/example-84-101.csv")  # with manual slight adjustment of method names

  3. taken from Reference 7 (Table II).

The filenames have the following patterns:

# for computing confidence interval for difference of binomial proportions
"example-(?P<n_positive>[\\d]+)-(?P<n_total>[\\d]+)-vs-(?P<ref_positive>[\\d]+)-(?P<ref_total>[\\d]+)\\.csv"

# for computing confidence interval for binomial proportions
"example-(?P<n_positive>[\\d]+)-(?P<n_total>[\\d]+)\\.csv"
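
As a rough illustration of how these patterns can be used, the sketch below recovers the counts from a filename. The helper `parse_test_filename` is hypothetical and not part of the package; the patterns are the ones above, written as raw strings with `\d+` in place of `[\\d]+`. The first filename is made up to fit the pattern.

```python
import re

# Pattern for files on the difference of two binomial proportions.
DIFF_PATTERN = re.compile(
    r"example-(?P<n_positive>\d+)-(?P<n_total>\d+)"
    r"-vs-(?P<ref_positive>\d+)-(?P<ref_total>\d+)\.csv"
)
# Pattern for files on a single binomial proportion.
SINGLE_PATTERN = re.compile(r"example-(?P<n_positive>\d+)-(?P<n_total>\d+)\.csv")


def parse_test_filename(filename):
    """Return the counts encoded in a filename as ints, or None if nothing matches."""
    # fullmatch requires the whole name to fit, so the two patterns cannot overlap.
    for pattern in (DIFF_PATTERN, SINGLE_PATTERN):
        match = pattern.fullmatch(filename)
        if match:
            return {name: int(value) for name, value in match.groupdict().items()}
    return None


print(parse_test_filename("example-84-101-vs-89-105.csv"))
print(parse_test_filename("example-84-101.csv"))
```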

Note that the out-of-range values (e.g. > 1) are left as empty values in the .csv files.
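
When reading such a file, the empty cells need explicit handling. The sketch below is one way to do it with the standard library, assuming the column names written by the R snippet above (`method`, `ratio`, `lower_bound`, `upper_bound`); the function `read_expected_bounds` is hypothetical, and the sample rows are made up for illustration rather than taken from the test data.

```python
import csv
import io


def read_expected_bounds(csv_file):
    """Map each method name to (lower_bound, upper_bound); empty cells become None."""

    def to_float(cell):
        # An empty cell marks a value that was out of range.
        return float(cell) if cell.strip() else None

    return {
        row["method"]: (to_float(row["lower_bound"]), to_float(row["upper_bound"]))
        for row in csv.DictReader(csv_file)
    }


# Made-up rows for illustration only.
sample = io.StringIO(
    "method,ratio,lower_bound,upper_bound\n"
    "method-a,0.5,0.4,0.6\n"
    "method-b,0.99,0.95,\n"
)
print(read_expected_bounds(sample))
```

A test comparing computed bounds against these values would then skip, or treat separately, any bound that comes back as `None`.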

Known Issues

  1. Results are incorrect in edge cases for the true-profile method.
