Skip to content

Commit

Permalink
Merge pull request #18 from Ciela-Institute/contingencytable
Browse files Browse the repository at this point in the history
Add unit tests, refactor for scipy contingency table
  • Loading branch information
ConnorStoneAstro authored Nov 6, 2024
2 parents 3f4d352 + a50adf3 commit 9b17829
Show file tree
Hide file tree
Showing 11 changed files with 592 additions and 703 deletions.
6 changes: 4 additions & 2 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ jobs:
build:
runs-on: ${{matrix.os}}
strategy:
fail-fast: false
fail-fast: true
matrix:
python-version: ["3.9", "3.10", "3.11"]
os: [ubuntu-latest, windows-latest, macOS-latest]
Expand Down Expand Up @@ -68,4 +68,6 @@ jobs:
pytest -vvv --cov=${{ env.PROJECT_NAME }} --cov-report=xml --cov-report=term tests/
- name: Upload coverage reports to Codecov with GitHub Action
uses: codecov/codecov-action@v4
uses: codecov/codecov-action@v4
with:
token: ${{ secrets.CODECOV_TOKEN }}
3 changes: 3 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
[![CI](https://github.com/Ciela-Institute/PQM/actions/workflows/ci.yml/badge.svg)](https://github.com/Ciela-Institute/PQM/actions/workflows/ci.yml)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
![PyPI - Downloads](https://img.shields.io/pypi/dm/pqm)
[![codecov](https://codecov.io/gh/Ciela-Institute/PQM/graph/badge.svg?token=wbkUiRkYtg)](https://codecov.io/gh/Ciela-Institute/PQM)
[![arXiv](https://img.shields.io/badge/arXiv-2402.04355-b31b1b.svg)](https://arxiv.org/abs/2402.04355)

[PQMass](https://arxiv.org/abs/2402.04355) is a new sample-based method for evaluating the quality of generative models as well as assessing distribution shifts to determine if two datasets come from the same underlying distribution.
Expand All @@ -22,9 +23,11 @@ PQMass takes in $x$ and $y$ two datasets and determines if they come from the sa

![Headline plot showing an example tessellation for PQMass](media/Voronoi.png "")
PQMass partitions the space by taking reference points from $x$ and $y$ and creating Voronoi tessellations around the reference points. On the left is an example of one such region, which we note follows a Binomial Distribution; the samples are either inside or outside the region. On the right is the entire space partitioned, allowing us to see that this is a multinomial distribution, a given sample can be in region P or any other region. This is crucial as it allows for two metrics to be defined that can be used to determine if $x$ and $y$ come from the same underlying distribution. The first is the $\chi_{PQM}^2$

$$\chi_{PQM}^2 \equiv \sum_{i = 1}^{n_R} \left[ \frac{(k({\bf x}, R_i) - \hat{N}_{x, i})^2}{\hat{N}_{x, i}} + \frac{(k({\bf y}, R_i) - \hat{N}_{y, i})^2}{\hat{N}_{y, i}} \right]$$

and the second is the $\text{p-value}(\chi_{PQM}^2)$

$$\text{p-value}(\chi_{PQM}^2) \equiv \int_{-\infty}^{\chi^2_{\rm {PQM}}} \chi^2_{n_R - 1}(z) dz$$

For $\chi_{PQM}^2$ metric, given your two sets of samples, if they come from the same
Expand Down
139 changes: 23 additions & 116 deletions notebooks/mnist.ipynb

Large diffs are not rendered by default.

76 changes: 40 additions & 36 deletions notebooks/test.ipynb

Large diffs are not rendered by default.

64 changes: 31 additions & 33 deletions notebooks/time_series.ipynb

Large diffs are not rendered by default.

Loading

0 comments on commit 9b17829

Please sign in to comment.