DAGnosis: Localized Identification of Data Inconsistencies using Structures

This repository accompanies the AISTATS'24 paper: "DAGnosis: Localized Identification of Data Inconsistencies using Structures".

Usage

We suggest creating a new environment before using the code, e.g. with:

conda create --name dagnosis python=3.10

Then install the package from source, from the repository root:

pip install .

Synthetic

We illustrate DAGnosis in a synthetic setup using the files in the folder experiments/synthetic. The bash scripts run_linear.sh and run_mlp.sh run the full pipeline (generating the data, training the conformal estimators, and testing the conformal estimators) for linear and MLP SEMs, respectively. Both scripts must be run from inside the experiments/synthetic directory.
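To make the three pipeline stages concrete, here is a minimal sketch of the same flow on a toy two-node linear SEM (X -> Y) with split conformal prediction. It is not the repository's implementation: the function names (generate, fit_slope, conformal_radius, is_inconsistent), the SEM coefficients, and the miscoverage level alpha are all illustrative assumptions, and only the Python standard library is used.

```python
import math
import random


def generate(n, rng, slope=2.0, noise=0.1):
    """Sample n points from a toy linear SEM X -> Y: Y = slope * X + noise."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = slope * x + rng.gauss(0.0, noise)
        data.append((x, y))
    return data


def fit_slope(data):
    """Least-squares slope (no intercept) for Y given its parent X."""
    num = sum(x * y for x, y in data)
    den = sum(x * x for x, _ in data)
    return num / den


def conformal_radius(data, slope_hat, alpha):
    """Split-conformal quantile of absolute residuals on held-out calibration data."""
    scores = sorted(abs(y - slope_hat * x) for x, y in data)
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    # Simplification: if k exceeds n the exact radius is infinite; we cap at the max score.
    return scores[min(k, n) - 1]


def is_inconsistent(sample, slope_hat, radius):
    """Flag a sample whose Y falls outside the interval predicted from its parent X."""
    x, y = sample
    return abs(y - slope_hat * x) > radius


rng = random.Random(0)
train = generate(500, rng)   # stage 1: generate the data
calib = generate(500, rng)
slope_hat = fit_slope(train)  # stage 2: train the conformal estimator
radius = conformal_radius(calib, slope_hat, alpha=0.1)

clean = (1.0, 2.0)       # follows Y = 2X
corrupted = (1.0, 5.0)   # violates the structural relation
print(is_inconsistent(clean, slope_hat, radius))      # stage 3: test
print(is_inconsistent(corrupted, slope_hat, radius))
```

In this sketch, a sample is flagged only when a node's value is implausible given its parent, which is the sense in which the detection is localized to one part of the structure.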

To compute the inconsistency detection metrics (F1, Precision, Recall), go to the folder experiments/synthetic and run:

python compute_metrics.py PATH_SAVE_METRIC=path_metrics

where path_metrics denotes the folder where the metrics are saved.
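For reference, the three detection metrics can be computed from binary labels as in the sketch below. This is an illustration of the standard definitions, not the contents of compute_metrics.py; the function name detection_metrics and the example labels are hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary inconsistency flags.

    y_true[i] is 1 if sample i is truly inconsistent, y_pred[i] is 1 if it was flagged.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    # Guard against empty denominators (no flags, or no true inconsistencies).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical ground truth and flags for eight test samples.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
flags = [1, 1, 0, 1, 0, 0, 0, 1]
precision, recall, f1 = detection_metrics(truth, flags)
print(precision, recall, f1)
```

Here three of the four flagged samples are truly inconsistent and three of the four inconsistent samples are flagged, so all three metrics equal 0.75.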

To reproduce the sensitivity experiment, go to the folder experiments/synthetic/sensitivity, run the script run.sh, and then run:

python compute_metrics.py PATH_SAVE_METRIC=path_metrics

UCI Adult Income

To run the experiments on the UCI Adult Income dataset, go to the folder experiments/adult. To train and test the conformal estimators, run:

python train_test_adult.py

The artifacts are saved in the folder artifacts_adult. To obtain the results, run:

python proportion_flagging.py

This prints the list of downstream accuracies and the proportions of samples flagged (Figure 3a and 3b).

Citing

If you use this software, please cite the original paper:

TODO

