[installation] fix installation and some name typo #146

Merged — 4 commits merged on Oct 21, 2024 (showing changes from 3 commits).
2 changes: 1 addition & 1 deletion .github/workflows/examples_test.yml
@@ -26,7 +26,7 @@ jobs:
sed -i 's/range(1000)/range(100)/g' examples/noisy_label_detection/trak_noisy_label.py
python examples/noisy_label_detection/trak_noisy_label.py --device cpu
python examples/pretrained_benchmark/influence_function_lds.py --device cpu
- python examples/pretrained_benchmark/trak_lds.py --device cpu
+ python examples/pretrained_benchmark/trak_loo.py --device cpu
python examples/brittleness/mnist_lr_brittleness.py --method cg --device cpu
- name: Uninstall the package
run: |
13 changes: 7 additions & 6 deletions README.md
@@ -26,20 +26,20 @@ git clone https://github.com/TRAIS-Lab/dattri
pip install -e .
```

- If you want to use all features on CUDA and accelerate the library, you may install the full version by
+ If you want to use `fast_jl` to accelerate the random projection, you may install the full version by
Contributor review comment: "the full version" -> "the version with fast_jl"

```bash
- pip install -e .[all]
+ pip install -e .[fast_jl]
```

> [!NOTE]
- > It's highly recommended to use a device support CUDA to run `dattri`, especially for moderately large or larger models or datasets. And it's required to have CUDA if you want to install the full version `dattri`.
+ > It's highly recommended to use a device support CUDA to run `dattri`, especially for moderately large or larger models or datasets.

> [!NOTE]
- > If you are using `dattri[all]`, please use `pip<23` and `torch<2.3` due to some known issue of `fast_jl` library.
+ > It's required to have CUDA if you want to install and use the fast_jl version `dattri[fast_jl]` to accelerate the random projection. The projection is mainly used in `TRAKAttributor`. Please use `pip<23` and `torch<2.3` due to some known issue of `fast_jl` library.
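The note above says `fast_jl` accelerates the random projection used mainly in `TRAKAttributor`. As a rough illustration of what such a projection does, here is a minimal CPU sketch of a Johnson-Lindenstrauss-style random projection in NumPy; the function name `random_project` and the toy shapes are assumptions for illustration, not dattri's or fast_jl's API.

```python
import numpy as np

def random_project(features, proj_dim, seed=0):
    """Sketch of a JL-style random projection: multiply per-sample
    feature (e.g. gradient) vectors by a fixed Gaussian matrix to
    shrink their dimension while approximately preserving inner
    products. fast_jl performs this kind of operation on CUDA."""
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    proj = rng.standard_normal((d, proj_dim)) / np.sqrt(proj_dim)
    return features @ proj

# 8 toy samples with 1000-dimensional features, projected down to 32
grads = np.random.default_rng(1).standard_normal((8, 1000))
projected = random_project(grads, proj_dim=32)
print(projected.shape)  # (8, 32)
```

With a fixed seed the projection matrix is deterministic, so repeated calls give identical outputs, which matters when projections of training and test gradients must live in the same subspace.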

#### Recommended enviroment setup
- It's not required to follow the exact same steps in this section. But this is a verified environment setup flow that may help users to avoid most of the issues during the installation.
+ It's **not** required to follow the exact same steps in this section. But this is a verified environment setup flow that may help users to avoid most of the issues during the installation.

```bash
conda create -n dattri python=3.10
@@ -49,7 +49,7 @@ conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
pip3 install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu118

git clone https://github.com/TRAIS-Lab/dattri
- pip install -e .[all]
+ pip install -e .[fast_jl]
```

### Apply data attribution methods on PyTorch models
@@ -171,6 +171,7 @@ model = activate_dropout(model, ["dropout1", "dropout2"], dropout_prob=0.2)
```
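The `activate_dropout` call shown in the hunk above enables specific dropout layers by name. Below is a hedged, framework-agnostic sketch of what such a helper may do; the `Dropout` stand-in class and the `activate_dropout_sketch` name are invented for illustration and are not dattri's implementation.

```python
class Dropout:
    """Stand-in for a framework dropout module (e.g. torch.nn.Dropout):
    holds a drop probability and a training flag."""
    def __init__(self, p=0.5):
        self.p = p
        self.training = False

def activate_dropout_sketch(named_modules, layer_names, dropout_prob):
    # Switch only the named dropout modules into training mode with the
    # requested probability, leaving every other module untouched, so
    # stochastic predictions can be sampled even at inference time.
    for name, module in named_modules.items():
        if name in layer_names and isinstance(module, Dropout):
            module.p = dropout_prob
            module.training = True
    return named_modules

modules = {"dropout1": Dropout(), "dropout2": Dropout(), "fc": object()}
modules = activate_dropout_sketch(
    modules, ["dropout1", "dropout2"], dropout_prob=0.2
)
print(modules["dropout1"].p)  # 0.2
```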

## Algorithms Supported
+ We have implemented most of the state-of-the-art methods. The categories and reference paper of the algorithms are listed in the following table.
| Family | Algorithms |
|:------:|:-------------------------------------:|
| [IF](https://arxiv.org/abs/1703.04730) | [Explicit](https://arxiv.org/abs/1703.04730) |
6 changes: 5 additions & 1 deletion dattri/metric/ground_truth.py
@@ -160,7 +160,8 @@ def target_func(ckpt_path, dataloader):
target function calculated on all test samples under `num_subsets` models,
each retrained on a subset of the training data. The second tensor has the
shape (num_subsets, subset_size), where each row refers to the indices of
- the training samples used to retrain the model.
+ the training samples used to retrain the model. The targeted value will be
+ flipped to be consistent with the score calculated by the attributors.

Contributor review comment: "targeted" -> "target"
"""
retrain_dir = Path(retrain_dir)

@@ -186,4 +187,7 @@ def target_func(ckpt_path, dataloader):
target_values[i] += target_func(ckpt_path, test_dataloader)
target_values /= num_runs_per_subset

+ # flip the target values
+ target_values = -target_values

return target_values, indices
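The sign flip added in this hunk can be illustrated with a toy example (numbers made up): lower-is-better target values, such as losses from retrained models, are negated so that larger values point the same way as attribution scores, where larger means more helpful training data.

```python
# Toy ground-truth target values: e.g. average test loss under models
# retrained on 2 different training subsets, for 3 test samples each.
target_values = [
    [0.9, 1.2, 0.7],
    [0.8, 1.1, 0.6],
]

# Flip the sign so the ground truth ranks in the same direction as the
# scores produced by the attributors (higher = more beneficial).
flipped = [[-v for v in row] for row in target_values]
print(flipped[0])  # [-0.9, -1.2, -0.7]
```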
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -21,13 +21,14 @@ dependencies = [
"numpy>=1.25",
"scipy>=1.11",
"pyyaml",
+ "pretty_midi"
]

[project.urls]
homepage = "https://github.com/TRAIS-Lab/dattri"

[project.optional-dependencies]
- all = ["fast_jl"]
+ fast_jl = ["fast_jl"]
test = ["build", "pytest", "pre-commit", "ruff", "darglint", "scikit-learn", "pretty_midi", "requests"]

[tool.setuptools.packages]