
[dattri.algorithm, dattri.func] Refactor the implementation of EKFAC #143

Merged: 20 commits merged into TRAIS-Lab:main on Oct 27, 2024

Conversation

@sx-liu (Collaborator) commented Sep 17, 2024

Description

1. Motivation and Context

The current implementation of EK-FAC is outdated. This PR refactors the EK-FAC attribution to follow the format of the other IF attributors, and updates the hook mechanism to avoid redundancy.

2. Summary of the change

  1. Refactor the EK-FAC base functions; add estimate_covariance, estimate_eigenvector and estimate_lambda for direct use by users.
  2. Remove ifvp_at_x_ekfac; add IFAttributorEKFAC instead for attribution.
  3. Remove the MLPCache class and the manual_cache_forward function, and use torch.nn.Module.register_forward_hook instead (a minimal sketch of the hook mechanism follows this list).
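
As referenced in item 3, a minimal sketch of hook-based caching (illustrative only; layer_cache mirrors the diff reviewed below, while the model and helper names are hypothetical):

import torch

layer_cache = {}

def make_hook(name):
    # Forward hooks receive (module, inputs, outputs); `inputs` is the tuple
    # of positional arguments passed to module.forward().
    def _hook(module, inputs, outputs):
        layer_cache[name] = (inputs, outputs)
    return _hook

model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 2))
handles = [
    module.register_forward_hook(make_hook(name))
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear)
]

model(torch.randn(3, 4))  # one forward pass populates layer_cache
for handle in handles:    # remove the hooks once caching is done
    handle.remove()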

3. What tests have been added/updated for the change?

  • N/A: No test will be added (please justify)
  • Unit test: Typically, this should be included if you implemented a new function/fixed a bug.
  • Application test: If you wrote an example for the toolkit, this test should be added.
  • Document test: If you added an external API, then you should check if the document is correctly generated.
  • ...

@@ -4,6 +4,8 @@

from typing import TYPE_CHECKING

from dattri.task import AttributionTask
Contributor:

This line is likely not needed



class IFAttributorEKFAC(BaseInnerProductAttributor):
"""The inner product attributor with DataInf inverse hessian transformation."""
Contributor:

DataInf -> EKFAC

@jiaqima (Contributor) commented Oct 4, 2024

Hi @sx-liu, thanks for the PR. Could you try to add some more detailed tests for the attributor's internal functions? Something like this:

def test_datainf_transform_test_rep(self):

@jiaqima changed the title from "Refactor the implementation of EKFAC" to "[dattri.algorithms] Refactor the implementation of EKFAC" on Oct 12, 2024
@jiaqima changed the title from "[dattri.algorithms] Refactor the implementation of EKFAC" to "[dattri.algorithm, dattri.func] Refactor the implementation of EKFAC" on Oct 12, 2024
@sx-liu (Collaborator, Author) commented Oct 13, 2024

Hi @jiaqima, I added one additional unit test for the transformed test rep, which checks the correlation with the ground truth. I think it would otherwise be hard to verify the correctness of the FIM. What do you think?
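
For illustration, such a correlation test might take roughly this shape (hypothetical; the helper name and threshold are made up):

import numpy as np
from scipy.stats import spearmanr

def check_attribution_correlation(scores: np.ndarray, ground_truth: np.ndarray, threshold: float = 0.5) -> None:
    # Rank correlation with brute-force influence values is a common proxy
    # when the FIM itself is hard to verify directly.
    corr, _ = spearmanr(scores.ravel(), ground_truth.ravel())
    assert corr > threshold, f"Spearman correlation too low: {corr:.3f}"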

device: Optional[str] = "cpu",
damping: float = 0.0,
) -> None:
"""Initialize the DataInf inverse Hessian attributor.
Contributor:

DataInf -> EKFAC

Hessian -> FIM



class IFAttributorEKFAC(BaseInnerProductAttributor):
"""The inner product attributor with EK-FAC inverse hessian transformation."""
Contributor:

hessian -> FIM

"Ensemble of EK-FAC is not supported.")
raise ValueError(error_msg)

if not module_name:
Contributor:

if module_name is None


self.layer_cache = {} # cache for each layer

def _ekfac_hook(module: torch.nn.Module,
Contributor:

Please add a docstring to explain the requirements for these arguments
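
A hypothetical docstring along the requested lines (the argument semantics follow torch.nn.Module.register_forward_hook; the wording is illustrative, not the PR's final text):

def _ekfac_hook(module: torch.nn.Module,
                inputs: tuple,
                outputs: torch.Tensor) -> None:
    """Forward hook that caches a layer's inputs and outputs.

    Args:
        module (torch.nn.Module): The layer this hook is registered on;
            expected to be a torch.nn.Linear for EK-FAC to apply.
        inputs (tuple): Positional inputs passed to module.forward();
            inputs[0] holds the activations used for the covariance A.
        outputs (torch.Tensor): The layer's output, whose gradient is
            used for the covariance S.
    """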

full_train_dataloader: DataLoader,
max_iter: Optional[int] = None,
) -> None:
"""Cache the dataset and statistics for inverse hessian/fisher calculation.
Contributor:

hessian/fisher -> FIM


Cache the full training dataset as other attributors.
Estimate and cache the covariance matrices, eigenvector matrices
and corrected eigenvalues based on the distribution of training data.
Contributor:

based on the distribution of training data -> based on samples of training data


Args:
full_train_dataloader (DataLoader): The dataloader
with full training samples for inverse hessian calculation.
Contributor:

hessian -> FIM

with full training samples for inverse hessian calculation.
max_iter (int, optional): An integer indicating the maximum number of
batches that will be used for estimating the covariance matrices
and lambdas.
Contributor:

Please document the default value.

# Cache the inputs and outputs
self.layer_cache[name] = (inputs, outputs)

self.handles = []
Contributor:

Move this part to self.cache()? It seems that self.handles are only needed in self.cache()

full_model_params = {
k: p for k, p in self.task.model.named_parameters() if p.requires_grad
}
partial_model_params = {
Contributor:

How about using self.task.get_param(layer_name=self.layer_name, layer_split=True)?

Collaborator (Author):

I think it's a bit hard to use this function, because it only provides the flattened parameters. Here we need the original shape information for each layer.

@jiaqima (Contributor) replied Oct 13, 2024

I think layer_split=True will give you a map to the module name.

Maybe @TheaperDeng has better ideas here.

Collaborator:

I think this can be done by an easy change to get_param, so I can handle it in the next PR and leave it as it is in this PR.

self.module_name = module_name

# Update layer_name corresponding to selected modules
self.layer_name = [name + ".weight" for name in self.module_name]
Collaborator:

[bias issue] Here I think we also need to append the name + ".bias"

max_iter: Optional[int] = None,
device: Optional[str] = "cpu",
) -> Dict[str, torch.tensor]:
"""Estimate the 'covariance' matrices S and A in EK-FAC IFVP.
Collaborator:

The comment here needs to be updated.
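
For reference, the EK-FAC IFVP for a single layer is typically computed in the Kronecker eigenbasis; a minimal sketch following George et al. (2018), with illustrative variable names:

import torch

def ekfac_ifvp(v: torch.Tensor,
               q_s: torch.Tensor,
               q_a: torch.Tensor,
               lam: torch.Tensor,
               damping: float = 0.0) -> torch.Tensor:
    # v: layer gradient of shape (out_dim, in_dim); q_s, q_a: eigenvectors
    # of the covariances S and A; lam: corrected eigenvalues, (out_dim, in_dim).
    v_kfe = q_s.T @ v @ q_a           # rotate into the Kronecker eigenbasis
    v_kfe = v_kfe / (lam + damping)   # rescale by damped eigenvalues
    return q_s @ v_kfe @ q_a.T        # rotate back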

"""
# Unpack tuple outputs if necessary
if isinstance(inputs, tuple):
inputs = inputs[0]
Collaborator:

[bias issue] Here we are using these inputs as the a_prev in our calculation of the covariance and lambda, but we should append a column of torch.ones to the input to handle the bias.
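
The standard trick the comment refers to, sketched (illustrative; tensor shapes assumed):

import torch

def append_ones(a_prev: torch.Tensor) -> torch.Tensor:
    # Append a column of ones so the bias is absorbed into the
    # input covariance A (homogeneous coordinates).
    ones = torch.ones(a_prev.shape[0], 1, device=a_prev.device, dtype=a_prev.dtype)
    return torch.cat([a_prev, ones], dim=-1)

a_prev = torch.randn(32, 128)          # (batch, in_features)
a_aug = append_ones(a_prev)            # (batch, in_features + 1)
A = a_aug.T @ a_aug / a_aug.shape[0]   # covariance now covers the bias too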

ifvp = {}

for name in self.module_name:
_v = layer_test_rep[name + ".weight"]
Collaborator:

[bias issue] Again, I think we need the ".bias" here.


Args:
func (Callable): A Python function that takes one or more arguments.
Must return the following,
- losses: a tensor of shape (batch_size,).
- loss: a single tensor of loss. Should be the mean loss by the
batch size.
- mask (optional): a tensor of shape (batch_size, t), where 1's
Collaborator:

Maybe we need to make this mask output clear in the documentation of the EKFAC IF attributor's init function.
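
A hypothetical loss function satisfying the documented contract (the mask semantics are assumed here, since the quoted docstring is truncated above; 1 is taken to mark positions that count toward the loss):

import torch
import torch.nn.functional as F

def loss_func(model: torch.nn.Module, batch: tuple) -> tuple:
    x, y = batch                       # y: (batch_size, t), -100 = ignored
    logits = model(x)                  # (batch_size, t, vocab)
    per_token = F.cross_entropy(
        logits.transpose(1, 2), y, reduction="none")  # (batch_size, t)
    mask = (y != -100).float()         # (batch_size, t)
    losses = (per_token * mask).sum(dim=1)  # per-sample loss, (batch_size,)
    loss = losses.mean()               # mean over the batch, per the contract
    return losses, loss, mask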

@TheaperDeng (Collaborator) commented

The EKFAC implementation overall looks good to me.

One comment is that it uses loss.backward, so we cannot disable autograd during the attribution process. This is inconsistent with the other attributors, but I think it's fine for now.

Another comment is about the bias term; I have made comments at the places I think need to change.

@sx-liu (Collaborator, Author) commented Oct 17, 2024

I just incorporated the bias into the gradient and FIM calculation, but I found that the correlation drops drastically. I will try some other metrics and double-check the implementation.

@sx-liu requested review from @jiaqima and @TheaperDeng on October 26, 2024
@TheaperDeng (Collaborator) approved

Thanks @sx-liu, the last bug was really hard to identify. LGTM

@jiaqima merged commit 0038c57 into TRAIS-Lab:main on Oct 27, 2024
5 checks passed