[dattri.trak] Improving memory efficiency when calculating TRAK #154

KurisuTheAmadeus · 2024-11-29T02:20:33Z

The current implementation of TRAK calculates the final result using

    return (running_xinv_XTX_XT @ self.Q.diag().to(self.device)).T

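Materializing the diagonal matrix with Q.diag() creates a large memory overhead, especially when the number of training data points is large, since a vector of length n becomes an n x n matrix.
A possible fix is to replace the matrix product with a broadcast elementwise multiplication, as follows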

        if train_dataloader is not None:
            return (running_xinv_XTX_XT * running_Q.to(self.device).unsqueeze(0)).T
        return (running_xinv_XTX_XT * self.Q.to(self.device).unsqueeze(0)).T

I have verified that the two versions are equivalent using two different setups.
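
For anyone who wants to reproduce the check, here is a minimal standalone sketch (not the dattri code itself). It assumes running_xinv_XTX_XT has shape (n_test, n_train) and Q is a 1-D tensor of length n_train; the names scores and q and the sizes are hypothetical stand-ins chosen for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes, small enough to run anywhere.
n_test, n_train = 4, 6

# Stand-ins for running_xinv_XTX_XT and self.Q (shapes are an assumption).
scores = torch.randn(n_test, n_train, dtype=torch.float64)
q = torch.rand(n_train, dtype=torch.float64)

# Current approach: build an (n_train, n_train) diagonal matrix, then matmul.
dense = (scores @ q.diag()).T

# Proposed approach: broadcast q across rows, scaling column j by q[j].
broadcast = (scores * q.unsqueeze(0)).T

equivalent = torch.allclose(dense, broadcast)

# Extra elements allocated for the scaling factor in each approach.
diag_elements = q.diag().numel()   # n_train * n_train
vector_elements = q.numel()        # n_train

print(equivalent, diag_elements, vector_elements)
```

Right-multiplying by a diagonal matrix scales column j by the j-th diagonal entry, which is what the broadcast product does directly, so the only difference is that the n_train x n_train intermediate is never allocated.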
