TorchInfluence

Implementation of Training Data Attribution (TDA) methods in PyTorch, built mainly on torch.func.

TDA methods assign each training point a score that reflects how important or influential it is for the model's prediction on a given test point.

For now, I have only implemented simple gradient similarity as proposed in [4], which is also a key element of TracIn [2], as a warm-up exercise. I plan to work towards an implementation of influence functions [1] that uses Arnoldi iteration to efficiently approximate the inverse Hessian, as done in [3].
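
To illustrate the idea, here is a minimal sketch of gradient similarity with torch.func. It is not the code in this repository: the function names (`per_sample_grads`, `gradient_similarity`), the cross-entropy loss, and the `normalize` flag are assumptions made for the example. Each score is the dot product between the loss gradient of a test point and the loss gradient of a training point, taken with respect to the model parameters. Setting `normalize=True` turns it into a cosine similarity.

```python
import torch
from torch import nn
from torch.func import functional_call, grad, vmap


def per_sample_grads(model, xs, ys):
    """Return a (n_samples, n_params) matrix of flattened per-sample loss gradients."""
    # Detached copies of the parameters, passed explicitly to the functional model.
    params = {name: p.detach() for name, p in model.named_parameters()}

    def sample_loss(p, x, y):
        # x and y are single samples; add a batch dimension of size 1.
        logits = functional_call(model, p, (x.unsqueeze(0),))
        return nn.functional.cross_entropy(logits, y.unsqueeze(0))

    # grad differentiates w.r.t. the first argument (the parameter dict);
    # vmap maps over the sample dimension of xs and ys only.
    grads = vmap(grad(sample_loss), in_dims=(None, 0, 0))(params, xs, ys)
    return torch.cat([g.reshape(g.shape[0], -1) for g in grads.values()], dim=1)


def gradient_similarity(model, train_x, train_y, test_x, test_y, normalize=False):
    """Return a (n_test, n_train) matrix of gradient similarity scores."""
    g_train = per_sample_grads(model, train_x, train_y)
    g_test = per_sample_grads(model, test_x, test_y)
    if normalize:
        # Unit-norm gradients give cosine similarity instead of a raw dot product.
        g_train = nn.functional.normalize(g_train, dim=1)
        g_test = nn.functional.normalize(g_test, dim=1)
    return g_test @ g_train.T


if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Linear(4, 3)  # toy classifier, purely for illustration
    train_x, train_y = torch.randn(5, 4), torch.randint(0, 3, (5,))
    test_x, test_y = torch.randn(2, 4), torch.randint(0, 3, (2,))
    scores = gradient_similarity(model, train_x, train_y, test_x, test_y)
    print(scores.shape)  # one row per test point, one column per training point
```

This sketch materializes the full gradient matrix, so it only suits small models. For a TracIn-style score, one would additionally sum such dot products over several training checkpoints, weighted by the learning rate.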

Literature

[1] Koh, P. W., & Liang, P. (2017, July). Understanding black-box predictions via influence functions. In International conference on machine learning (pp. 1885-1894). PMLR.
[2] Pruthi, G., Liu, F., Kale, S., & Sundararajan, M. (2020). Estimating training data influence by tracing gradient descent. Advances in Neural Information Processing Systems, 33, 19920-19930.
[3] Schioppa, A., Zablotskaia, P., Vilar, D., & Sokolov, A. (2022, June). Scaling up influence functions. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 8, pp. 8179-8186).
[4] Charpiat, G., Girard, N., Felardos, L., & Tarabalka, Y. (2019). Input similarity from the neural network perspective. Advances in Neural Information Processing Systems, 32.
