A library for efficient patching and automatic circuit discovery.
Paper: Transformer Circuit Evaluation Metrics Are Not Robust (Oral spotlight, COLM 2024)
pip install auto-circuit
Zero-ablate an edge: see auto-circuit/experiments/demos/zero_ablate_an_edge.py, lines 20 to 26 at commit 03ce552.
Patch an edge: see auto-circuit/experiments/demos/patch_an_edge.py, lines 49 to 55 at commit 03ce552.
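
The demo files named above cover zero-ablating an edge and patching an edge. As a rough illustration of what those two operations mean, here is a minimal, library-free sketch on a toy residual-style graph. It does not use the auto-circuit API: the node names, `NODE_FNS`, `INCOMING`, and `forward` are all hypothetical and exist only for this example. The assumption is that each destination node reads the sum of its upstream nodes' outputs, so an edge intervention changes what one specific destination sees from one specific source and leaves every other edge untouched.

```python
from typing import Callable, Dict, Optional, Tuple

Edge = Tuple[str, str]  # (source node, destination node)

# Hypothetical toy nodes: each maps its summed input to an output.
NODE_FNS: Dict[str, Callable[[float], float]] = {
    "A": lambda x: 2.0 * x,
    "B": lambda x: x + 1.0,
}

# Residual-style wiring: every destination sums all earlier node outputs.
INCOMING: Dict[str, Tuple[str, ...]] = {
    "A": ("input",),
    "B": ("input", "A"),
    "output": ("input", "A", "B"),
}


def forward(
    x: float, edge_values: Optional[Dict[Edge, float]] = None
) -> Dict[str, float]:
    """Run the toy graph, overriding the value carried by selected edges."""
    edge_values = edge_values or {}
    outputs: Dict[str, float] = {"input": x}
    for dest in ("A", "B", "output"):
        # An overridden edge contributes its override; others carry the
        # source node's actual output from this forward pass.
        total = sum(
            edge_values.get((src, dest), outputs[src]) for src in INCOMING[dest]
        )
        fn = NODE_FNS.get(dest)
        outputs[dest] = fn(total) if fn is not None else total
    return outputs


edge: Edge = ("A", "B")

# Clean run: no intervention.
clean = forward(1.0)["output"]

# Zero ablation: B sees 0 from A, while "output" still sees A's real output.
zero_ablated = forward(1.0, {edge: 0.0})["output"]

# Patching: B sees A's output from a different (corrupt) input instead.
corrupt_a = forward(3.0)["A"]
patched = forward(1.0, {edge: corrupt_a})["output"]

print(clean, zero_ablated, patched)  # 7.0 5.0 11.0
```

Only the A to B edge is modified in both interventions. The direct A to output edge still carries A's clean output, which is the difference between ablating an edge and ablating the whole node.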
@inproceedings{
miller2024transformer,
title={Transformer Circuit Evaluation Metrics Are Not Robust},
author={Joseph Miller and Bilal Chughtai and William Saunders},
booktitle={First Conference on Language Modeling},
year={2024},
url={https://openreview.net/forum?id=zSf8PJyQb2}
}