Popular repositories
- surgical_knowledge_editing (Public): This repository provides the official implementation of an "unlearn-then-learn" strategy that uses interpretability-driven circuit localization and the $(IA)^{3}$ PEFT method to achieve precise, su… Python · 2
- mue_project (Public): A PyTorch implementation of MUE, a minimalist framework that guides pre-trained diffusion models to autonomously explore novel, coherent outputs by leveraging local denoising instabilities as an un… Python · 1
- taming_polysemanticity (Public): A PyTorch toy model for mechanistic interpretability (MI) exploring incidental polysemanticity. This project uses an MLP and sparse autoencoder (SAE) to systematically ablate how training artifacts… Python