Explore visualization tools for understanding Transformer-based large language models (LLMs).
Ordered by publication date, from newest to oldest.
Apply interpretability to unlock deep reasoning and control of models, enabling the next generation of human-AI interaction.
Tilde Research, 2024.11 Thread / Website
Tools for understanding and steering AI systems; insights derived from their use inform our research.
Transluce Team, 2024.10 Demo / GitHub / Website
Learn several key components of deep learning (DL) models using customized Excel spreadsheets.
Tom Yeh, 2024.09 GitHub
Learn How Transformer Models Work with Interactive Visualization
Georgia Tech and IBM, 2024.08 Demo / GitHub / arXiv
Help the safety community shed light on the inner workings of language models
Google DeepMind, 2024.07 Demo / Blog / PDF
Inspectus is a versatile visualization tool for machine learning. It runs smoothly in Jupyter notebooks via an easy-to-use Python API.
labml.ai, 2024.06 GitHub
We used new scalable methods to decompose GPT-4’s internal representations into 16 million oft-interpretable patterns.
OpenAI, 2024.06 Demo / GitHub / Blog / arXiv
An open-source interactive toolkit for analyzing internal workings of Transformer-based language models.
Meta, 2024.04 Demo / GitHub / arXiv
Neuronpedia is a platform for mechanistic interpretability research. Its goal is to accelerate research on Sparse Autoencoders (SAEs) by hosting models, feature dashboards, data visualizations, tooling, and more.
Johnny Lin and Joseph Bloom, 2024.03 Demo
Mechanistic interpretability visualizations that work in both Python (e.g. with Jupyter Lab) and JavaScript (e.g. React or plain HTML).
Alan Cooney and Neel Nanda, 2023.10 Demo / GitHub
A visualization and walkthrough of the LLM algorithm that backs OpenAI's ChatGPT. Explore the algorithm down to every add & multiply, seeing the whole process in action.
Brendan Bycroft, 2023.05 Demo / GitHub
A library for mechanistic interpretability of GPT-style language models
Neel Nanda and Joseph Bloom, 2022.08 GitHub / Distill / Documentation
Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Jesse Vig, 2019.07 GitHub / ACL Anthology
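
Several of the tools listed here visualize attention weights. As a rough illustration of the quantity they plot, here is a minimal sketch in plain Python. It does not use the API of any listed tool: the function names (`attention_weights`, `render_heatmap`), the toy tokens, and the 2-dimensional vectors are all hypothetical, and the same vectors are reused as queries and keys purely for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys, causal=True):
    """Scaled dot-product attention weights, one row per query position."""
    d = len(keys[0])
    weights = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                # A causal (GPT-style) mask hides future positions.
                scores.append(float("-inf"))
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d))
        weights.append(softmax(scores))
    return weights


def render_heatmap(tokens, weights):
    """Render each weight as a character; denser characters mean larger weights."""
    shades = " .:-=+*#"
    lines = []
    for tok, row in zip(tokens, weights):
        cells = "".join(
            shades[min(int(w * len(shades)), len(shades) - 1)] for w in row
        )
        lines.append(f"{tok:>6} |{cells}|")
    return "\n".join(lines)


# Toy inputs (assumed for illustration only, not taken from a real model).
tokens = ["The", "cat", "sat"]
vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

w = attention_weights(vectors, vectors)
print(render_heatmap(tokens, w))
```

Each row shows how strongly one token attends to every token at or before its own position. The listed tools render the same kind of matrix per layer and per head, using weights from a trained model rather than toy vectors.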