Plotting for individual predictions across depth #67

norabelrose · 2023-02-15T06:57:09Z

Given a reporter for each layer of a model and a fixed input, we can extract the model's "belief" at each layer and plot how it evolves across depth, similar to how the tuned lens works.

This is low-ish priority, but I think it should be done in time for the paper at least.

norabelrose added the enhancement (New feature or request) label Feb 15, 2023

norabelrose added this to the NeurIPS Paper milestone Feb 15, 2023

norabelrose assigned Benw8888 Feb 15, 2023
