# The Pythia Trials

## Intro

A public repository for research on how Pythia and OLMo behave as model size and training checkpoint vary, with an eye toward mechanistic interpretability of reasoning. The hope is to reverse engineer some of the algorithms these models learn!

## 2-digit Addition

I have started by looking at 2-digit addition because it is a super simple task that is easy to understand, reason about, and implement. By 2-digit addition, I mean the task

```python
question = "19+21"
prompt = f"Question: What is {question}? Answer: {question}="
```

where the model's completion has to be the correct sum. I only iterate through questions where the result is still two digits, i.e. I stop before reaching 50+50=100.
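For concreteness, here is a minimal sketch of how these prompts could be generated. The iteration order and exact filtering used in this repo may differ, and `make_prompts` is an illustrative name rather than a function from the codebase.

```python
def make_prompts():
    """Yield (prompt, answer) pairs for all 2-digit sums whose result stays 2 digits."""
    for a in range(10, 100):
        for b in range(10, 100):
            if a + b > 99:  # skip sums that spill into three digits
                continue
            question = f"{a}+{b}"
            prompt = f"Question: What is {question}? Answer: {question}="
            yield prompt, str(a + b)
```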

The following table shows the 2-digit addition accuracy of Pythia and OLMo across various model sizes.

| Model | Size | 2-digit sum accuracy |
|---|---|---|
| EleutherAI/pythia-14m | 14m | 0.00% |
| EleutherAI/pythia-70m-deduped | 70m | 0.00% |
| EleutherAI/pythia-160m-deduped | 160m | 0.16% |
| EleutherAI/pythia-410m-deduped | 410m | 4.56% |
| EleutherAI/pythia-1B-deduped | 1B | 5.08% |
| EleutherAI/pythia-1.4B-deduped | 1.4B | 20.69% |
| EleutherAI/pythia-2.8B-deduped | 2.8B | 86.32% |
| EleutherAI/pythia-6.9B-deduped | 6.9B | 91.88% |
| EleutherAI/pythia-12B-deduped | 12B | 88.96% |
| allenai/OLMo-1B-0724-hf | 1B | 42.86% |
| hamishivi/OLMo-1B-0724-SFT-hf | 1B | 53.50% |
| hamishivi/OLMo-1B-0724-Instruct-hf | 1B | 52.02% |
| allenai/OLMo-7B-0724-hf | 7B | 99.88% |
| allenai/OLMo-7B-0724-SFT-hf | 7B | |
| allenai/OLMo-7B-0724-Instruct-hf | 7B | |
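As a rough illustration, here is how such an accuracy number could be computed with Hugging Face `transformers`, reusing `make_prompts` from the sketch above. This assumes greedy decoding and an exact-match check on the generated text; it is not necessarily the evaluation code that produced the table.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def eval_two_digit_addition(model_name: str) -> float:
    """Return exact-match accuracy on the 2-digit addition prompts."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.float16, device_map="auto"
    )
    correct, total = 0, 0
    for prompt, answer in make_prompts():
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=4, do_sample=False)
        # Decode only the newly generated tokens, then check the sum.
        completion = tokenizer.decode(
            out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        correct += completion.strip().startswith(answer)
        total += 1
    return correct / total

print(f"{eval_two_digit_addition('EleutherAI/pythia-410m-deduped'):.2%}")
```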
