[Proposal] Guide to adding new models #729
Comments
I think it would be cool for this to exist! Unfortunately, I don't have the
capacity to make this myself. Looking at past PRs that added e.g. LLaMA or
Gemma should give some idea.
…On Thu, 26 Sept 2024 at 03:19, Deven Mistry ***@***.***> wrote:
Proposal
To have a guide/support document for adding support for a new model in the
library
Motivation
Lower the barrier to getting into mechanistic interpretability. Though the
current list of supported models is large, it does not include many of the
newer models, which makes it hard to experiment with them.
Pitch
I could only find a small piece on the documentation site that briefly
describes adding new models, in the 2.0 release Roadmap → here
<https://transformerlensorg.github.io/TransformerLens/content/news/release-2.0.html#streamlining-adding-new-models>
I am quite new to using Hooks. I may be able to add support for some models
that my lab and I are currently planning to use, but most of it will require
some handholding.
Alternatives
I am open to any new suggestions. Would this be something that you would be
interested in? @neelnanda-io <https://github.com/neelnanda-io>
Checklist
- I have checked that there is no similar issue <https://github.com/TransformerLensOrg/Transformerlens/issues> in the repo (*required*)
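For orientation, a minimal sketch of the surface area such a "new model" PR typically touches, assuming the current layout of transformer_lens/loading_from_pretrained.py; the model id "org/new-model-7b" and the function convert_newmodel_weights are illustrative names, not real entries:

```python
import einops

# 1. Register the Hugging Face model id so from_pretrained recognises it:
OFFICIAL_MODEL_NAMES = [
    # ... existing entries ...
    "org/new-model-7b",  # hypothetical new entry
]

# 2. Map the HF config onto a HookedTransformerConfig inside
#    convert_hf_model_config (d_model, n_layers, n_heads, act_fn, ...).

# 3. Write a weight-conversion function that maps HF parameter names onto
#    TransformerLens names such as blocks.{l}.attn.W_Q:
def convert_newmodel_weights(hf_model, cfg):
    state_dict = {"embed.W_E": hf_model.model.embed_tokens.weight}
    for l in range(cfg.n_layers):
        layer = hf_model.model.layers[l]
        # HF stores W_Q as [n_heads * d_head, d_model]; TransformerLens
        # wants [n_heads, d_model, d_head].
        state_dict[f"blocks.{l}.attn.W_Q"] = einops.rearrange(
            layer.self_attn.q_proj.weight, "(n h) m -> n m h", n=cfg.n_heads
        )
        # ... W_K, W_V, W_O, MLP, and norm weights follow the same pattern ...
    return state_dict
```

The existing conversion functions (e.g. convert_llama_weights) follow this shape, with per-architecture details for attention variants, norms, and biases.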
@deven367 It's in the pipeline at the moment. I am in the middle of going through a few key pieces of model compatibility, and I have been hesitant to put together a guide today when a couple of key steps are going to be added or changed relatively quickly. If you want to meet to discuss how to do it in the meantime, I am happy to do so. LLaMA 3.1 would be on the simpler side to add right now, and I can go through the more complicated process after that.
@bryce13950 I feel that would be nice; I am open to discussing this.
Someone opened a PR last night for LLaMA 3.1, but I am sure we can find another model for you to add. Is there a specific model you are interested in? Are you on the Slack channel?
Hey @bryce13950, our lab is specifically interested in models with learnable positional encodings. So, if possible, I would first like to start with Bart and then look at Mamba (as it's a model that doesn't use attention). Also, I am not on the Slack channel.
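For reference, decoder-only models already in the library, e.g. GPT-2, use learned positional embeddings, and these are directly inspectable; a minimal sketch, using standard HookedTransformer attributes:

```python
from transformer_lens import HookedTransformer

# GPT-2 uses learned ("standard") positional embeddings.
model = HookedTransformer.from_pretrained("gpt2")
print(model.cfg.positional_embedding_type)  # "standard", i.e. learned
print(model.W_pos.shape)                    # [n_ctx, d_model] = [1024, 768]
```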
The Slack link in the README is broken.
Here's a new Slack link, sorry! They break after 400 people use them... https://join.slack.com/t/opensourcemechanistic/shared_invite/zt-2n26nfoh1-TzMHrzyW6HiOsmCESxXtyw I made a PR to update it, but will leave it up to @bryce13950 whether he's OK merging it directly into main: #742

Re models to use: Bart should be doable, though it may be a pain, as it's an encoder-decoder model and most models used here are decoder-only. But the HookedEncoderDecoder.py file should be a good place to start, as we support T5, and you can hopefully adapt that?

Mamba will be a whole different beast, as it's recurrent, so e.g. if you give it a 1000-token sequence and add a hook on a recurrent layer, I think it'll be run 1000 times in a single forward pass? Fortunately, Danielle Ensign already implemented a Mamba port of TransformerLens: https://github.com/Phylliida/MambaLens
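A minimal sketch of what the existing T5 support looks like from the user side, assuming the current HookedEncoderDecoder API (the prompt and hook name below are illustrative); a Bart port would sit behind the same interface:

```python
import torch
from transformer_lens import HookedEncoderDecoder

model = HookedEncoderDecoder.from_pretrained("t5-small")
tokens = model.tokenizer(
    "translate English to German: hello", return_tensors="pt"
)["input_ids"]
# T5 starts decoding from the pad token.
decoder_input = torch.tensor([[model.tokenizer.pad_token_id]])

logits, cache = model.run_with_cache(tokens, decoder_input=decoder_input)
# Encoder and decoder activations are cached under separate prefixes,
# e.g. "encoder.0.attn.hook_pattern" vs "decoder.0.attn.hook_pattern".
print(logits.shape)
```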
Thanks a lot @neelnanda-io, I've joined the Slack channel! Re MambaLens: this is so cool, I wasn't aware that something like this existed!
@deven367 That sounds perfect to me! Ping me on Slack, and we can discuss further if you like. HookedEncoderDecoder is a relatively new addition to TransformerLens, and the only models currently supported are T5 variants. That means there may be a bit more that needs to be done to add a second architecture type, but I don't think it will be too difficult.