[Question] Would it be possible to adopt TransformerLens on models with a different layernorm implementation? #773
Labels
complexity-high
Very complicated changes for people to address who are quite familiar with the code
question
Further information is requested
Question
I am looking to adopt TransformerLens for a custom model currently not supported by the TransformerLens library. The custom model has the same GPT-2-like architecture except for the implementation of the LayerNorm operation. Specifically, each layer applies a LayerNorm (with weight and bias) to the MLP output. I looked into Othello-GPT but am still not sure how to avoid the architecture mismatch.
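For context, here is a minimal sketch of the difference I mean. The `layer_norm` helper and the block-structure comments are my own illustration (not TransformerLens code): GPT-2 normalizes the *input* to each sublayer (pre-LN), while the custom model additionally normalizes the MLP *output* before it is added back to the residual stream.

```python
import numpy as np

def layer_norm(x, w, b, eps=1e-5):
    # Standard LayerNorm over the feature dimension, with learned weight and bias.
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps) * w + b

d_model = 4
x = np.random.randn(2, d_model)
w, b = np.ones(d_model), np.zeros(d_model)

# GPT-2 (pre-LN) block:  resid = resid + mlp(ln2(resid))
# Custom model's block:  resid = resid + ln_post(mlp(resid))   # extra LN on the MLP output
out = layer_norm(x, w, b)

# With unit weight and zero bias, each row is normalized to mean 0, variance 1.
print(np.allclose(out.mean(-1), 0.0, atol=1e-6))  # True
```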
Would it still be possible to run analysis on the custom model with TransformerLens? Thanks!