Issues: TransformerLensOrg/TransformerLens
[Proposal] Compatibility for OLMo and OLMo2?
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
model-request
Any issues related to requesting additional model support
#804
opened Nov 28, 2024 by
spaidataiga
1 task done
[Bug Report] Padding side inconsistency with Huggingface Transformers
bug
Something isn't working
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
needs-investigation
Issues that need to be recreated, or investigated before work can be done
#801
opened Nov 27, 2024 by
spfrommer
1 task done
[Bug Report] Load model problem
question
Further information is requested
#800
opened Nov 27, 2024 by
LiuJinzhe-Keepgoing
[Proposal] Remove instances where model_args and model_kwargs are provided
breaking-change
complexity-simple
Simple issues, which may be good for beginners
#798
opened Nov 26, 2024 by
bryce13950
1 task done
[Proposal] change FactoredMatrix.svd() so it doesn't prevent all instances of FactoredMatrix from being garbage collected
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
#796
opened Nov 25, 2024 by
manulari
1 task done
[Question] How to load a model in smaller precision?
needs-information
More information is needed from the issue creator before moving forward.
#784
opened Nov 18, 2024 by
MittelmanDaniel
[Bug Report] Global and Local Attn layer order of Gemma2 is wrong?
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
#778
opened Nov 9, 2024 by
huangxt39
[Question] Would it be possible to adopt TransformerLens on models with a different layernorm implementation?
complexity-high
Very complicated changes for people to address who are quite familiar with the code
question
Further information is requested
#773
opened Nov 8, 2024 by
Steven-Yiran
[Proposal] Ensure TransformerLens does not load from hugging face when config is passed in
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
#754
opened Oct 11, 2024 by
hamind
1 task done
[Bug Report] hook_normalized is inconsistent between RMSNorm and LayerNorm
breaking-change
bug
Something isn't working
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
#747
opened Oct 6, 2024 by
neelnanda-io
[Bug Report] Q cannot be reshaped correctly when model is loaded in 4bit
bug
Something isn't working
needs-investigation
Issues that need to be recreated, or investigated before work can be done
#737
opened Sep 28, 2024 by
po13on
[Proposal] Guide to adding new models
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
documentation
Improvements or additions to documentation
#729
opened Sep 26, 2024 by
deven367
1 task done
[Bug Report] Review current matmul function usages
bug
Something isn't working
complexity-high
Very complicated changes for people to address who are quite familiar with the code
#720
opened Sep 10, 2024 by
bryce13950
1 task done
[Proposal] Add MVP Support For 1-2 Models Per-Modality
complexity-high
Very complicated changes for people to address who are quite familiar with the code
discussion
No action needed yet
#710
opened Aug 31, 2024 by
4gatepylon
1 task done
[Proposal] Add support for TracrBench
complexity-high
Very complicated changes for people to address who are quite familiar with the code
new-architecture
This card involves adding a new architecture.
#704
opened Aug 14, 2024 by
HannesThurnherr
How to get the Activation cache while the LLM is generating new tokens?
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
#697
opened Aug 7, 2024 by
Meehaohao
[Proposal] Expand quantization model support
complexity-high
Very complicated changes for people to address who are quite familiar with the code
#684
opened Jul 26, 2024 by
miguel-kjh
[Proposal] Allow tied embeddings
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
enhancement
New feature or request
#671
opened Jul 12, 2024 by
neelnanda-io
ValueError: microsoft/Phi-3-mini-128k-instruct not found.
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
model-request
Any issues related to requesting additional model support
#670
opened Jul 12, 2024 by
joykirat18
[Proposal] Allow recent versions of beartype
complexity-simple
Simple issues, which may be good for beginners
tooling
Anything pertaining to outside tools used within the codebase
#665
opened Jul 10, 2024 by
jettjaniak
1 task done
[Bug Report] Pythia output inconsistent across batch sizes when use_split_qkv_input=True
bug
Something isn't working
complexity-high
Very complicated changes for people to address who are quite familiar with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
#661
opened Jul 8, 2024 by
oliveradk
1 task done
[Bug Report] RMSNormPre in Transformer_lens is maybe different from Llama source code?
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
needs-investigation
Issues that need to be recreated, or investigated before work can be done
#657
opened Jul 6, 2024 by
wangyifei0047
[Proposal] Documentation: Map the Act Names to the Transformer
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
documentation
Improvements or additions to documentation
#644
opened Jun 21, 2024 by
JuVogt
1 task done
[Proposal] Remove the overhead caused by full_hook.__name__ = (hook.__repr__())?
#631
opened Jun 8, 2024 by
verlocks