maxtli/plausibleablation
# # perform inference with batched toxic samples
# # perform inference with untoxic samples
# # perform inference with ablated untoxic samples
# # take specific untoxic examples from the finetuned model, and perform inference
# # do this 144x, once for each attention head. do i need to save the indices? (also, ???)
# # (i guess this is just activation patching)
# # do some arithmetic on the output logits
# # check the ablated loss on the toxic samples
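The notes above amount to a per-head patching sweep. Below is a minimal sketch of that loop, not the repository's actual code. It assumes the 144 heads are laid out as 12 layers x 12 heads, that each head's contribution to the output logits has already been cached, and that the logits are simply the sum of those contributions (a toy simplification of a real transformer). The names `make_head_outputs`, `logits_from_heads`, `patched_loss`, and `sweep` are hypothetical, and the patching direction (untoxic activations swapped into the toxic run) is one reading of the notes.

```python
import math
import random

# Assumed layout: 12 layers x 12 heads = 144 heads, matching the "144x" in the notes.
N_LAYERS, N_HEADS = 12, 12
VOCAB = 5  # toy vocabulary size, chosen only for illustration


def make_head_outputs(rng, n_samples):
    """Hypothetical stand-in for cached per-head logit contributions.

    Returns a list of samples; each sample is a list of 144 vectors of length VOCAB.
    """
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(VOCAB)] for _ in range(N_LAYERS * N_HEADS)]
        for _ in range(n_samples)
    ]


def logits_from_heads(head_outputs):
    """Toy assumption: output logits are the sum of all head contributions."""
    return [sum(head[v] for head in head_outputs) for v in range(VOCAB)]


def cross_entropy(logits, target):
    """Numerically stable cross-entropy of one logit vector against a target index."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def patched_loss(toxic_heads, untoxic_heads, head_idx, target):
    """Replace one head's output in the toxic run with the untoxic run's output."""
    patched = list(toxic_heads)  # shallow copy; only one slot is swapped
    patched[head_idx] = untoxic_heads[head_idx]
    return cross_entropy(logits_from_heads(patched), target)


def sweep(toxic, untoxic, targets):
    """Mean patched loss on the toxic samples, once per (layer, head) pair."""
    results = {}
    for layer in range(N_LAYERS):
        for head in range(N_HEADS):
            idx = layer * N_HEADS + head  # flat index, so no separate bookkeeping is needed
            losses = [
                patched_loss(t, u, idx, y) for t, u, y in zip(toxic, untoxic, targets)
            ]
            results[(layer, head)] = sum(losses) / len(losses)
    return results


if __name__ == "__main__":
    rng = random.Random(0)
    toxic = make_head_outputs(rng, 4)
    untoxic = make_head_outputs(rng, 4)
    targets = [0, 1, 2, 3]
    results = sweep(toxic, untoxic, targets)
    # Heads whose patching changes the toxic loss the most are the interesting ones.
    worst = max(results, key=results.get)
    print("head with highest patched loss:", worst)
```

Keying the results by `(layer, head)` is one way to address the "do i need to save the indices?" question: the flat index can always be recomputed as `layer * N_HEADS + head`. In a real model the cached head outputs would come from forward hooks rather than random numbers.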