The paper describes selecting self-attention layers 4-14 for the copying. I can't find where in the code this selection is actually done. It appears to me that all 16 layers are being copied; is that correct?
Also, would it be possible to share the code for the ablation in Figure 10 of the paper?
Many thanks.