Reproducing SERAC and MEND results #447

Open
shariqahn opened this issue Dec 13, 2024 · 4 comments
Labels
question Further information is requested

Comments


shariqahn commented Dec 13, 2024

I understand from #442 that you provided the checkpoints for SERAC and MEND trained on the CounterFact dataset, but I am not seeing CounterFact results for Llama in any of your papers. Do you happen to have those results or the checkpoints trained on ZsRE so I can ensure I have reproduced your solution? I am getting the following results for ZsRE with the provided SERAC checkpoint:

Metrics Summary:  {'pre': {'rewrite_acc': 0.40287348593761324, 'rephrase_acc': 0.39428592943539903, 'portability': {'one_hop_acc': 0.566645032103213}}, 'post': {'rewrite_acc': 0.9630600327562718, 'rephrase_acc': 0.6961480793930168, 'locality': {'neighborhood_acc': 0.9986392371156113}, 'portability': {'one_hop_acc': 0.5736743852938706}}}

The post-edit rephrase_acc (about 0.70) is low compared with the post-edit rewrite_acc (about 0.96).
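For easier comparison, the summary above can be flattened into percentages with the pre/post change per metric. This is only a rough sketch and not part of EasyEdit: the helper names `flatten` and `summarize` are hypothetical, and the dict is copied verbatim from the output above.

```python
# Metrics dict copied verbatim from the "Metrics Summary" output above.
metrics = {
    "pre": {
        "rewrite_acc": 0.40287348593761324,
        "rephrase_acc": 0.39428592943539903,
        "portability": {"one_hop_acc": 0.566645032103213},
    },
    "post": {
        "rewrite_acc": 0.9630600327562718,
        "rephrase_acc": 0.6961480793930168,
        "locality": {"neighborhood_acc": 0.9986392371156113},
        "portability": {"one_hop_acc": 0.5736743852938706},
    },
}


def flatten(d, prefix=""):
    """Turn nested metric dicts into {'portability.one_hop_acc': value, ...}."""
    out = {}
    for key, value in d.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, name + "."))
        else:
            out[name] = value
    return out


def summarize(m):
    """Return rows of (metric, pre %, post %, change in points).

    Metrics that exist only after editing (e.g. locality) get None
    for the pre value and for the change.
    """
    pre, post = flatten(m["pre"]), flatten(m["post"])
    rows = []
    for name in sorted(post):
        post_pct = 100 * post[name]
        if name in pre:
            pre_pct = 100 * pre[name]
            rows.append((name, pre_pct, post_pct, post_pct - pre_pct))
        else:
            rows.append((name, None, post_pct, None))
    return rows


for name, pre_pct, post_pct, delta in summarize(metrics):
    pre_txt = "n/a" if pre_pct is None else f"{pre_pct:.2f}"
    delta_txt = "n/a" if delta is None else f"{delta:+.2f}"
    print(f"{name:28s} pre={pre_txt:>6s} post={post_pct:6.2f} change={delta_txt}")
```

Laid out this way, rephrase_acc gains roughly 30 points after editing while rewrite_acc gains roughly 56, which is the gap I am asking about.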

zxlzr added the "question" (Further information is requested) label Dec 14, 2024
shariqahn changed the title from "Reproducing SERAC results" to "Reproducing SERAC and MEND results" Dec 14, 2024
Collaborator

XeeKee commented Dec 17, 2024

Previously, some users did not have the resources needed to train SERAC and MEND on CounterFact.
To help, I found a checkpoint I had previously used on my local server and uploaded it to Google Drive.
To make it easier for other users, we included the link in the README.
Please note that we did not state that this checkpoint is intended for reproducing the results in the paper.

Contributor

zxlzr commented Dec 18, 2024

Sorry for the delay; our computing and storage resources are limited. We will run the training as soon as possible and upload the complete checkpoint to help you. Thank you for using EasyEdit!

Author

shariqahn commented Dec 21, 2024

I actually did manage to run SERAC training on ZsRE, but only with data/zsre/zsre_mend_train_10000.json rather than the entire ZsRE training set.

My portability results differ from what was reported in the README: I get 'portability': {'one_hop_acc': 0.3953153117763729}, as opposed to the 57.82 that was listed here. I see that similar results were reported here, and you said they match expectations for a smaller model, but I do not understand why they do not match the reported values.
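To put both numbers on the same scale, here is a quick calculation. It assumes the README's 57.82 is a percentage, while my one_hop_acc is a fraction between 0 and 1; the variable names are my own.

```python
# Value from my run (a fraction in [0, 1]).
measured_one_hop_acc = 0.3953153117763729

# Value listed in the README; assumed here to be a percentage.
readme_portability_pct = 57.82

measured_pct = 100 * measured_one_hop_acc
gap_points = readme_portability_pct - measured_pct

print(f"measured: {measured_pct:.2f}%")
print(f"README:   {readme_portability_pct:.2f}%")
print(f"gap:      {gap_points:.2f} percentage points")
```

Under that assumption the gap is a little over 18 percentage points, which seems too large to attribute to noise alone.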

I did have a similar issue to #123 and switched over to the https://huggingface.co/Cheng98/llama-160m model as you suggest, but I would not expect this change to lead to such different final values.

Contributor

zxlzr commented Dec 22, 2024

Sorry, we will try to handle this ASAP. Recently, computing resources have been extremely tight, making it very difficult to have machines available for debugging.
