
Cannot reproduce results in the table #3

Open
simplelifetime opened this issue Apr 21, 2023 · 7 comments

Comments

@simplelifetime

Many thanks for your work! I tried exactly the same settings, but I got different results on MMLU and BBH. The Alpaca-tuned LLaMA always performs worse than the original LLaMA (7B or 13B). Is there anything wrong with the loaded models?

@chiayewken
Collaborator

Thanks for raising this issue; we are currently investigating it. Based on initial checks, it may be due to the changed behavior of LlamaTokenizer when we upgraded the transformers library version (from git+https://github.com/huggingface/transformers.git@057e1d74733f52817dc05b673a340b4e3ebea08c to 4.28.1).

@simplelifetime
Author

Thanks a lot! So will the provided transformers version (git+https://github.com/huggingface/transformers.git@057e1d74733f52817dc05b673a340b4e3ebea08c) help me reproduce the correct results?

@chiayewken
Collaborator

chiayewken commented Apr 24, 2023

We are currently retesting the models, but it would be a great help if you could also try with the older transformers version (pip install git+https://github.com/huggingface/transformers.git@057e1d74733f52817dc05b673a340b4e3ebea08c). If you can also reproduce the results, then we will know the cause of the issue for sure, and we can revert to this transformers version in the short term. In the long term, we may need to debug the LlamaTokenizer in the newer library version.
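
A quick way to confirm which transformers build is active before re-running the eval (a minimal sketch; the exact version string printed depends on the pinned commit):

```python
# Print the installed transformers version to confirm which build is active.
import transformers

print(transformers.__version__)
# An install pinned to a git commit typically carries a ".dev0" suffix,
# while the newer PyPI release mentioned above reports "4.28.1".
```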

@chiayewken
Collaborator

We have confirmed that the problem is due to the transformers library version; this has been fixed in the latest commit. For example, the command python main.py mmlu --model_name llama --model_path chavinlo/alpaca-native gives a result of Average accuracy: 0.416. Would you mind checking on your end whether it is fixed for you too?

@simplelifetime
Author

Thanks for your reply. I've tried again and the result seems fine now. Could you provide more details about the cause of this problem, so I can avoid such version conflicts in the future? I'd be very grateful!

@chiayewken
Collaborator

No problem. We are still working to ensure that the issue is fully resolved in the newer transformers version; it is a subtle issue, as the newer LlamaTokenizer handles whitespace a bit differently.
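
For anyone who wants to see the difference directly, here is a minimal sketch (it assumes the chavinlo/alpaca-native checkpoint from the command above; run it once under each transformers install and compare the printed token lists):

```python
# Minimal sketch: inspect how LlamaTokenizer splits strings with leading/trailing
# whitespace, to compare the pinned commit against transformers 4.28.1.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("chavinlo/alpaca-native")
for text in ["Answer:", "Answer: ", " A", "A"]:
    print(repr(text), "->", tokenizer.tokenize(text))
# If the token sequences differ between the two installs, few-shot prompts that
# end near whitespace (e.g. "Answer: ") can produce different predictions.
```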

@sglucas

sglucas commented Oct 26, 2023

Hi, may I ask for an update on this issue?
