Larger 13B model underperforms BASE model, any idea why? #22
Comments
I tried to evaluate both unicamp-dl/mt5-base-en-msmarco and unicamp-dl/mt5-13b-mmarco-100k, but the performance of the 13B model is lower than that of the base model. Here is a simple comparison of reranking results on BM25 top-100 candidates, measured in nDCG@10. Did you observe a similar trend, or could there be any underlying reasons? @rodrigonogueira4

Hi @vjeronymo2 @lhbonifacio, do you maybe have a hint of what is going on here?

Hi @cramraj8

Hi @lhbonifacio, yes, I am evaluating on Mr. TyDi. I am a bit confused here. If I interpret your reply correctly, in summary: in the context of multilingual re-ranking, when the model size increases (580M --> 13B), should we also increase the number of training iterations or the training sample size?
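For anyone trying to reproduce the comparison in the original post, here is a minimal sketch of rescoring a BM25 candidate list with these checkpoints via Hugging Face transformers. It assumes the monoT5-style input template "Query: {q} Document: {d} Relevant:" and single-token relevance targets; the original monoT5 used "true"/"false", while the mT5 mMARCO rerankers are often applied with "yes"/"no", so check the model cards before relying on this. The `rerank` helper and constant names are illustrative, not from this repo.

```python
# Sketch of monoT5-style reranking with Hugging Face transformers.
# ASSUMPTIONS (verify against the model cards): the input template
# "Query: ... Document: ... Relevant:" and the "yes"/"no" target words.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "unicamp-dl/mt5-base-en-msmarco"  # or "unicamp-dl/mt5-13b-mmarco-100k"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).eval()

# Relevance target words are an assumption; monoT5 originally used "true"/"false".
POS_WORD, NEG_WORD = "yes", "no"
POS_ID = tokenizer.encode(POS_WORD, add_special_tokens=False)[0]
NEG_ID = tokenizer.encode(NEG_WORD, add_special_tokens=False)[0]


def rerank(query: str, passages: list[str]) -> list[tuple[float, str]]:
    """Return (score, passage) pairs sorted from most to least relevant."""
    results = []
    for passage in passages:
        inputs = tokenizer(
            f"Query: {query} Document: {passage} Relevant:",
            return_tensors="pt",
            truncation=True,
            max_length=512,
        )
        with torch.no_grad():
            # One decoder step: read the logits of the first generated token.
            out = model(
                **inputs,
                decoder_input_ids=torch.tensor(
                    [[model.config.decoder_start_token_id]]
                ),
            )
        logits = out.logits[0, 0, [POS_ID, NEG_ID]]
        # Log-probability of the positive word under a softmax over {pos, neg}.
        score = torch.log_softmax(logits, dim=0)[0].item()
        results.append((score, passage))
    return sorted(results, key=lambda pair: pair[0], reverse=True)
```

The reranked runs for both checkpoints can then be scored with a standard trec_eval-style tool to obtain the nDCG@10 numbers being compared here.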