LLM-based methods on the leaderboard - reg. #8

Open
ashok-arjun opened this issue Dec 8, 2024 · 3 comments

@ashok-arjun

Hi; this is really cool work!!! Thanks for putting this together!

I have an LLM-based method that simply prompts an LLM zero-shot, and we show that it achieves strong forecasts (see Direct Prompt in https://arxiv.org/abs/2410.18959; the method is described on page 41). Specifically, we benchmarked on data from after the LLM's cutoff date, so it is definitely not memorized.

Do you think adding this method to GIFT-Eval would be worthwhile? I'm on the fence because the test data in this benchmark is from public sources so there is no guarantee that LLMs haven't memorized it.

@liu-jc
Contributor

liu-jc commented Dec 8, 2024

Hi @ashok-arjun,

Thanks for opening this issue. In my view, it is worthwhile to add a new LLM-based method. That said, I understand your concern about whether the data has been memorized by LLMs.

Maybe later we can add a column indicating whether the model was trained with "data leakage" (e.g., [yes, no, 'potentially']). What do you think?

@ashok-arjun
Author

That's a good idea. I suspect it'll be useful in general, as there may be future LLM-based models that we want to benchmark.

@liu-jc
Contributor

liu-jc commented Dec 9, 2024

Yes, I totally agree. For your model, could you please provide the all_results.csv file and the model config file, and make a PR here? I would love to help you get it on the leaderboard :)
