Add llm-benchmarks proposal #113
By the way, my implementation is at https://github.com/IcyFeather233/ianvs/tree/dev. You can follow the README to try LLM single task learning and OpenCompass.
The PR should be translated into English so that people from other countries can understand it.
Thanks for the advice! I will translate it once the proposal content is finalized.
This proposal is related to #95.
Great to see a comprehensive proposal. This one is close to the final version. As discussed in the routine meeting, here are some suggestions.
- Although the directory layout is clear, an architecture diagram is still needed to clarify what this project modifies, e.g., what changes in TestEnv and TestCase. It seems that, at a minimum, the data format has been changed to support NLP. You might want to take a look at https://github.com/kubeedge/ianvs/pull/122/files for an architecture example.
- Prompts come in different forms, e.g., N-shot. How should we design a prompt template that adapts to all of them? This looks interesting and challenging.
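One possible answer to the N-shot question above, as a minimal sketch only: a single per-example format string is reused for every shot and for the final query, so zero-shot and N-shot prompts come from the same template. The class name `PromptTemplate`, its fields, and the `question`/`answer` keys are hypothetical and are not taken from the proposal or from ianvs.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class PromptTemplate:
    """Hypothetical template, not part of ianvs: one format string per example."""
    example_format: str = "Question: {question}\nAnswer: {answer}"
    instruction: str = ""
    answer_key: str = "answer"   # key left blank for the final query
    separator: str = "\n\n"

    def render(self, query: Dict[str, str],
               shots: Sequence[Dict[str, str]] = ()) -> str:
        parts = [self.instruction] if self.instruction else []
        # Each shot is a fully answered example rendered with the same format.
        parts.extend(self.example_format.format(**shot) for shot in shots)
        # The query reuses the format with an empty answer for the model to fill.
        blank_query = {**query, self.answer_key: ""}
        parts.append(self.example_format.format(**blank_query).rstrip())
        return self.separator.join(parts)


template = PromptTemplate(instruction="Answer the question.")
zero_shot = template.render({"question": "2+2?"})
one_shot = template.render(
    {"question": "2+2?"},
    shots=[{"question": "1+1?", "answer": "2"}],
)
```

Because the number of shots is just the length of the `shots` sequence, N is a runtime choice rather than a property of the template. Other dataset keys could be supported by changing `example_format` and `answer_key`.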
I have now added my changes to the ianvs core, including two flowcharts showing the architecture. The doc also has an updated, more complete benchmark format.
Signed-off-by: IcyFeather <[email protected]>
/lgtm
Overall looks good to me. Just a few comments to consider:
- The revision to the index in the env manager of the ianvs core should be highlighted: the proposed design aims to support both the old and the new index.
- The OpenCompass integration should also be highlighted in structure.png. A tutorial is needed.
- The prompt template is extendable, including its keys and prompts, which could also be highlighted in the proposal.
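To illustrate the first point, supporting both the old and the new index, here is a rough sketch of a loader that dispatches on the index format. Both formats are assumptions made for illustration: a whitespace-separated `<sample> <label>` text file for the old index and a JSON Lines file for the new one. The function name `load_index` and the `sample`/`label` keys are hypothetical and do not come from the ianvs core.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_index(path: str) -> List[Dict[str, str]]:
    """Hypothetical loader that accepts either assumed index format."""
    index_path = Path(path)
    text = index_path.read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]

    if index_path.suffix == ".jsonl":
        # Assumed new index: one JSON object per line.
        return [json.loads(line) for line in lines]

    # Assumed old index: "<sample> <label>" separated by whitespace.
    records = []
    for line in lines:
        sample, label = line.rsplit(maxsplit=1)
        records.append({"sample": sample, "label": label})
    return records
```

Dispatching on the file suffix keeps existing index files working unchanged while new benchmarks opt in to the richer format. A real implementation would also need to handle malformed lines.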
/lgtm
Signed-off-by: IcyFeather <[email protected]>
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: MooreZheng
Hi @IcyFeather233, @hsj576, @MooreZheng, I wanted to discuss one thing regarding this project. Is there any way I can chat with you about it? I am unable to find any of you on the Slack channel.
Hi, the Slack channel currently has networking issues for Chinese users. You might want to leave your comment here or raise an issue.
/lgtm
What type of PR is this?
OSPP proposal
What this PR does / why we need it:
Investigates various aspects of implementing the LLM benchmark in Ianvs, introduces the OpenCompass integration plan, and completes the dataset map.