Evaluation function based on LLM grading #7
base: main
Conversation
Commits:
- …eturn if a response is correct
- pre-commit.ci auto-fix (for more information, see https://pre-commit.ci)
- …e actual accuracy. Replace predicted with expected
Test-ran the example cases using […]
Looks good.
I like the idea. The only thing is that the implementation is very OpenAI-specific, which is probably fine as long as we make it clear. How about we break down evals.py into […]? The goal would be to […]
Should we begin with some sort of base eval function/class that could be used for other LLMs?
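One possible shape for such an abstraction, sketched below as a minimal Python interface. The names `Evaluator` and `is_correct` are illustrative, not from this PR:

```python
# Hypothetical sketch of a provider-agnostic eval base class, as floated
# in the comment above; all names here are illustrative, not the PR's code.
from abc import ABC, abstractmethod


class Evaluator(ABC):
    """Grading interface that any LLM backend could implement."""

    @abstractmethod
    def is_correct(self, question: str, expected: str, response: str) -> bool:
        """Return True if `response` matches `expected` for `question`."""
        raise NotImplementedError
```

An OpenAI-backed grader (like the one this PR adds) and any future backends would then share this interface, keeping the rest of evals.py provider-agnostic.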
Changes
- is_correct() evaluation function, which asks an LLM to return whether a response is correct (see the sketch after this list)
- Proof-of-life
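A minimal sketch of what such an LLM-grading function might look like, assuming the OpenAI Python client (openai>=1.0). The model name, grading prompt, and signature are assumptions for illustration, not this PR's actual implementation:

```python
# Illustrative sketch only: the prompt, model, and signature are
# assumptions, not this PR's code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

GRADER_PROMPT = (
    "You are grading a model's answer. Given a question, the expected "
    "answer, and the model's response, reply with exactly CORRECT or "
    "INCORRECT."
)


def is_correct(question: str, expected: str, response: str,
               model: str = "gpt-4o-mini") -> bool:
    """Ask an LLM to judge whether `response` matches `expected`."""
    completion = client.chat.completions.create(
        model=model,
        temperature=0,  # keep the grading as deterministic as possible
        messages=[
            {"role": "system", "content": GRADER_PROMPT},
            {"role": "user", "content": (
                f"Question: {question}\n"
                f"Expected answer: {expected}\n"
                f"Model response: {response}"
            )},
        ],
    )
    verdict = (completion.choices[0].message.content or "").strip().upper()
    # "INCORRECT" does not start with "CORRECT", so this check is unambiguous.
    return verdict.startswith("CORRECT")
```

Note that the grader compares the response against the expected answer, in line with the "Replace predicted with expected" commit above.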