Evaluation function based on LLM grading #7

Open · msaelices wants to merge 4 commits into main

Conversation

msaelices

Changes

  • New is_correct() evaluation function, which asks an LLM to judge whether a response is correct (a rough sketch of the general shape follows below)

Proof-of-life

[screenshot: 2023-05-24_12-35]
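For illustration, a minimal sketch of what an LLM-graded is_correct() could look like, assuming the pre-1.0 openai Python client and made-up argument names (a guess at the shape, not the PR's actual code):

```python
# Hypothetical sketch only -- the real implementation in this PR may differ.
import openai  # assumes the pre-1.0 openai-python ChatCompletion interface


def is_correct(response: str, expectation: str, model: str = "gpt-4") -> bool:
    """Ask an LLM grader whether `response` satisfies `expectation`."""
    prompt = (
        "You are a strict grader. Answer with a single word: YES or NO.\n"
        f"Expected behavior: {expectation}\n"
        f"Candidate response: {response}\n"
        "Is the candidate response correct?"
    )
    completion = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    answer = completion["choices"][0]["message"]["content"]
    return answer.strip().upper().startswith("YES")
```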

@edwardmfho

Test-ran the example cases using gpt-3.5-turbo instead of the default gpt-4 (still waiting for GPT-4 API access). It is a good function to add.
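Assuming the grader takes a model argument as in the sketch above (an assumption, not something the PR confirms), swapping in gpt-3.5-turbo would look roughly like:

```python
# Hypothetical call -- argument names are illustrative, not the PR's actual API.
is_correct(response="Paris", expectation="Names the capital of France", model="gpt-3.5-turbo")
```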


@edwardmfho left a comment


Looks good.

@mistercrunch (Member) commented Jun 19, 2023

I like the idea; the only thing is that the implementation is very OpenAI-specific, which is probably fine as long as we make that clear. How about we break evals.py down into evals/__init__.py and evals/openai.py?

The goal would be to import is_correct from promptimize.evals.openai.
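A sketch of the layout that suggestion implies (hypothetical; not part of this PR as submitted):

```python
# Suggested package layout (illustrative):
#
#   promptimize/
#       evals/
#           __init__.py   # provider-agnostic evaluators and shared helpers
#           openai.py     # OpenAI-specific graders such as is_correct
#
# Callers would then import the OpenAI-backed grader explicitly:
from promptimize.evals.openai import is_correct
```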

@edwardmfho

Should we begin with some sort of base eval function/class that could be used for other LLMs?
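One possible shape for that, as a hedged sketch (class and method names are made up, not promptimize API):

```python
# Hypothetical base class for provider-agnostic LLM grading; illustrative only.
from abc import ABC, abstractmethod


class BaseLLMGrader(ABC):
    """Interface an LLM-backed grader could implement for any provider."""

    grading_prompt = (
        "Answer with a single word: YES or NO.\n"
        "Expected behavior: {expectation}\n"
        "Candidate response: {response}\n"
        "Is the candidate response correct?"
    )

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Send the grading prompt to a specific LLM provider and return its text."""

    def is_correct(self, response: str, expectation: str) -> bool:
        prompt = self.grading_prompt.format(expectation=expectation, response=response)
        return self.complete(prompt).strip().upper().startswith("YES")
```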

