added framework to score evaluation metrics #33
@oindrillac The additions look great 🎉 I left a few comments, mainly around structuring the notebook.
"id": "b74e0295-9269-4dd4-8e38-b6bb22c679db",
"metadata": {},
"source": [
"## Quantitative Evaluation"
Should we maybe move the quantitative evaluation, along with the examples, to a new notebook? This notebook seems to be getting quite lengthy.
},
"outputs": [],
"source": [
"def get_response(model_id, file, functions, classes, documentation, imports, other, functions_code, functions_doc, classes_code, classes_doc):\n",
We can perhaps move these 3 functions to a helper_functions.ipynb and import them from there.
"id": "812910d9-9b4f-4430-bffe-d58bb4b67083",
"metadata": {},
"source": [
"## Copy this section, modify and run from here"
Maybe we can wrap all the steps being done here into a function (and include it in helper_functions.ipynb) and invoke this function to run each example, wdyt?
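A rough sketch of what such a wrapper could look like, for discussion only. The name `run_example`, the layout of the `example` dict, and the `generate` and `metrics` parameters are all hypothetical; `generate` stands in for whatever callable produces the model output (for instance a thin wrapper around `get_response`), and the notebook's actual steps may differ.

```python
def run_example(model_id, example, generate, metrics):
    """Run one example end to end and return its scores as a flat record.

    All names here are illustrative assumptions, not the notebook's API:
      example  -- dict with "name", "inputs" (kwargs for generate), "reference"
      generate -- callable(model_id, **inputs) -> generated text
      metrics  -- dict mapping metric name -> callable(generated, reference) -> float
    """
    # Step 1: produce the model output for this example.
    generated = generate(model_id, **example["inputs"])

    # Step 2: score the output with every metric and collect one record.
    record = {"model_id": model_id, "example": example["name"]}
    for name, metric in metrics.items():
        record[name] = metric(generated, example["reference"])
    return record
```

Each example cell would then reduce to building the `example` dict and calling `run_example(...)`, so the copy, modify, and run section no longer needs to be duplicated per example.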
Approving this, since we are addressing the refactoring comments in a separate PR.
To drill down on the best GenAI evaluation criteria, I added a framework to obtain a quantitative evaluation matrix to determine how often these scores are valid by
I added 6 examples. @hemajv @aakankshaduggal feel free to follow the same example structure and append more examples to the dataframe. The dataframe can be imported from a pickle file in the same folder.
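For anyone adding examples, the load, append, and save cycle might look roughly like the sketch below. The column names in `COLUMNS` and the helper `append_example` are assumptions made for illustration; the real dataframe's schema and the pickle file's name are not shown in this PR, so adjust both to match the notebook.

```python
import os

import pandas as pd

# Hypothetical schema; replace with the columns the notebook's dataframe uses.
COLUMNS = ["example_id", "metric", "score", "score_is_valid"]


def append_example(pickle_path, row):
    """Append one row (a dict keyed by COLUMNS) to the pickled dataframe."""
    new_row = pd.DataFrame([row], columns=COLUMNS)
    if os.path.exists(pickle_path):
        # Load the existing examples and add the new one at the end.
        df = pd.concat([pd.read_pickle(pickle_path), new_row], ignore_index=True)
    else:
        # No pickle yet: start a fresh dataframe from this row.
        df = new_row
    df.to_pickle(pickle_path)
    return df
```

Writing the pickle back after every append keeps the file in the folder in sync with the notebook, at the cost of rewriting it each time; with only a handful of examples that should be negligible.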