This PR adds a script to support improving the performance (accuracy, cost and latency) of a vanna app.
The Problem:
Context:
`vn.ask()` carries out RAG in multiple steps that can all be optimised. Further improvements to vanna in the future could open up even more possibilities.
The solution:
The script uses `trulens-eval` and allows configuration of what is to be evaluated, and how. It presents the results in a dashboard (see the doc for visuals).
Evaluating the system with TruLens requires no changes to vanna itself (just adding a log to the vanna model). An alternative would be to include evaluation in the app's code itself, but this might require major refactoring to decouple the vanna components.
Other evaluation frameworks exist, though not many as of yet.
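As a rough illustration of evaluating from the outside, without touching the app's code, here is a minimal sketch in plain Python. It does not use the TruLens API and is not the script in this PR: `Recorder`, `exact_match_rate` and `fake_ask` are hypothetical names, and `fake_ask` is only a stand-in for `vn.ask()`.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CallRecord:
    question: str
    answer: str
    latency_s: float


@dataclass
class Recorder:
    """Wraps an app callable and logs each call without modifying the app."""
    ask: Callable[[str], str]
    records: List[CallRecord] = field(default_factory=list)

    def __call__(self, question: str) -> str:
        start = time.perf_counter()
        answer = self.ask(question)
        elapsed = time.perf_counter() - start
        self.records.append(CallRecord(question, answer, elapsed))
        return answer


def exact_match_rate(records: List[CallRecord], expected: Dict[str, str]) -> float:
    """Fraction of logged answers equal to the expected answer for their question."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if expected.get(r.question) == r.answer)
    return hits / len(records)


# Hypothetical stand-in for vn.ask(); a real app would call the model here.
def fake_ask(question: str) -> str:
    canned = {"How many customers are there?": "SELECT COUNT(*) FROM customers"}
    return canned.get(question, "SELECT 1")


recorded_ask = Recorder(fake_ask)

# Hypothetical example prompts with the SQL we would accept as correct.
expected = {
    "How many customers are there?": "SELECT COUNT(*) FROM customers",
    "List all orders": "SELECT * FROM orders",
}

for q in expected:
    recorded_ask(q)

accuracy = exact_match_rate(recorded_ask.records, expected)
mean_latency = sum(r.latency_s for r in recorded_ask.records) / len(recorded_ask.records)
print(f"accuracy={accuracy:.2f}, mean latency={mean_latency:.6f}s")
```

Exact string matching is a deliberately crude accuracy measure, chosen here only to keep the sketch short. A real evaluation of generated SQL would more likely compare query results or use feedback functions.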
Tests performed
Manual testing only, using a few example prompts (shown in the code). No unit tests.