Draft of suggested changes to the vectorstore #579
This PR includes a draft implementation of how I would design the vectorstore client APIs. The idea is to stay consistent with the rest of the client binding, honor EdgeQL, and be Pythonic, while making the API easier to use.
I didn’t know the team had agreed on the APIs, so please just treat this PR as a reference and cherry-pick what you need.
Refs #573
Usage
Adding records:
Search by vector similarity and filter by metadata:
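As a rough illustration of the behaviour described (not the API proposed in this PR), here is a toy in-memory sketch of ranking by cosine similarity while filtering on metadata equality. `Record`, `search`, and the keyword-argument filter style are all hypothetical names chosen for the example. Treating a missing metadata key as "no match" is an assumption about the desired behaviour.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical record shape: text, its embedding, and free-form metadata.
    text: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query, *, limit=3, **metadata_equals):
    """Rank records by cosine similarity, keeping only metadata matches.

    Assumption: a record that lacks a requested key never matches.
    """
    matches = [
        r for r in records
        if all(k in r.metadata and r.metadata[k] == v
               for k, v in metadata_equals.items())
    ]
    matches.sort(key=lambda r: cosine_similarity(r.embedding, query),
                 reverse=True)
    return matches[:limit]


records = [
    Record("cats purr", [1.0, 0.0], {"lang": "en"}),
    Record("dogs bark", [0.0, 1.0], {"lang": "en"}),
    Record("les chats", [0.9, 0.1], {"lang": "fr"}),
    Record("untagged", [1.0, 0.0]),  # no "lang" key at all
]
results = search(records, [1.0, 0.0], lang="en")
print([r.text for r in results])
```

The untagged record is the closest vector to the query, yet it is excluded because it has no `lang` key. That is the kind of missing-key case a simple filter API has to handle for the user.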
(Note: I find that the previous EdgeQL-query-builder-like API is neither easy to use nor effective at keeping the user away from EdgeQL. Users could easily end up composing a complex filter with many and/or clauses without handling empty sets or missing keys correctly, and then be surprised by unexpected results. Even after learning EdgeQL they would have no way to correct it, because it is an incomplete query builder. Since we cannot easily introduce a full-blown query builder into just the vectorstore API, why not use the simplest, most native API and hide the EdgeQL complications from AI users?)
Update records:
All of the above just works inside EdgeDB transactions:
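To make the transactional expectation concrete, here is a toy sketch of the all-or-nothing behaviour, with a plain dict standing in for the store. `toy_transaction` is a hypothetical helper written for this example and is not part of the EdgeDB client or of this PR.

```python
from contextlib import contextmanager


@contextmanager
def toy_transaction(storage):
    """Hypothetical stand-in for a database transaction over a dict."""
    pending = dict(storage)  # all writes go to a private copy
    yield pending
    # Reached only if the with-block did not raise: commit the copy.
    storage.clear()
    storage.update(pending)


store = {"doc-1": "old text"}

# Both the add and the update become visible together on success.
with toy_transaction(store) as tx:
    tx["doc-2"] = "added"
    tx["doc-1"] = "updated"

# A failure inside the block discards every pending write.
try:
    with toy_transaction(store) as tx:
        tx["doc-3"] = "never committed"
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

print(store)
```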
The asynchronous interface is just the same, with an explicit await before embedding generation:
This makes the model call visible, so that users generate embeddings outside of transactions:
No await is needed when the model is not called for generation:
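The pattern of awaiting the model first and only then opening a transaction can be sketched with a toy example. Everything here is hypothetical: `fake_embed` stands in for a call to an embedding model, and `ToyTransaction` buffers writes in memory instead of talking to a database, so its `add` method has nothing to await. It is not the interface proposed in this PR.

```python
import asyncio


async def fake_embed(texts):
    # Stand-in for a slow network call to an embedding model (hypothetical).
    await asyncio.sleep(0)
    return [[float(len(t)), 1.0] for t in texts]


class ToyTransaction:
    """Hypothetical all-or-nothing batch of writes over a dict."""

    def __init__(self, storage):
        self._storage = storage
        self._pending = {}

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        if exc_type is None:
            self._storage.update(self._pending)  # commit
        return False  # on error, pending writes are dropped and it re-raises

    def add(self, text, embedding):
        # Embedding is precomputed, so no model call and no await here.
        self._pending[text] = embedding


async def main():
    storage = {}
    texts = ["hello", "hi"]
    # Slow step first, outside the transaction.
    embeddings = await fake_embed(texts)
    # The transaction then stays short: it only records the writes.
    async with ToyTransaction(storage) as tx:
        for text, emb in zip(texts, embeddings):
            tx.add(text, emb)
    return storage


stored = asyncio.run(main())
print(stored)
```

Keeping the model call outside the `async with` block means the transaction is not held open while waiting on the embedding service.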