metadata filter #422

Open
pmeier opened this issue May 21, 2024 · 0 comments

Feature description

Add the metadata filter abstraction proposed in #256. It does not have to be connected to ragna.core.Chat yet, as this can be done in a follow-up PR. I just want to have it well tested before moving forward with the rest of the design.

The MetadataFilter object should have the following properties:

  • It needs to support the following operators:

    • logical and
    • logical or
    • equal
    • not equal
    • less than
    • less than or equal
    • greater than
    • greater than or equal
    • in
    • not in

    I have surveyed the ecosystem and this set of operators is supported by Chroma, LanceDB, Pinecone, and Vectara, which makes it a good base set.

  • It needs to have an "escape hatch operator" that can be used to supply a raw filter to the source storage. This is useful when the user only uses one source storage (production use case) and that storage supports more operations than the base set. This way, we are not restricting users to the set of operators above.

  • It needs to be fully JSON (de-)serializable.
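
To make the three requirements concrete, here is a minimal sketch of what such an object could look like. It is not the implementation from the gist linked below: the names `MetadataOperator`, `to_primitive`, `from_primitive`, `and_`, `or_`, and `raw`, as well as the `{"operator", "key", "value"}` wire format, are assumptions made for illustration only.

```python
from __future__ import annotations

import enum
import json
from typing import Any


class MetadataOperator(enum.Enum):
    # Hypothetical names; the issue only lists the operators in prose.
    RAW = "raw"  # escape hatch: value is passed to the source storage as is
    AND = "and"
    OR = "or"
    EQ = "eq"
    NE = "ne"
    LT = "lt"
    LE = "le"
    GT = "gt"
    GE = "ge"
    IN = "in"
    NOT_IN = "not_in"


_LOGICAL = (MetadataOperator.AND, MetadataOperator.OR)


class MetadataFilter:
    def __init__(self, operator: MetadataOperator, key: str | None, value: Any) -> None:
        self.operator = operator
        self.key = key  # None for logical and raw filters
        self.value = value  # list of child filters for AND / OR

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, MetadataFilter):
            return NotImplemented
        return (self.operator, self.key, self.value) == (
            other.operator,
            other.key,
            other.value,
        )

    @classmethod
    def raw(cls, value: Any) -> MetadataFilter:
        # The raw value must itself be JSON serializable to round-trip.
        return cls(MetadataOperator.RAW, None, value)

    @classmethod
    def and_(cls, children: list[MetadataFilter]) -> MetadataFilter:
        return cls(MetadataOperator.AND, None, children)

    @classmethod
    def or_(cls, children: list[MetadataFilter]) -> MetadataFilter:
        return cls(MetadataOperator.OR, None, children)

    def to_primitive(self) -> dict[str, Any]:
        # Recursively convert to plain dicts and lists that json.dumps accepts.
        if self.operator in _LOGICAL:
            value = [child.to_primitive() for child in self.value]
        else:
            value = self.value
        return {"operator": self.operator.value, "key": self.key, "value": value}

    @classmethod
    def from_primitive(cls, obj: dict[str, Any]) -> MetadataFilter:
        operator = MetadataOperator(obj["operator"])
        value = obj["value"]
        if operator in _LOGICAL:
            value = [cls.from_primitive(child) for child in value]
        return cls(operator, obj["key"], value)


# Usage: build a nested filter, serialize it, and restore it.
example = MetadataFilter.and_(
    [
        MetadataFilter(MetadataOperator.EQ, "author", "pmeier"),
        MetadataFilter.or_(
            [
                MetadataFilter(MetadataOperator.GE, "year", 2023),
                MetadataFilter(MetadataOperator.IN, "tag", ["design", "rfc"]),
            ]
        ),
        MetadataFilter.raw("year > 2020"),
    ]
)
serialized = json.dumps(example.to_primitive())
restored = MetadataFilter.from_primitive(json.loads(serialized))
```

In this sketch each source storage would be responsible for translating the tree into its own filter syntax, and would pass the value of a `RAW` node through untouched. That keeps the base operator set portable while leaving the escape hatch storage-specific.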

Value and/or benefit

This is the first building block for allowing users to keep a corpus of documents instead of uploading new ones for every chat.

Anything else?

I have a complete implementation already in https://gist.github.com/pmeier/38ee90be6c30ecdf9bbec086a0dabafe. Unless there are some concerns, it can just be ported.

Status: Todo