[BUG] RecordManager Does Not Properly Cleanup Vector Store's OLD VALUES #3570

Open
dkindlund opened this issue Nov 25, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@dkindlund
Contributor

Describe the bug
In Flowise, I created a new Document Store in which an Airtable Document Loader feeds data into a Postgres Vector Store. We also use Postgres to store the Record Manager records. Whenever I update an existing record in Airtable and then run the Upsert operation, a new vector store record is created, but the OLD vector store record still EXISTS. The OLD record is never deleted!

To Reproduce
Steps to reproduce the behavior:

  1. Create a new Document Store
  2. Specify an Airtable Document Loader
  3. Specify a Postgres Vector Store and Record Manager (with Cleanup = FULL)
  4. Perform initial Upsert
  5. Change a single existing Airtable record
  6. Perform second Upsert
  7. Verify that BOTH the NEW and the OLD vector store records STILL exist in the vector store

Expected behavior
I would expect that Upserting would purge the OLD vector store record.
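
To make the expectation concrete, here is a minimal in-memory sketch of a FULL cleanup. It is a toy model, not Flowise's or LangChain's actual code: `ToyStore`, `doc_key`, and `upsert_full` are hypothetical names, and the SHA-256 content hash is only an assumption standing in for whatever key the real Record Manager computes. The point it illustrates is that cleanup can only remove old vectors if the vector store id and the record manager key are the same value.

```python
import hashlib
from dataclasses import dataclass, field


def doc_key(text: str) -> str:
    """Hypothetical content hash used as the record key (real hashing may differ)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class ToyStore:
    vectors: dict = field(default_factory=dict)  # vector store: id -> document text
    records: dict = field(default_factory=dict)  # record manager: key -> last run that saw it

    def upsert_full(self, texts, run):
        """Upsert all texts for this run, then delete everything not seen in it."""
        for text in texts:
            key = doc_key(text)
            if key not in self.records:
                self.vectors[key] = text  # vector id is deliberately the record key
            self.records[key] = run       # mark the key as seen in this run
        # FULL cleanup: any key last seen in an earlier run is stale.
        stale = [k for k, seen in self.records.items() if seen < run]
        for key in stale:
            self.vectors.pop(key, None)   # only works because id == key
            del self.records[key]
        return stale


store = ToyStore()
store.upsert_full(["alpha v1", "beta"], run=1)
removed = store.upsert_full(["alpha v2", "beta"], run=2)  # "alpha v1" was edited
print(sorted(store.vectors.values()), len(removed))
```

In this model the second upsert leaves only the edited document and the unchanged one. If `self.vectors` were keyed by some other id (for example a freshly generated UUID), the `pop` would silently do nothing and the old vector would remain, which matches the symptom described in this issue.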

Setup

  • Installation: docker
  • Flowise Version: 2.1.2
  • OS: Google Cloud Run Linux
  • Browser: chrome

Additional context
My vector store table is named mechanisms_vec_v1 and my record manager table is named mechanisms_rm_v1. I THINK the deletion logic is supposed to match rows where mechanisms_vec_v1.id = mechanisms_rm_v1.key. HOWEVER, in my case, there is NO OVERLAP IN VALUES. For some reason, when a vector store document is inserted, that record's mechanisms_vec_v1.id IS NEVER THE SAME VALUE as its corresponding mechanisms_rm_v1.key value.

Because of this, I think there's a BUG somewhere between loading the Airtable data and adding it to the vector store. Specifically, I think the computed mechanisms_vec_v1.id value might NOT be getting set the same way as the mechanisms_rm_v1.key value.
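
One way to check the "no overlap" observation is to count rows where the two columns match. The sketch below uses an in-memory SQLite database as a stand-in for Postgres so that it runs anywhere. The table and column names come from this issue, but the schemas are assumptions (a single TEXT column each), and `make_demo_db` and `count_overlap` are hypothetical helpers. Against a real Postgres instance the `id` column may be a `uuid`, in which case the comparison would likely need a cast such as `v.id::text = r.key`.

```python
import sqlite3


def make_demo_db(vec_ids, rm_keys):
    """Build a toy database with the two tables named in this issue (assumed schemas)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mechanisms_vec_v1 (id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE mechanisms_rm_v1 (key TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO mechanisms_vec_v1 VALUES (?)", [(v,) for v in vec_ids])
    conn.executemany("INSERT INTO mechanisms_rm_v1 VALUES (?)", [(k,) for k in rm_keys])
    return conn


def count_overlap(conn):
    """Count vector store rows whose id equals some record manager key."""
    query = (
        "SELECT COUNT(*) FROM mechanisms_vec_v1 v "
        "JOIN mechanisms_rm_v1 r ON v.id = r.key"
    )
    return conn.execute(query).fetchone()[0]


# Disjoint ids and keys, as reported in this issue: no row can ever be matched.
broken = make_demo_db(["uuid-1", "uuid-2"], ["hash-a", "hash-b"])
# Matching ids and keys, which is what key-based cleanup would need.
healthy = make_demo_db(["hash-a", "hash-b"], ["hash-a", "hash-b"])
print(count_overlap(broken), count_overlap(healthy))
```

A count of zero on a populated store would support the hypothesis that the ids and keys are computed differently. A count equal to the number of record manager rows would point the investigation elsewhere.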

It's exceptionally hard to figure out exactly where the bug is located because of all the layers of indirection and the use of built-in langchain libraries, but I hope these clues help, @HenryHengZJ. Thanks in advance!

@dkindlund
Contributor Author

Here's what my RecordManager looks like:

[Image: rm]

@HenryHengZJ added the bug label Nov 25, 2024
@HenryHengZJ
Contributor

Can you help me confirm two things:

  • Does it happen only for Airtable?
  • Does it happen only for FULL cleanup? What about incremental?

@dkindlund
Contributor Author

Hey @HenryHengZJ, let me upgrade to the latest version to confirm whether this bug still exists. I'll follow up with answers to your questions after that. Thanks!
