No "Zap" action available, neither a "iterateDocumentIds()" - how to identify dead documents for removal? #87
Replies: 2 comments 1 reply
-
Let me try to understand: you have IDs that were removed from an external service, and those IDs/documents now need to be removed from Loupe as well? As Loupe is just a search engine, maybe you could illustrate what feature you're missing by referencing how other search engines do it? I'd be especially interested in Meilisearch because Loupe works similarly from an API standpoint :)
-
For the sake of simplicity, let's assume we have a directory containing text files:
Then, some person updates the documents and adds a file A cronjob now iterates over the directory again and adds/updates all documents (we can probably fine-tune this by validating Problem is, we cannot detect that I'm searching for something like this: https://www.meilisearch.com/docs/reference/api/documents#delete-all-documents (called zap in DBase, truncate in MySQL) to be able to start clean. For iterating the documents (or at least their IDs), I only found https://www.meilisearch.com/docs/reference/api/documents#get-documents-with-post which might provide the desired result when
-
I wanted to implement a "reindex" operation but am unsure how to do this (aside from deleting the data directory and reinstantiating Loupe).
IMO we could have two useful methods to implement this:
- Loupe::zap(): void - which will remove all documents from Loupe index and therefore allows to "start over" (essentially the same as removing the data directory but allows to skip the reinstantiating).
- Loupe::iterateDocumentIds(): iterable<string> - this could be a generator iterating over all document ids which could then be used to check each id if it is a "dead" one.
Background is: we have external changes to the contents of the document directory and files will "disappear" - we now have no way to identify the differences.
Iterating over all files allows us to add/update documents, but we can never remove a document from the index unless we know that it existed.
An ugly workaround would be to keep track of the indexed IDs in a separate file, but that feels like duplicating the storage.
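To make that workaround concrete, here is a minimal sketch of the separate tracking file. It is written in Python only to show the bookkeeping and does not call Loupe at all. It assumes that the file name doubles as the document ID and that the tracking file lives outside the document directory; the names find_dead_ids, save_manifest and the JSON manifest format are hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_manifest(manifest_path: Path) -> set[str]:
    """Return the IDs recorded by the previous run (empty on the first run)."""
    if not manifest_path.exists():
        return set()
    return set(json.loads(manifest_path.read_text(encoding="utf-8")))


def save_manifest(manifest_path: Path, ids: set[str]) -> None:
    """Record the IDs that are in the index after a successful run."""
    manifest_path.write_text(json.dumps(sorted(ids)), encoding="utf-8")


def find_dead_ids(doc_dir: Path, manifest_path: Path) -> tuple[set[str], set[str]]:
    """Compare the directory with the previous run's manifest.

    Returns (current_ids, dead_ids): IDs to add/update, and IDs that were
    indexed earlier but whose files have disappeared since.
    """
    # Assumption: the file name is used as the document ID.
    # The manifest must not be stored inside doc_dir, or it would be listed too.
    current_ids = {p.name for p in doc_dir.iterdir() if p.is_file()}
    dead_ids = load_manifest(manifest_path) - current_ids
    return current_ids, dead_ids
```

A cron run would add/update everything in the first set, delete everything in the second set from the index, and only then call save_manifest, so that a failed run does not lose track of the dead IDs.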
What do you think, would something like this be useful in Loupe?