[Question]: How to delete nodes using node ids from index which is VectorStoreIndex instance when created directly using nodes instead of documents? #9245

Closed
1 task done
pavansandeep2910 opened this issue Dec 1, 2023 · 7 comments
Labels
question Further information is requested

Comments

@pavansandeep2910

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

How to delete nodes using node ids from the index which is a VectorStoreIndex instance when created directly using nodes instead of documents?

I tried the delete_nodes method to delete nodes using node IDs, but got the following error:

Traceback (most recent call last):
  File "/Users/pavan/Documents/GitHub/sentinel-ai-analytics-backend/scripts/create_few_shot_samples_embeddings.py", line 81, in <module>
    index.delete_nodes(deletableNodeIds)
  File "/Users/pavan/Documents/GitHub/sentinel-ai-analytics-backend/venv/lib/python3.11/site-packages/llama_index/indices/vector_store/base.py", line 286, in delete_nodes
    raise NotImplementedError(
NotImplementedError: Vector indices currently only support delete_ref_doc, which deletes nodes using the ref_doc_id of ingested documents.
@pavansandeep2910 pavansandeep2910 added the question Further information is requested label Dec 1, 2023

dosubot bot commented Dec 1, 2023

🤖

Hello @pavansandeep2910,

Nice to meet you! I'm Dosu, a friendly bot here to assist you with your queries about LlamaIndex, help solve bugs, or guide you to become a contributor. While we wait for a human maintainer to join us, feel free to ask any questions you might have. I'm here to help!

Based on the error message you provided, it seems you're calling the delete_nodes method on a VectorStoreIndex. That method is not currently implemented for VectorStoreIndex, as the NotImplementedError indicates, so deleting nodes directly by their node IDs is not available in the current version of LlamaIndex.

However, you can delete nodes indirectly by deleting the reference document that they are associated with. This can be done using the delete_ref_doc method, which deletes a document and its nodes using the ref_doc_id of the ingested document.

Here is an example of how you can use this method:

ref_doc_id = "<your_ref_doc_id>"
vector_store_index_instance.delete_ref_doc(ref_doc_id)

This will delete the document with the given ref_doc_id and all its associated nodes from the index. If you want to also delete the document from the document store, you can do so by setting the delete_from_docstore parameter to True:

vector_store_index_instance.delete_ref_doc(ref_doc_id, delete_from_docstore=True)

Please note that this will only work if the nodes you want to delete are associated with the same reference document. If they are not, you will need to call delete_ref_doc multiple times with the different ref_doc_ids.
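
The multiple-call case described above can be wrapped in a small helper. This is a minimal sketch rather than anything shipped with LlamaIndex: delete_ref_docs is a hypothetical name, and the only assumption about the index object is that it exposes delete_ref_doc(ref_doc_id, delete_from_docstore=...) as used in this comment.

```python
from typing import Iterable, List


def delete_ref_docs(
    index,
    ref_doc_ids: Iterable[str],
    delete_from_docstore: bool = False,
) -> List[str]:
    """Call index.delete_ref_doc once per distinct ref_doc_id.

    Hypothetical helper, not a LlamaIndex API. `index` is assumed to expose
    delete_ref_doc(ref_doc_id, delete_from_docstore=...).
    Returns the ids passed to delete_ref_doc, in first-seen order.
    """
    deleted: List[str] = []
    # dict.fromkeys drops duplicate ids while preserving their order
    for ref_doc_id in dict.fromkeys(ref_doc_ids):
        index.delete_ref_doc(ref_doc_id, delete_from_docstore=delete_from_docstore)
        deleted.append(ref_doc_id)
    return deleted
```

Deduplicating first avoids asking the index to delete the same reference document twice when several of the unwanted nodes share one ref_doc_id.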

This information is based on the llama_index/indices/vector_store/base.py file in the LlamaIndex repository and a similar issue [Feature Request]: delete and update node that was resolved recently.

I hope this helps! If you have any other questions, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@pavansandeep2910
Author

@dosu-bot but I don't have ref_doc_ids, as I created the index directly from the nodes without creating documents. Is there any other way to delete nodes?

@logan-markewich
Collaborator

@pavansandeep2910 there is not. It needs to be implemented.

Feel free to open a PR though


dosubot bot commented Dec 1, 2023

🤖

Based on the information you've provided and the context from the LlamaIndex repository, it seems that you're encountering a NotImplementedError when trying to use the delete_nodes method in the VectorStoreIndex class. This error occurs because the delete_nodes method is not currently implemented for vector indices in the LlamaIndex framework. The error message suggests using the delete_ref_doc method, which deletes nodes using the ref_doc_id of ingested documents. However, as you've mentioned, you don't have ref_doc_ids because you created the index directly from nodes without creating documents.

In the LlamaIndex framework, a ref_doc_id and a node_id serve different purposes. A ref_doc_id is a reference to the original document that was ingested into the index and is associated with all the nodes that were generated from that document. A node_id is a unique identifier for each node in the index. If an index is created directly from nodes instead of documents using the build_index_from_nodes method, a ref_doc_id will not be generated. This is because the ref_doc_id is specifically associated with the ingestion of documents. When creating an index from nodes, there is no reference to a document, and hence no ref_doc_id is generated. You can find more details about this in the issue #9209.
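
To see which nodes can and cannot be reached through delete_ref_doc, the relationship described above can be checked before deleting anything. The sketch below is an illustration under assumptions: group_node_ids_by_ref_doc is a hypothetical helper, and it relies only on each node object carrying node_id and ref_doc_id attributes, with ref_doc_id being None when the node has no source document.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_node_ids_by_ref_doc(nodes: Iterable) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split nodes into those with a ref_doc_id and those without.

    Hypothetical helper, not a LlamaIndex API. Each node is assumed to have
    `node_id` and `ref_doc_id` attributes (ref_doc_id may be None).
    Returns (ref_doc_id -> [node_id, ...], [node_id of nodes with no ref doc]).
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    orphans: List[str] = []
    for node in nodes:
        ref_doc_id = getattr(node, "ref_doc_id", None)
        if ref_doc_id is None:
            # No source document: delete_ref_doc cannot address this node
            orphans.append(node.node_id)
        else:
            grouped[ref_doc_id].append(node.node_id)
    return dict(grouped), orphans
```

Any node ID that ends up in the second list is exactly the case raised in this issue: it has no ref_doc_id to pass to delete_ref_doc.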

As for deleting nodes from the index: the index interface defines a delete_nodes method that takes a list of node IDs and an optional boolean parameter delete_from_docstore. If delete_from_docstore is set to True, the nodes are also deleted from the document store. On an index class that implements the method, the call looks like this:

index.delete_nodes(['node_id_1', 'node_id_2'], delete_from_docstore=True)

This would delete the nodes with IDs 'node_id_1' and 'node_id_2' from the index and the document store. You can find more details about this in the LlamaIndex source code.

However, the VectorStoreIndex class does not implement delete_nodes, which is why you're seeing the NotImplementedError. This is a known limitation of the LlamaIndex framework, as mentioned in issue #6250. As of now, there is no workaround for this in the VectorStoreIndex class. If you need to delete individual nodes, you might need to consider a different index class that supports the delete_nodes method.
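
Since support for delete_nodes differs between index classes, calling code can guard against the NotImplementedError shown in the traceback. This is a minimal sketch under assumptions: try_delete_nodes is a hypothetical helper, and the index is assumed to accept delete_nodes(node_ids, delete_from_docstore=...) as in the call above. Note that when the method is supported, the helper really does delete the nodes; it is not a dry run.

```python
from typing import List


def try_delete_nodes(
    index,
    node_ids: List[str],
    delete_from_docstore: bool = False,
) -> bool:
    """Attempt index.delete_nodes and report whether the index supports it.

    Hypothetical helper, not a LlamaIndex API. Returns True if the call
    completed, False if the index raised NotImplementedError.
    """
    try:
        index.delete_nodes(node_ids, delete_from_docstore=delete_from_docstore)
    except NotImplementedError:
        # The index class does not support deletion by node ID
        return False
    return True
```

A False result tells the caller to fall back to delete_ref_doc (if ref_doc_ids exist) or to rebuild the index without the unwanted nodes.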


@npulaparthi

npulaparthi commented Aug 9, 2024

Hi @dosubot,

I am facing the same issue when deleting data from a GPTVectorStoreIndex. I tried the approach given above, but I can't work out how to load my existing index: I read vector_store.json and tried to use it as an IndexDict, but I get a type error.
How can I pass the vector index I have already created to vector_store_index_instance so that I can delete the document nodes I don't want?

vector_store_index_instance = GPTVectorStoreIndex(BaseIndex[IndexDict])

vector_store_index_instance.delete_ref_doc(ref_doc_id, delete_from_docstore=True)

TypeError("object of type '_GenericAlias' has no len()")
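
A likely cause of this error (an assumption, since the full traceback is not shown) is that BaseIndex[IndexDict] is a typing alias, not an index object or a list of nodes, so any code that calls len() on the constructor's first argument fails. The sketch below reproduces that kind of failure with stand-in classes; FakeBaseIndex, FakeIndexStruct and count_nodes are hypothetical names and not LlamaIndex code.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class FakeBaseIndex(Generic[T]):
    """Stand-in for a generic base class; not a LlamaIndex class."""


class FakeIndexStruct:
    """Stand-in for an index-struct type; not a LlamaIndex class."""


def count_nodes(nodes) -> int:
    # Code that expects a sequence of nodes will typically call len() on it
    return len(nodes)


# Subscripting a Generic class yields a typing alias, not an instance
alias = FakeBaseIndex[FakeIndexStruct]

try:
    count_nodes(alias)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

In LlamaIndex, an index that was persisted to disk is normally reloaded through StorageContext.from_defaults(persist_dir=...) and load_index_from_storage(...) rather than by passing a type to the index constructor; the import path for these depends on the installed version. delete_ref_doc can then be called on the loaded index.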

@mdhuzaifapatel

mdhuzaifapatel commented Jan 3, 2025

@pavansandeep2910 there is not. It needs to be implemented.

Feel free to open a PR though

Hey, is it implemented now? Please let me know. I want to delete nodes, and I'm using a FAISS vector store.

Alternatively, please suggest any other vector store that supports deleting nodes or indices.

@cydcydcyd

@mdhuzaifapatel
The Milvus vector store supports deleting nodes.
