diff --git a/src/content/docs/concepts/intro-to-graphrag.md b/src/content/docs/concepts/intro-to-graphrag.md
index 01ac15d..4ca8ed6 100644
--- a/src/content/docs/concepts/intro-to-graphrag.md
+++ b/src/content/docs/concepts/intro-to-graphrag.md
@@ -116,4 +116,4 @@ In the above example, a vector similarity search is executed on the existing ind
 
 * [Neo4j GraphAcademy: Build a Neo4j-backed Chatbot using Python](https://graphacademy.neo4j.com/courses/llm-chatbot-python/)
 * [Integrating Neo4j into the LangChain ecosystem](https://towardsdatascience.com/integrating-neo4j-into-the-langchain-ecosystem-df0e988344d2)
-* [Neo4j GraphAcademy: Mastering Retrieval-Augmented Generation (RAG)]https://graphacademy.neo4j.com/courses/genai-workshop-graphrag/
\ No newline at end of file
+* [Neo4j GraphAcademy: Mastering Retrieval-Augmented Generation (RAG)](https://graphacademy.neo4j.com/courses/genai-workshop-graphrag/)
\ No newline at end of file
diff --git a/src/content/docs/reference/graphrag/basic-retriever.md b/src/content/docs/reference/graphrag/basic-retriever.md
index 50e06cb..0f6af71 100644
--- a/src/content/docs/reference/graphrag/basic-retriever.md
+++ b/src/content/docs/reference/graphrag/basic-retriever.md
@@ -6,21 +6,43 @@ tags: ["Basic"]
 
 ## Alternative Names
 
+- Vector Retriever
 - Naive Retriever
 - Baseline RAG
+- Basic RAG
 - Typical RAG
 
 ## Required Graph Shape
 
 [Lexical Graph](/reference/knowledge-graph/lexical-graph)
 
+![](../../../../assets/images/knowledge-graph-lexical-graph.svg)
+
+## Context
+
+It’s useful to chunk large documents into smaller pieces when creating embeddings.
+An embedding is a text’s semantic representation capturing the meaning of what the text is about.
+If the given text is long and contains too many diverse subjects, the informative value of its embedding deteriorates.
+
 ## Description
 
-The user question is embedded using the same embedder that has been used before to create the chunk embeddings. A vector similarity search is executed on the chunk embeddings to retrieve k (number previously configured by developer / user) most similar chunks.
+The user question is embedded using the same embedder that has been used before to create the chunk embeddings.
+A vector similarity search is executed on the chunk embeddings to retrieve k (number previously configured by developer / user) most similar chunks.
 
 ## Usage
 
-This pattern is useful if the user asks for specific information about a topic that exists in one or more (but not too many) chunks. The question should not require complex aggregations or knowledge about the whole dataset. Since the pattern only contains a vector similarity search it is easy to understand, implement and get started with.
+This pattern is useful if the user asks for specific information about a topic that exists in one or more (but not too many) chunks.
+The question should not require complex aggregations or knowledge about the whole dataset.
+Since the pattern only contains a vector similarity search it is easy to understand, implement and get started with.
+
+## Required pre-processing
+
+Split documents into chunks and use an embedding model to embed the text content of the chunks.
+See [chunking](/guides/chunking).
+
+## Retrieval Query
+
+No additional query is necessary since the Neo4j Vector retriever retrieves similar chunks by default.
 
 ## Further reading
 
@@ -29,6 +51,7 @@ This pattern is useful if the user asks for specific information about a topic t
 
 ## Existing Implementations
 
+- [Neo4j GraphRAG - Vector Retriever](https://neo4j.com/docs/neo4j-graphrag-python/current/user_guide_rag.html#vector-retriever)
 - [Langchain Retrievers: Vector store-backed retriever](https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/vectorstore/)
 - [Langchain: Neo4jVector](https://python.langchain.com/v0.2/docs/integrations/vectorstores/neo4jvector/)
 