Updated basic retriever
jexp committed Nov 27, 2024
1 parent b01e837 commit 228fe97
Showing 2 changed files with 26 additions and 3 deletions.
2 changes: 1 addition & 1 deletion src/content/docs/concepts/intro-to-graphrag.md
@@ -116,4 +116,4 @@ In the above example, a vector similarity search is executed on the existing ind

* [Neo4j GraphAcademy: Build a Neo4j-backed Chatbot using Python](https://graphacademy.neo4j.com/courses/llm-chatbot-python/)
* [Integrating Neo4j into the LangChain ecosystem](https://towardsdatascience.com/integrating-neo4j-into-the-langchain-ecosystem-df0e988344d2)
-* [Neo4j GraphAcademy: Mastering Retrieval-Augmented Generation (RAG)]https://graphacademy.neo4j.com/courses/genai-workshop-graphrag/
+* [Neo4j GraphAcademy: Mastering Retrieval-Augmented Generation (RAG)](https://graphacademy.neo4j.com/courses/genai-workshop-graphrag/)
27 changes: 25 additions & 2 deletions src/content/docs/reference/graphrag/basic-retriever.md
@@ -6,21 +6,43 @@ tags: ["Basic"]

## Alternative Names

- Vector Retriever
- Naive Retriever
- Baseline RAG
- Basic RAG
- Typical RAG

## Required Graph Shape

[Lexical Graph](/reference/knowledge-graph/lexical-graph)

![](../../../../assets/images/knowledge-graph-lexical-graph.svg)

## Context

It’s useful to chunk large documents into smaller pieces when creating embeddings.
An embedding is a text’s semantic representation capturing the meaning of what the text is about.
If the given text is long and contains too many diverse subjects, the informative value of its embedding deteriorates.

## Description

-The user question is embedded using the same embedder that has been used before to create the chunk embeddings. A vector similarity search is executed on the chunk embeddings to retrieve k (number previously configured by developer / user) most similar chunks.
+The user question is embedded using the same embedder that has been used before to create the chunk embeddings.
+A vector similarity search is executed on the chunk embeddings to retrieve k (number previously configured by developer / user) most similar chunks.

## Usage

-This pattern is useful if the user asks for specific information about a topic that exists in one or more (but not too many) chunks. The question should not require complex aggregations or knowledge about the whole dataset. Since the pattern only contains a vector similarity search it is easy to understand, implement and get started with.
+This pattern is useful if the user asks for specific information about a topic that exists in one or more (but not too many) chunks.
+The question should not require complex aggregations or knowledge about the whole dataset.
+Since the pattern only contains a vector similarity search it is easy to understand, implement and get started with.

## Required pre-processing

Split documents into chunks and use an embedding model to embed the text content of the chunks.
See [chunking](/guides/chunking).

## Retrieval Query

No additional query is necessary since the Neo4j Vector retriever retrieves similar chunks by default.

## Further reading

@@ -29,6 +51,7 @@ This pattern is useful if the user asks for specific information about a topic t

## Existing Implementations

- [Neo4j GraphRAG - Vector Retriever](https://neo4j.com/docs/neo4j-graphrag-python/current/user_guide_rag.html#vector-retriever)
- [Langchain Retrievers: Vector store-backed retriever](https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/vectorstore/)
- [Langchain: Neo4jVector](https://python.langchain.com/v0.2/docs/integrations/vectorstores/neo4jvector/)
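
For illustration only, the retrieval step that the Description section of basic-retriever.md outlines (embed the question with the same embedder as the chunks, then return the k most similar chunks) might be sketched as below. This is not the Neo4j or LangChain implementation: `embed`, `cosine`, and `retrieve` are hypothetical names, the bag-of-words "embedder" is a toy stand-in for a real embedding model, and the in-memory sort stands in for a vector index.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedder (assumption): a bag-of-words count vector, not a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Embed the question with the same embedder as the chunks; return the k most similar chunks."""
    q = embed(question)
    # In a real system the chunk embeddings are computed once and stored in a vector index.
    chunk_vectors = [(chunk, embed(chunk)) for chunk in chunks]
    ranked = sorted(chunk_vectors, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


chunks = [
    "Neo4j stores data as nodes and relationships.",
    "Chunking splits large documents into smaller pieces.",
    "Embeddings capture the meaning of a text.",
]
top = retrieve("How are large documents split into pieces?", chunks, k=1)
print(top)
```

The value of `k` is the developer- or user-configured number of chunks mentioned in the Description; the sketch performs no aggregation, which matches the Usage note that this pattern suits questions answerable from a few chunks.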