Adding rerank as a retriever #331

Status: Open (2 commits to be merged into main)

@jpfcabral commented on Jan 16, 2025

This PR resolves #298

Added:

Some snippets:

  • Example 1 (from documents):
from langchain_core.documents import Document
from langchain_aws import BedrockRerank

# Initialize the class
reranker = BedrockRerank(top_n=5, aws_region="us-west-2")

# List of documents to rerank
documents = [
    Document(page_content="LangChain is a powerful library for LLMs."),
    Document(page_content="AWS Bedrock enables access to AI models."),
    Document(page_content="Artificial intelligence is transforming the world."),
]

# Query for reranking
query = "What is AWS Bedrock?"

# Call the rerank method
results = reranker.compress_documents(documents, query)

# Display the most relevant documents
for doc in results:
    print(f"Content: {doc.page_content}")
    print(f"Score: {doc.metadata['relevance_score']}")
  • Example 2 (with contextual compression retriever):
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain_aws import BedrockEmbeddings, BedrockRerank
# FAISS lives in langchain_community; the langchain.vectorstores path is deprecated
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

# Create a vector store using FAISS with Bedrock embeddings
documents = [
    Document(page_content="LangChain integrates LLM models."),
    Document(page_content="AWS Bedrock provides cloud-based AI models."),
    Document(page_content="Machine learning can be used for predictions."),
]
embeddings = BedrockEmbeddings()
vectorstore = FAISS.from_documents(documents, embeddings)

# Create the document compressor using BedrockRerank
reranker = BedrockRerank(top_n=2)

# Create the retriever with contextual compression
retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=vectorstore.as_retriever(),
)

# Execute a query
query = "How does AWS Bedrock work?"
# invoke() replaces the deprecated get_relevant_documents()
retrieved_docs = retriever.invoke(query)

# Display the most relevant documents
for doc in retrieved_docs:
    print(f"Content: {doc.page_content}")
    print(f"Score: {doc.metadata.get('relevance_score', 'N/A')}")
  • Example 3 (from list):
from langchain_aws import BedrockRerank

# Initialize BedrockRerank
reranker = BedrockRerank(top_n=3, aws_region="us-west-2")

# Unstructured documents
documents = [
    "LangChain is used to integrate LLM models.",
    "AWS Bedrock provides access to cloud-based models.",
    "Machine learning is revolutionizing the world.",
]

# Query
query = "What is the role of AWS Bedrock?"

# Rerank the documents
results = reranker.rerank(query=query, documents=documents)

# Display the results
for res in results:
    print(f"Index: {res['index']}, Score: {res['relevance_score']}")
    print(f"Document: {documents[res['index']]}")
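
Examples 1 and 3 return the same information in two shapes: scored documents versus index/score pairs. The sketch below shows one way the second shape could be mapped onto the first. It is not the PR's implementation: `Doc` is a hypothetical stand-in for `langchain_core.documents.Document`, `attach_scores` is an illustrative helper, and the scores are made up rather than real Bedrock output.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def attach_scores(docs: List[Doc], results: List[Dict[str, Any]], top_n: int) -> List[Doc]:
    """Map index/score pairs back onto documents, highest score first."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    scored = []
    for res in ranked[:top_n]:
        source = docs[res["index"]]
        # Build a new metadata dict so the input documents are not mutated.
        metadata = {**source.metadata, "relevance_score": res["relevance_score"]}
        scored.append(Doc(source.page_content, metadata))
    return scored


docs = [
    Doc("LangChain is used to integrate LLM models."),
    Doc("AWS Bedrock provides access to cloud-based models."),
    Doc("Machine learning is revolutionizing the world."),
]
# Made-up scores for illustration only.
results = [
    {"index": 0, "relevance_score": 0.12},
    {"index": 1, "relevance_score": 0.91},
    {"index": 2, "relevance_score": 0.05},
]

top = attach_scores(docs, results, top_n=2)
for doc in top:
    print(f"{doc.metadata['relevance_score']}: {doc.page_content}")
```

Copying the metadata keeps the caller's documents untouched, which matters when the same documents are reranked against several queries.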

@jpfcabral changed the title from "Adding rerank on langchain format" to "Fixes #298 Adding rerank on langchain format" on Jan 16, 2025
@jpfcabral changed the title from "Fixes #298 Adding rerank on langchain format" to "Closes #298 Adding rerank on langchain format" on Jan 16, 2025
@jpfcabral changed the title from "Closes #298 Adding rerank on langchain format" to "Adding rerank #298" on Jan 16, 2025
@jpfcabral changed the title from "Adding rerank #298" to "Adding rerank" on Jan 16, 2025
@jpfcabral changed the title from "Adding rerank" to "Adding rerank as a retriever" on Jan 16, 2025
@jpfcabral force-pushed the main branch 2 times, most recently from df8c35a to 70f4e2d, on January 16, 2025 19:52
@mgvalverde

Hi @jpfcabral, interesting contribution!

I noticed that the default region is set to aws_region="us-west-2". As I understand it, if a region is configured through a profile, leaving the region unspecified should pick up the profile's region. Instead, it always defaults to "us-west-2".
Would it make sense to change the default to aws_region=None?

@jpfcabral
Author

Fair point, @mgvalverde. I just changed it in bbc0243.
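
The fallback discussed above can be sketched as follows. This is an illustration of the idea, not the code in bbc0243: `resolve_region` is a hypothetical helper, and `session_region` stands for whatever the active profile reports (with boto3, that value could come from `boto3.Session().region_name`).

```python
from typing import Optional


def resolve_region(aws_region: Optional[str], session_region: Optional[str]) -> str:
    """Hypothetical helper: an explicit region wins, else the profile's region."""
    if aws_region is not None:
        return aws_region
    if session_region is not None:
        # Reached only when the caller left aws_region=None.
        return session_region
    raise ValueError("No region given and none configured in the profile.")


# Explicit argument overrides the profile.
print(resolve_region("us-west-2", "eu-west-1"))
# With aws_region=None, the profile's region is used.
print(resolve_region(None, "eu-west-1"))
```

With a hard-coded default such as "us-west-2", the second branch is never reached, which is the behavior the review comment points out.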

Development

Successfully merging this pull request may close these issues.

Support bedrock rerank API
2 participants