Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: add langchain dappier retriever integration notebooks #28931

Merged
merged 13 commits into from
Jan 3, 2025
Merged
50 changes: 50 additions & 0 deletions docs/docs/integrations/providers/dappier.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Dappier\n",
"\n",
"> [Dappier](https://dappier.com) connects any LLM or your Agentic AI to real-time, rights-cleared, proprietary data from trusted sources, making your AI an expert in anything. Our specialized models include Real-Time Web Search, News, Sports, Financial Stock Market Data, Crypto Data, and exclusive content from premium publishers. Explore a wide range of data models in our marketplace at [marketplace.dappier.com](https://marketplace.dappier.com).\n",
"\n",
"> [Dappier](https://dappier.com) delivers enriched, prompt-ready, and contextually relevant data strings, optimized for seamless integration with LangChain. Whether you're building conversational AI, recommendation engines, or intelligent search, Dappier's LLM-agnostic RAG models ensure your AI has access to verified, up-to-date data—without the complexity of building and managing your own retrieval pipeline."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "y8ku6X96sebl"
},
"outputs": [],
"source": [
"from langchain_dappier import DappierRetriever"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
318 changes: 318 additions & 0 deletions docs/docs/integrations/retrievers/dappier.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,318 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e6e1e5d5",
"metadata": {
"id": "e6e1e5d5"
},
"source": [
"# Dappier\n",
"\n",
"> [Dappier](https://dappier.com) connects any LLM or your Agentic AI to real-time, rights-cleared, proprietary data from trusted sources, making your AI an expert in anything. Our specialized models include Real-Time Web Search, News, Sports, Financial Stock Market Data, Crypto Data, and exclusive content from premium publishers. Explore a wide range of data models in our marketplace at [marketplace.dappier.com](https://marketplace.dappier.com).\n",
"\n",
"> [Dappier](https://dappier.com) delivers enriched, prompt-ready, and contextually relevant data strings, optimized for seamless integration with LangChain. Whether you're building conversational AI, recommendation engines, or intelligent search, Dappier's LLM-agnostic RAG models ensure your AI has access to verified, up-to-date data—without the complexity of building and managing your own retrieval pipeline."
]
},
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {
"id": "e49f1e0d"
},
"source": [
"# DappierRetriever\n",
"\n",
"This will help you getting started with the Dappier [retriever](https://python.langchain.com/docs/concepts/retrievers/). For detailed documentation of all DappierRetriever features and configurations head to the [API reference](https://python.langchain.com/en/latest/retrievers/langchain_dappier.retrievers.Dappier.DappierRetriever.html).\n",
"\n",
"### Integration details\n",
"\n",
"Bring-your-own data (i.e., index and search a custom corpus of documents):\n",
"\n",
"| Retriever | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: |\n",
"[DappierRetriever](https://python.langchain.com/en/latest/retrievers/langchain_dappier.retrievers.Dappier.DappierRetriever.html) | ❌ | ❌ | langchain-dappier |\n",
"\n",
"### Setup\n",
"\n",
"Install ``langchain-dappier`` and set environment variable\n",
" ``DAPPIER_API_KEY``.\n",
"\n",
" pip install -U langchain-dappier\n",
" export DAPPIER_API_KEY=\"your-api-key\"\n",
"\n",
"We also need to set our Dappier API credentials, which can be generated at the [Dappier site.](https://platform.dappier.com/profile/api-keys).\n",
"\n",
"We can find the supported data models by heading over to the [Dappier marketplace.](https://platform.dappier.com/marketplace)"
]
},
{
"cell_type": "markdown",
"id": "72ee0c4b-9764-423a-9dbf-95129e185210",
"metadata": {
"id": "72ee0c4b-9764-423a-9dbf-95129e185210"
},
"source": [
"If you want to get automated tracing from individual queries, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a15d341e-3e26-4ca3-830b-5aab30ed66de",
"metadata": {
"id": "a15d341e-3e26-4ca3-830b-5aab30ed66de"
},
"outputs": [],
"source": [
"# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n",
"# os.environ[\"LANGSMITH_TRACING\"] = \"true\""
]
},
{
"cell_type": "markdown",
"id": "0730d6a1-c893-4840-9817-5e5251676d5d",
"metadata": {
"id": "0730d6a1-c893-4840-9817-5e5251676d5d"
},
"source": [
"### Installation\n",
"\n",
"This retriever lives in the `langchain-dappier` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
"outputId": "d1bb87dd-860d-4255-d5a8-b3f42da1d76e"
},
"outputs": [],
"source": [
"%pip install -qU langchain-dappier"
]
},
{
"cell_type": "markdown",
"id": "a38cde65-254d-4219-a441-068766c0d4b5",
"metadata": {
"id": "a38cde65-254d-4219-a441-068766c0d4b5"
},
"source": [
"## Instantiation\n",
"\n",
"- data_model_id: str\n",
" Data model ID, starting with dm_.\n",
" You can find the available data model IDs at:\n",
" [Dappier marketplace.](https://platform.dappier.com/marketplace)\n",
"- k: int\n",
" Number of documents to return.\n",
"- ref: Optional[str]\n",
" Site domain where AI recommendations are displayed.\n",
"- num_articles_ref: int\n",
" Minimum number of articles from the ref domain specified.\n",
" The rest will come from other sites within the RAG model.\n",
"- search_algorithm: Literal[\n",
" \"most_recent\",\n",
" \"most_recent_semantic\",\n",
" \"semantic\",\n",
" \"trending\"\n",
"]\n",
" Search algorithm for retrieving articles.\n",
"- api_key: Optional[str]\n",
" The API key used to interact with the Dappier APIs."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "70cc8e65-2a02-408a-bbc6-8ef649057d82",
"metadata": {
"id": "70cc8e65-2a02-408a-bbc6-8ef649057d82"
},
"outputs": [],
"source": [
"from langchain_dappier import DappierRetriever\n",
"\n",
"retriever = DappierRetriever(data_model_id=\"dm_01jagy9nqaeer9hxx8z1sk1jx6\")"
]
},
{
"cell_type": "markdown",
"id": "5c5f2839-4020-424e-9fc9-07777eede442",
"metadata": {
"id": "5c5f2839-4020-424e-9fc9-07777eede442"
},
"source": [
"## Usage"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "51a60dbe-9f2e-4e04-bb62-23968f17164a",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "51a60dbe-9f2e-4e04-bb62-23968f17164a",
"outputId": "f85bcf8e-4b51-4f82-8e48-582a9643fa0a"
},
"outputs": [
{
"data": {
"text/plain": [
"[Document(metadata={'title': 'Man shot and killed on Wells Street near downtown Fort Wayne', 'author': 'Gregg Montgomery', 'source_url': 'https://www.wishtv.com/news/indiana-news/man-shot-dies-fort-wayne-december-25-2024/', 'image_url': 'https://images.dappier.com/dm_01jagy9nqaeer9hxx8z1sk1jx6/fort-wayne-police-department-vehicle-via-Flickr_.jpg?width=428&height=321', 'pubdata': 'Thu, 26 Dec 2024 01:00:33 +0000'}, page_content='A man was shot and killed on December 25, 2024, in Fort Wayne, Indiana, near West Fourth and Wells streets. Police arrived shortly after 6:30 p.m. following reports of gunfire and found the victim in the 1600 block of Wells Street, where he was pronounced dead. The area features a mix of businesses, including a daycare and restaurants.\\n\\nAs of the latest updates, police have not provided details on the safety of the area, potential suspects, or the motive for the shooting. Authorities are encouraging anyone with information to reach out to the Fort Wayne Police Department or Crime Stoppers.'),\n",
" Document(metadata={'title': 'House cat dies from bird flu in pet food, prompting recall', 'author': 'Associated Press', 'source_url': 'https://www.wishtv.com/news/business/house-cat-bird-flu-pet-food-recall/', 'image_url': 'https://images.dappier.com/dm_01jagy9nqaeer9hxx8z1sk1jx6/BACKGROUND-Northwest-Naturals-cat-food_.jpg?width=428&height=321', 'pubdata': 'Wed, 25 Dec 2024 23:12:41 +0000'}, page_content='An Oregon house cat has died after eating pet food contaminated with the H5N1 bird flu virus, prompting a nationwide recall of Northwest Naturals\\' 2-pound Feline Turkey Recipe raw frozen pet food. The Oregon Department of Agriculture confirmed that the strictly indoor cat contracted the virus solely from the food, which has \"best if used by\" dates of May 21, 2026, and June 23, 2026. \\n\\nThe affected product was distributed across several states, including Arizona, California, and Florida, as well as British Columbia, Canada. Consumers are urged to dispose of the recalled food and seek refunds. This incident raises concerns about the spread of bird flu and its potential impact on domestic animals, particularly as California has declared a state of emergency due to the outbreak affecting various bird species.'),\n",
" Document(metadata={'title': '20 big cats die from bird flu at Washington sanctuary', 'author': 'Nic F. Anderson, CNN', 'source_url': 'https://www.wishtv.com/news/national/bird-flu-outbreak-wild-felid-center-2024/', 'image_url': 'https://images.dappier.com/dm_01jagy9nqaeer9hxx8z1sk1jx6/BACKGROUND-Amur-Bengal-tiger-at-Wild-Felid-Advocacy-Center-of-Washington-FB-post_.jpg?width=428&height=321', 'pubdata': 'Wed, 25 Dec 2024 23:04:34 +0000'}, page_content='The Wild Felid Advocacy Center in Washington state has experienced a devastating bird flu outbreak, resulting in the deaths of 20 big cats, over half of its population. The first death was reported around Thanksgiving, affecting various species, including cougars and a tiger mix. The sanctuary is currently under quarantine, closed to the public, and working with animal health officials to disinfect enclosures and implement prevention strategies.\\n\\nAs the situation unfolds, the Washington Department of Fish and Wildlife has noted an increase in bird flu cases statewide, including infections in cougars. While human infections from bird flu through contact with mammals are rare, the CDC acknowledges the potential risk. The sanctuary hopes to reopen in the new year, focusing on the recovery of the remaining animals and taking measures to prevent further outbreaks, marking an unprecedented challenge in its 20-year history.')]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"latest tech news\"\n",
"\n",
"retriever.invoke(query)"
]
},
{
"cell_type": "markdown",
"id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e",
"metadata": {
"id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e"
},
"source": [
"## Use within a chain\n",
"\n",
"Like other retrievers, DappierRetriever can be incorporated into LLM applications via [chains](/docs/how_to/sequence/).\n",
"\n",
"We will need a LLM or chat model:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "25b647a3-f8f2-4541-a289-7a241e43f9df",
"metadata": {
"id": "25b647a3-f8f2-4541-a289-7a241e43f9df"
},
"outputs": [],
"source": [
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "23e11cc9-abd6-4855-a7eb-799f45ca01ae",
"metadata": {
"id": "23e11cc9-abd6-4855-a7eb-799f45ca01ae"
},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"\n",
"prompt = ChatPromptTemplate.from_template(\n",
" \"\"\"Answer the question based only on the context provided.\n",
"\n",
"Context: {context}\n",
"\n",
"Question: {question}\"\"\"\n",
")\n",
"\n",
"\n",
"def format_docs(docs):\n",
" return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
"\n",
"\n",
"chain = (\n",
" {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
" | prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d47c37dd-5c11-416c-a3b6-bec413cd70e8",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 143
},
"id": "d47c37dd-5c11-416c-a3b6-bec413cd70e8",
"outputId": "f23f18a9-d138-4684-cb5b-b92e0895b5f2"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"\"The key highlights and outcomes from the latest events covered in the article include:\\n\\n1. An Israeli airstrike in Gaza killed five journalists from Al-Quds Today Television, leading to condemnation from their outlet and raising concerns about violence against media professionals in the region.\\n2. The Committee to Protect Journalists reported that since October 7, 2023, at least 141 journalists have been killed in the region, marking the deadliest period for journalists since 1992, with the majority being Palestinians in Gaza.\\n3. A man was shot and killed in Fort Wayne, Indiana, with police not providing details on suspects, motive, or the safety of the area.\\n4. An Oregon house cat died after eating pet food contaminated with the H5N1 bird flu virus, leading to a nationwide recall of Northwest Naturals' Feline Turkey Recipe raw frozen pet food and raising concerns about the spread of bird flu among domestic animals.\""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(\n",
" \"What are the key highlights and outcomes from the latest events covered in the article?\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3"
},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all DappierRetriever features and configurations head to the [API reference](https://python.langchain.com/en/latest/retrievers/langchain_dappier.retrievers.Dappier.DappierRetriever.html)."
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
4 changes: 4 additions & 0 deletions libs/packages.yml
Original file line number Diff line number Diff line change
Expand Up @@ -308,3 +308,7 @@ packages:
repo: crate/langchain-cratedb
downloads: 362
downloads_updated_at: '2024-12-23T20:53:27.001852+00:00'
- name: langchain-dappier
path: .
repo: DappierAI/langchain-dappier
downloads: 0
Loading