# LlamaCloudIndex + LlamaCloudRetriever

LlamaCloud is a new generation of managed parsing, ingestion, and retrieval services, designed to bring production-grade context augmentation to your LLM and RAG applications.

Currently, LlamaCloud supports:

- Managed Ingestion API, handling parsing and document management
- Managed Retrieval API, configuring optimal retrieval for your RAG system

## Access

We are opening up a private beta to a limited set of enterprise partners for the managed ingestion and retrieval API. If you're interested in centralizing your data pipelines and spending more time working on your actual RAG use cases, come [talk to us](https://www.llamaindex.ai/contact).

If you have access to LlamaCloud, you can visit [LlamaCloud](https://cloud.llamaindex.ai) to sign in and get an API key.

## Setup

First, make sure you have the latest LlamaIndex version installed.

**NOTE:** If you are upgrading from v0.9.X, we recommend following our [migration guide](https://pretty-sodium-5e0.notion.site/v0-10-0-Migration-Guide-6ede431dcb8841b09ea171e7f133bd77), as well as uninstalling your previous version first.

```
pip uninstall llama-index  # run this if upgrading from v0.9.x or older
pip install -U llama-index --no-cache-dir --force-reinstall
```

The `llama-index-indices-managed-llama-cloud` package is included with the install above, but you can also install it directly:

```
pip install -U llama-index-indices-managed-llama-cloud
```

## Usage

You can create an index on LlamaCloud using the following code:

```python
import os

# The API key can be provided via the environment or in the constructor later on.
os.environ["LLAMA_CLOUD_API_KEY"] = "llx-..."

from llama_index.core import SimpleDirectoryReader
from llama_index.indices.managed.llama_cloud import LlamaCloudIndex

# load the documents to ingest
documents = SimpleDirectoryReader("./data").load_data()

# create a new index
index = LlamaCloudIndex.from_documents(
    documents,
    "my_first_index",
    project_name="default",
    api_key="llx-...",
    verbose=True,
)

# connect to an existing index
index = LlamaCloudIndex("my_first_index", project_name="default")
```

You can also configure a retriever for managed retrieval:

```python
# from the existing index
retriever = index.as_retriever()

# from scratch
from llama_index.indices.managed.llama_cloud import LlamaCloudRetriever

retriever = LlamaCloudRetriever("my_first_index", project_name="default")
```

And of course, you can use the other index shortcuts to get more out of your new managed index:

```python
# llm is any LlamaIndex LLM instance
query_engine = index.as_query_engine(llm=llm)

chat_engine = index.as_chat_engine(llm=llm)
```

## Retriever Settings

A full list of retriever settings/kwargs is below:

- `dense_similarity_top_k`: `Optional[int]` -- if greater than 0, retrieve `k` nodes using dense retrieval
- `sparse_similarity_top_k`: `Optional[int]` -- if greater than 0, retrieve `k` nodes using sparse retrieval
- `enable_reranking`: `Optional[bool]` -- whether to enable reranking; sacrifices some speed for accuracy
- `rerank_top_n`: `Optional[int]` -- the number of nodes to return after reranking the initial retrieval results
- `alpha`: `Optional[float]` -- the weighting between dense and sparse retrieval; 1 = full dense retrieval, 0 = full sparse retrieval
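
The settings above can be passed as keyword arguments when constructing the retriever. A hedged sketch (the parameter values are illustrative, not recommendations, and the index name, project, and query string are placeholders):

```python
from llama_index.indices.managed.llama_cloud import LlamaCloudRetriever

# hybrid retrieval: dense and sparse candidates, fused via alpha, then reranked
retriever = LlamaCloudRetriever(
    "my_first_index",
    project_name="default",
    dense_similarity_top_k=5,
    sparse_similarity_top_k=5,
    alpha=0.5,  # equal weighting of dense and sparse scores
    enable_reranking=True,
    rerank_top_n=3,
)

nodes = retriever.retrieve("What does the document say about pricing?")
for node in nodes:
    print(node.score, node.get_content()[:100])
```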
# LlamaParse

LlamaParse is an API created by LlamaIndex to efficiently parse and represent files for retrieval and context augmentation using LlamaIndex frameworks.

LlamaParse directly integrates with [LlamaIndex](https://github.com/run-llama/llama_index).

Currently available for **free**. Try it out today!

**NOTE:** Currently, only PDF files are supported.

## Getting Started

First, log in and get an API key from [https://cloud.llamaindex.ai](https://cloud.llamaindex.ai).

Then, make sure you have the latest LlamaIndex version installed.

**NOTE:** If you are upgrading from v0.9.X, we recommend following our [migration guide](https://pretty-sodium-5e0.notion.site/v0-10-0-Migration-Guide-6ede431dcb8841b09ea171e7f133bd77), as well as uninstalling your previous version first.

```
pip uninstall llama-index  # run this if upgrading from v0.9.x or older
pip install -U llama-index --no-cache-dir --force-reinstall
```

Lastly, install the package:

`pip install llama-parse`

Now you can run the following to parse your first PDF file:

```python
import nest_asyncio

# allow nested event loops (needed when running inside a notebook)
nest_asyncio.apply()

from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    verbose=True,
)

# sync
documents = parser.load_data("./my_file.pdf")

# sync batch
documents = parser.load_data(["./my_file1.pdf", "./my_file2.pdf"])

# async
documents = await parser.aload_data("./my_file.pdf")

# async batch
documents = await parser.aload_data(["./my_file1.pdf", "./my_file2.pdf"])
```
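The return value is a list of LlamaIndex `Document` objects. A brief sketch of inspecting and reusing the parsed output (this assumes a parse like the one above has already run, and the downstream indexing step is one illustrative option, not the only one):

```python
# each Document carries the parsed content of a file in its .text attribute
for doc in documents:
    print(len(doc.text), "characters of parsed output")

# e.g. feed the parsed documents straight into a local vector index
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
```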

## Using with `SimpleDirectoryReader`

You can also integrate the parser as the default PDF loader in `SimpleDirectoryReader`:

```python
import nest_asyncio

# allow nested event loops (needed when running inside a notebook)
nest_asyncio.apply()

from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    verbose=True,
)

# map the .pdf extension to LlamaParse
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
    "./data", file_extractor=file_extractor
).load_data()
```

Full documentation for `SimpleDirectoryReader` can be found in the [LlamaIndex documentation](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader.html).

## Examples

Several end-to-end indexing examples can be found in the examples folder:

- [Getting Started](https://github.com/run-llama/llama_parse/blob/main/examples/demo_basic.ipynb)
- [Advanced RAG Example](https://github.com/run-llama/llama_parse/blob/main/examples/demo_advanced.ipynb)
- [Raw API Usage](https://github.com/run-llama/llama_parse/blob/main/examples/demo_api.ipynb)

## Terms of Service

See the [Terms of Service here](https://github.com/run-llama/llama_parse/blob/main/TOS.pdf).