diff --git a/README.md b/README.md index 49742a0b2..be8f0a4cf 100644 --- a/README.md +++ b/README.md @@ -48,6 +48,7 @@ This README provides detailed information on how to set up, develop, and deploy - [Supabase](#supabase) - [Postgres](#postgres) - [AnalyticDB](#analyticdb) + - [DashVector](#dashvector) - [Running the API Locally](#running-the-api-locally) - [Testing a Localhost Plugin in ChatGPT](#testing-a-localhost-plugin-in-chatgpt) - [Personalization](#personalization) @@ -184,6 +185,10 @@ Follow these steps to quickly set up and run the ChatGPT Retrieval Plugin: export ELASTICSEARCH_INDEX= export ELASTICSEARCH_REPLICAS= export ELASTICSEARCH_SHARDS= + + # DashVector + export DASHVECTOR_API_KEY= + export DASHVECTOR_COLLECTION= ``` 10. Run the API locally: `poetry run start` @@ -295,11 +300,11 @@ poetry install The API requires the following environment variables to work: -| Name | Required | Description | -| ---------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `DATASTORE` | Yes | This specifies the vector database provider you want to use to store and query embeddings. You can choose from `elasticsearch`, `chroma`, `pinecone`, `weaviate`, `zilliz`, `milvus`, `qdrant`, `redis`, `azuresearch`, `supabase`, `postgres`, `analyticdb`. | -| `BEARER_TOKEN` | Yes | This is a secret token that you need to authenticate your requests to the API. You can generate one using any tool or method you prefer, such as [jwt.io](https://jwt.io/). | -| `OPENAI_API_KEY` | Yes | This is your OpenAI API key that you need to generate embeddings using the `text-embedding-ada-002` model. You can get an API key by creating an account on [OpenAI](https://openai.com/). 
| +| Name | Required | Description | +| ---------------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `DATASTORE` | Yes | This specifies the vector database provider you want to use to store and query embeddings. You can choose from `elasticsearch`, `chroma`, `pinecone`, `weaviate`, `zilliz`, `milvus`, `qdrant`, `redis`, `azuresearch`, `supabase`, `postgres`, `analyticdb`, `dashvector`. | +| `BEARER_TOKEN` | Yes | This is a secret token that you need to authenticate your requests to the API. You can generate one using any tool or method you prefer, such as [jwt.io](https://jwt.io/). | +| `OPENAI_API_KEY` | Yes | This is your OpenAI API key that you need to generate embeddings using the `text-embedding-ada-002` model. You can get an API key by creating an account on [OpenAI](https://openai.com/). | ### Using the plugin with Azure OpenAI @@ -377,6 +382,10 @@ For detailed setup instructions, refer to [`/docs/providers/llama/setup.md`](/do [Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html) currently supports storing vectors through the `dense_vector` field type and uses them to calculate document scores. Elasticsearch 8.0 builds on this functionality to support fast, approximate nearest neighbor search (ANN). This represents a much more scalable approach, allowing vector search to run efficiently on large datasets. For detailed setup instructions, refer to [`/docs/providers/elasticsearch/setup.md`](/docs/providers/elasticsearch/setup.md). +#### DashVector + +[DashVector](https://help.aliyun.com/document_detail/2510225.html) is a fully managed vector database service that supports high-dimensional dense and sparse vectors, real-time insertion, and filtered search.
It is built to scale automatically and can adapt to different application requirements. For detailed setup instructions, refer to [`/docs/providers/dashvector/setup.md`](/docs/providers/dashvector/setup.md). + ### Running the API locally To run the API locally, you first need to set the requisite environment variables with the `export` command: @@ -559,7 +568,7 @@ feature/advanced-chunking-strategy-123 While the ChatGPT Retrieval Plugin is designed to provide a flexible solution for semantic search and retrieval, it does have some limitations: -- **Keyword search limitations**: The embeddings generated by the `text-embedding-ada-002` model may not always be effective at capturing exact keyword matches. As a result, the plugin might not return the most relevant results for queries that rely heavily on specific keywords. Some vector databases, like Elasticsearch, Pinecone, Weaviate and Azure Cognitive Search, use hybrid search and might perform better for keyword searches. +- **Keyword search limitations**: The embeddings generated by the `text-embedding-ada-002` model may not always be effective at capturing exact keyword matches. As a result, the plugin might not return the most relevant results for queries that rely heavily on specific keywords. Some vector databases, like DashVector, Elasticsearch, Pinecone, Weaviate and Azure Cognitive Search, use hybrid search and might perform better for keyword searches. - **Sensitive data handling**: The plugin does not automatically detect or filter sensitive data. It is the responsibility of the developers to ensure that they have the necessary authorization to include content in the Retrieval Plugin and that the content complies with data privacy requirements. - **Scalability**: The performance of the plugin may vary depending on the chosen vector database provider and the size of the dataset. Some providers may offer better scalability and performance than others. 
- **Language support**: The plugin currently uses OpenAI's `text-embedding-ada-002` model, which is optimized for use in English. However, it is still robust enough to generate good results for a variety of languages. @@ -613,3 +622,7 @@ We would like to extend our gratitude to the following contributors for their co - [mmmaia](https://github.com/mmmaia) - [Elasticsearch](https://www.elastic.co/) - [joemcelroy](https://github.com/joemcelroy) +- [DashVector](https://help.aliyun.com/document_detail/2510225.html) + - [yingdachen](http://github.com/yingdachen) + - [yurnom](https://github.com/yurnom) + - [xiaoyuxee](https://github.com/xiaoyuxee) diff --git a/datastore/factory.py b/datastore/factory.py index 41abb60eb..f8d5ba6e7 100644 --- a/datastore/factory.py +++ b/datastore/factory.py @@ -66,8 +66,12 @@ async def get_datastore() -> DataStore: ) return ElasticsearchDataStore() + case "dashvector": + from datastore.providers.dashvector_datastore import DashVectorDataStore + + return DashVectorDataStore() case _: raise ValueError( f"Unsupported vector database: {datastore}. 
" - f"Try one of the following: llama, elasticsearch, pinecone, weaviate, milvus, zilliz, redis, azuresearch, or qdrant" + f"Try one of the following: llama, elasticsearch, pinecone, weaviate, milvus, zilliz, redis, azuresearch, qdrant, or dashvector" ) diff --git a/datastore/providers/dashvector_datastore.py b/datastore/providers/dashvector_datastore.py new file mode 100644 index 000000000..e999f86c8 --- /dev/null +++ b/datastore/providers/dashvector_datastore.py @@ -0,0 +1,267 @@ +import os +from typing import Any, Dict, List, Optional + +import dashvector +from dashvector import Client, Doc + +from tenacity import retry, wait_random_exponential, stop_after_attempt +import asyncio +from loguru import logger + +from datastore.datastore import DataStore +from models.models import ( + DocumentChunk, + DocumentChunkMetadata, + DocumentChunkWithScore, + DocumentMetadataFilter, + QueryResult, + QueryWithEmbedding, + Source, +) +from services.date import to_unix_timestamp + +# Read environment variables for DashVector configuration +DASHVECTOR_API_KEY = os.environ.get("DASHVECTOR_API_KEY") +DASHVECTOR_COLLECTION = os.environ.get("DASHVECTOR_COLLECTION") +assert DASHVECTOR_API_KEY is not None +assert DASHVECTOR_COLLECTION is not None + +# Set the batch size for vector upsert to DashVector +UPSERT_BATCH_SIZE = 100 + +# Set the dimension for embedding +VECTOR_DIMENSION = 1536 + + +class DashVectorDataStore(DataStore): + def __init__(self): + # Init dashvector client + client = Client(api_key=DASHVECTOR_API_KEY) + self._client = client + + # Get the collection in DashVector + collection = client.get(DASHVECTOR_COLLECTION) + + # Check if the collection exists in DashVector + if collection: + logger.info(f"Connected existed collection {DASHVECTOR_COLLECTION}.") + self._collection = collection + else: + self._create_collection() + + @retry(wait=wait_random_exponential(min=1, max=20), + stop=stop_after_attempt(3)) + async def _upsert(self, chunks: Dict[str, 
List[DocumentChunk]]) -> List[str]: + """ + Takes in a dict from document id to list of document chunks and inserts them into the collection. + Return a list of document ids. + """ + # Initialize a list of ids to return + doc_ids: List[str] = [] + # Initialize a list of vectors to upsert + docs = [] + # Loop through the dict items + for doc_id, chunk_list in chunks.items(): + # Append the id to the ids list + doc_ids.append(doc_id) + logger.info(f"Upserting document_id: {doc_id}") + for chunk in chunk_list: + fields = self._get_dashvector_fields(chunk.metadata) + # Add the text to the fields + fields["text"] = chunk.text + docs.append( + Doc(id=chunk.id, vector=chunk.embedding, fields=fields) + ) + + # Split the vectors list into batches of the specified size + batches = [ + docs[i: i + UPSERT_BATCH_SIZE] + for i in range(0, len(docs), UPSERT_BATCH_SIZE) + ] + + # Upsert each batch to DashVector + for batch in batches: + logger.info(f"Upserting batch of size {len(batch)}") + resp = self._collection.upsert(docs=batch) + if resp: + logger.info("Upserted batch successfully") + else: + raise Exception(f"Failed to upsert batch, error: {resp}") + + return doc_ids + + @retry(wait=wait_random_exponential(min=1, max=20), + stop=stop_after_attempt(3)) + async def _query( + self, + queries: List[QueryWithEmbedding], + ) -> List[QueryResult]: + """ + Takes in a list of queries with embeddings and filters and returns a list of query results with matching document chunks and scores. 
+ """ + + # Define a helper coroutine that performs a single query and returns a QueryResult + async def _single_query(query: QueryWithEmbedding) -> QueryResult: + logger.debug(f"Query: {query.query}") + + # Convert the metadata filter object to a dict with dashvector filter expressions + dashvector_filter = self._get_dashvector_filter(query.filter) + + resp = self._collection.query(vector=query.embedding, + topk=query.top_k, + filter=dashvector_filter) + if not resp: + raise Exception(f"Error querying in collection: {resp}") + + query_results: List[DocumentChunkWithScore] = [] + for doc in resp: + score = doc.score + metadata = doc.fields + text = metadata.pop("text") + + # Create a document chunk with score object with the result data + result = DocumentChunkWithScore( + id=doc.id, + score=score, + text=text, + metadata=metadata, + ) + query_results.append(result) + return QueryResult(query=query.query, results=query_results) + + # Use asyncio.gather to run multiple _single_query coroutines concurrently and collect their results + results: List[QueryResult] = await asyncio.gather( + *[_single_query(query) for query in queries] + ) + + return results + + @retry(wait=wait_random_exponential(min=1, max=20), + stop=stop_after_attempt(3)) + async def delete( + self, + ids: Optional[List[str]] = None, + filter: Optional[DocumentMetadataFilter] = None, + delete_all: Optional[bool] = None, + ) -> bool: + """ + Removes vectors by ids, filter, or everything from the collection. 
+ """ + + # Delete all vectors from the collection if delete_all is True + if delete_all: + logger.info("Deleting all vectors from collection") + resp = self._collection.delete(delete_all=True) + if not resp: + raise Exception( + f"Error deleting all vectors, error: {resp.message}" + ) + logger.info("Deleted all vectors successfully") + return True + + # Delete vectors by filter + if filter: + # Query the docs by filter (at most 1024 matches are returned, so at most 1024 are deleted per call) + resp = self._collection.query(topk=1024, filter=self._get_dashvector_filter(filter)) + if not resp: + raise Exception( + f"Error querying vectors with filter, error: {resp.message}" + ) + if ids is not None: + # Build a new list so the caller's ids list is not mutated + ids = ids + [doc.id for doc in resp] + else: + ids = [doc.id for doc in resp] + + # Delete vectors that match the document ids from the collection if the ids list is not empty + if ids is not None and len(ids) > 0: + logger.info(f"Deleting vectors with ids {ids}") + resp = self._collection.delete(ids) + if not resp: + raise Exception( + f"Error deleting vectors with ids, error: {resp.message}" + ) + logger.info("Deleted vectors with ids successfully") + return True + + def _get_dashvector_filter( + self, filter: Optional[DocumentMetadataFilter] = None + ) -> Optional[str]: + if filter is None: + return None + + dashvector_filter = [] + for field, value in filter.dict().items(): + if value is not None: + if field == "start_date": + dashvector_filter.append(f"created_at >= {to_unix_timestamp(value)}") + elif field == "end_date": + dashvector_filter.append(f"created_at <= {to_unix_timestamp(value)}") + else: + if isinstance(value, str): + dashvector_filter.append(f"{field} = '{value}'") + else: + dashvector_filter.append(f"{field} = {value}") + + return " and ".join(dashvector_filter) + + def _get_dashvector_fields( + self, metadata: Optional[DocumentChunkMetadata] = None + ) -> Dict[str, Any]: + dashvector_fields = {} + # For each field in the metadata, check if it has a value and add it to the DashVector fields + for field, value in
metadata.dict().items(): + if value is not None: + if field == "created_at": + dashvector_fields[field] = to_unix_timestamp(value) + elif field == "source": + dashvector_fields[field] = value.name + else: + dashvector_fields[field] = value + return dashvector_fields + + def _delete_collection(self) -> None: + resp = self._client.delete(DASHVECTOR_COLLECTION) + if not resp: + raise Exception( + f"Error deleting collection, error: {resp.message}" + ) + + def _create_collection(self) -> None: + """ + Create the DashVector collection for vector management. + """ + + # Map every metadata field except created_at to the str type for the collection schema + fields_schema = { + field: str for field in DocumentChunkMetadata.__fields__.keys() + if field != "created_at" + } + # created_at is stored as an int (Unix timestamp) so date filters can compare it + fields_schema["created_at"] = int + + logger.info( + f"Creating collection {DASHVECTOR_COLLECTION} with metadata config {fields_schema}." + ) + + # Create new collection + resp = self._client.create( + DASHVECTOR_COLLECTION, + dimension=VECTOR_DIMENSION, + fields_schema=fields_schema + ) + if not resp: + raise Exception( + f"Failed to create collection {DASHVECTOR_COLLECTION}. " + f"Error: {resp.message}" + ) + + # Keep a handle to the newly created collection + collection = self._client.get(DASHVECTOR_COLLECTION) + if not collection: + raise Exception( + f"Failed to get collection {DASHVECTOR_COLLECTION}. 
" + f"Error: {collection}" + ) + self._collection = collection + logger.info( + f"Collection {DASHVECTOR_COLLECTION} created successfully.") \ No newline at end of file diff --git a/docs/deployment/removing-unused-dependencies.md b/docs/deployment/removing-unused-dependencies.md index 44a56c630..321aac929 100644 --- a/docs/deployment/removing-unused-dependencies.md +++ b/docs/deployment/removing-unused-dependencies.md @@ -4,17 +4,18 @@ Before deploying your app, you might want to remove unused dependencies from you Here are the packages you can remove for each vector database provider: -- **Pinecone:** Remove `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. -- **Weaviate:** Remove `pinecone-client`, `pymilvus`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity` and `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`. -- **Zilliz:** Remove `pinecone-client`, `weaviate-client`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity` and `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. -- **Milvus:** Remove `pinecone-client`, `weaviate-client`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity` and `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. -- **Qdrant:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `redis`, `chromadb`, `llama-index`, `azure-identity` and `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. -- **Redis:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `chromadb`, `llama-index`, `azure-identity` and `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. 
-- **LlamaIndex:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `chromadb`, `redis`, `azure-identity` and `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. -- **Chroma:**: Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `llama-index`, `redis`, `azure-identity` and `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. -- **Azure Cognitive Search**: Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `llama-index`, `redis` and `chromadb`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. -- **Supabase:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `llama-index`, `azure-identity` and `azure-search-documents`, `psycopg2`+`pgvector`, and `psycopg2cffi`. -- **Postgres:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `llama-index`, `azure-identity` and `azure-search-documents`, `supabase`, and `psycopg2cffi`. -- **AnalyticDB:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `llama-index`, `azure-identity` and `azure-search-documents`, `supabase`, and `psycopg2`+`pgvector`. +- **Pinecone:** Remove `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **Weaviate:** Remove `pinecone-client`, `pymilvus`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **Zilliz:** Remove `pinecone-client`, `weaviate-client`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. 
+- **Milvus:** Remove `pinecone-client`, `weaviate-client`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **Qdrant:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `redis`, `chromadb`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **Redis:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `chromadb`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **LlamaIndex:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `chromadb`, `redis`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **Chroma:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `llama-index`, `redis`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **Azure Cognitive Search:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `llama-index`, `redis`, `chromadb`, `supabase`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **Supabase:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `llama-index`, `azure-identity`, `azure-search-documents`, `psycopg2`+`pgvector`, `psycopg2cffi`, and `dashvector`. +- **Postgres:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2cffi`, and `dashvector`. +- **AnalyticDB:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `dashvector`. 
+- **DashVector:** Remove `pinecone-client`, `weaviate-client`, `pymilvus`, `qdrant-client`, `redis`, `chromadb`, `llama-index`, `azure-identity`, `azure-search-documents`, `supabase`, `psycopg2`+`pgvector`, and `psycopg2cffi`. After removing the unnecessary packages from the `pyproject.toml` file, you don't need to run `poetry lock` and `poetry install` manually. The provided Dockerfile takes care of installing the required dependencies using the `requirements.txt` file generated by the `poetry export` command. diff --git a/docs/providers/dashvector/setup.md b/docs/providers/dashvector/setup.md new file mode 100644 index 000000000..26e32763b --- /dev/null +++ b/docs/providers/dashvector/setup.md @@ -0,0 +1,24 @@ +# DashVector + +[DashVector](https://help.aliyun.com/document_detail/2510225.html) is a fully managed vector database service that supports high-dimensional dense and sparse vectors, real-time insertion, and filtered search. It is built to scale automatically and can adapt to different application requirements. + +- To use DashVector as your vector database provider, create an API key in the [DashVector console](https://dashvector.console.aliyun.com). +- The app will create a DashVector collection for you automatically when you run it for the first time. Just pick a name for your collection and set it as an environment variable. 
+ +**Environment Variables:** + +| Name | Required | Description | +|-------------------------| -------- | ---------------------------------------------------------------------------------------------------------------------- | +| `DATASTORE` | Yes | Datastore name; set this to `dashvector` | +| `BEARER_TOKEN` | Yes | Your secret token for authenticating requests to the API | +| `OPENAI_API_KEY` | Yes | Your OpenAI API key for generating embeddings with the `text-embedding-ada-002` model | +| `DASHVECTOR_API_KEY` | Yes | Your DashVector API key, found in the [DashVector console](https://dashvector.console.aliyun.com/) | +| `DASHVECTOR_COLLECTION` | Yes | Your chosen DashVector collection name. **Note:** The collection name can contain only alphanumeric characters, `_`, or `-` | + +## Running DashVector Integration Tests + +A suite of integration tests verifies the DashVector integration. Launch the test suite with this command: + +```bash +pytest ./tests/datastore/providers/dashvector/test_dashvector_datastore.py +``` \ No newline at end of file diff --git a/poetry.lock b/poetry.lock index fe1d21c99..657cbf8bc 100644 --- a/poetry.lock +++ b/poetry.lock @@ -769,6 +769,26 @@ ssh = ["bcrypt (>=3.1.5)"] test = ["pretend", "pytest (>=6.2.0)", "pytest-benchmark", "pytest-cov", "pytest-xdist"] test-randomorder = ["pytest-randomly"] +[[package]] +name = "dashvector" +version = "1.0.5" +description = "DashVector Client Python Sdk Library" +optional = false +python-versions = ">=3.7,<4.0" +files = [ + {file = "dashvector-1.0.5-py3-none-any.whl", hash = "sha256:a79e5bdb0d6447706cbf3645d9f1d07fa8e280d74842491aaa54e74258def2d6"}, + {file = "dashvector-1.0.5.tar.gz", hash = "sha256:2ee9a8c26699b9d978e7d84ff1cd92fa7ea5411c557ef5fb2a3fea02bd9999c4"}, +] + +[package.dependencies] +aiohttp = ">=3.1.0,<4.0.0" +grpcio = [ + {version = ">=1.22.0", markers = "python_version < \"3.11\""}, + {version = ">=1.49.1", markers = "python_version >= \"3.11\""}, +] +numpy = "*" +protobuf = 
">=3.8.0,<4.0.0" + [[package]] name = "dataclasses-json" version = "0.5.7" @@ -1301,61 +1321,62 @@ protobuf = ["grpcio-tools (>=1.53.0)"] [[package]] name = "grpcio-tools" -version = "1.53.0" +version = "1.48.2" description = "Protobuf code generator for gRPC" optional = false -python-versions = ">=3.7" +python-versions = ">=3.6" files = [ - {file = "grpcio-tools-1.53.0.tar.gz", hash = "sha256:925efff2d63ca3266f93c924ffeba5d496f16a8ccbe125fa0d18acf47cc5fa88"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-linux_armv7l.whl", hash = "sha256:41b859cf943256debba1e7b921e3689c89f95495b65f7ad226c4f0e38edf8ee4"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-macosx_12_0_universal2.whl", hash = "sha256:17c557240f7fbe1886dcfb5f3ba79740ecb65fe3b93061e64b8f4dfc6a6a5dc5"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-manylinux_2_17_aarch64.whl", hash = "sha256:6afffd7e97e5bddc63b3ce9abe912b9adb704a36ba86d4406be94426734b97c2"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f55e2c13620271b7f5a81a489a188d6e34a24da8885d46f1566f0e798cb59e6f"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6bd4c732d8d7a736e787b5d0963d4195267fc856e1d313d4532d1625e19a0e4a"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:99ecefb6b66e9fe41468a70ee2f05da2eb9c7bf63867fb9ff07f7dd90ea813ae"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:7754d6466191d327a0eef364ad5b863477a8fcc12953adc06b30b8e470c70e4a"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-win32.whl", hash = "sha256:f31c549d793a0e72c044f724b3373141d2aa9970fe97b1c2cfaa7ea44002b9aa"}, - {file = "grpcio_tools-1.53.0-cp310-cp310-win_amd64.whl", hash = "sha256:b4173b95e2c29a5145c806d16945ce1e5b38a11c7eb6ab1a6d74afc0a2ce47d9"}, - {file = "grpcio_tools-1.53.0-cp311-cp311-linux_armv7l.whl", hash = "sha256:613a84ebd1881635370c12503f2b15b37332a53fbac32904c94ac4c0c10f0a2a"}, - 
{file = "grpcio_tools-1.53.0-cp311-cp311-macosx_10_10_universal2.whl", hash = "sha256:af686b83bc6b5c1f1591c9f49183717974047de9546adcf5e09a18781b550c96"}, - {file = "grpcio_tools-1.53.0-cp311-cp311-manylinux_2_17_aarch64.whl", hash = "sha256:3cc832e8297e9437bc2b137fe815c8ba1d9af6ffdd76c5c6d7f911bf8e1b0f45"}, - {file = "grpcio_tools-1.53.0-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:39d0a254de49d852f5fe9f9df0a45b2ae66bc04e2d9ee1d6d2c0ba1e70fac91a"}, - {file = "grpcio_tools-1.53.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7062109553ec1873c5c09cc379b8ae0aa76a2d6d6aae97759b97787b93fa9786"}, - {file = "grpcio_tools-1.53.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:7728407b1e89fb1473b86152fc33be00f1a25a5aa3264245521f05cbbef9d817"}, - {file = "grpcio_tools-1.53.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:2758ea125442bc81251267fc9c28f65555a571f6a0afda4d71a6e7d669347095"}, - {file = "grpcio_tools-1.53.0-cp311-cp311-win32.whl", hash = "sha256:8940d59fca790f1bd45785d0661c3a8c081231c9f8049d7fbf6c6c00737e43da"}, - {file = "grpcio_tools-1.53.0-cp311-cp311-win_amd64.whl", hash = "sha256:c2cff79be5a06d63e9a6a7e38f8f160ade21517386eabe27afacef65a8531358"}, - {file = "grpcio_tools-1.53.0-cp37-cp37m-linux_armv7l.whl", hash = "sha256:d646d65fafbf70a57416493e719a0df7ffa0772133266cfe1b2b72e072ae64a2"}, - {file = "grpcio_tools-1.53.0-cp37-cp37m-macosx_10_10_universal2.whl", hash = "sha256:7da0fc185735050d8240b1d74c4667a02baf1b4fa379a5fc05d1fc067eeba596"}, - {file = "grpcio_tools-1.53.0-cp37-cp37m-manylinux_2_17_aarch64.whl", hash = "sha256:2be17265c0f070efd625683cef986e07dbc495103fcc719009ff2f6988003166"}, - {file = "grpcio_tools-1.53.0-cp37-cp37m-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4701d48f649443f1101a24d85e9d5ac13346ccac7781e243f49491328e172266"}, - {file = "grpcio_tools-1.53.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = 
"sha256:b54c64d85bea5c3a3d895454878c7d6bed5cbb80dc3cafcd75dc1e78300d8c95"}, - {file = "grpcio_tools-1.53.0-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:7152045190e9bd665d1feaeaef931d82c75cacce2b116ab150befa90855de3d0"}, - {file = "grpcio_tools-1.53.0-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:e18292123c86975d0aa47f1bcb176393640dcc23912e9f3a2247f1eff81ac8e8"}, - {file = "grpcio_tools-1.53.0-cp37-cp37m-win_amd64.whl", hash = "sha256:b1b76b6ab5c24e44b15d6a7df6c1b81c3099a54b82d41a3ce96e73a2e6a5081c"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-linux_armv7l.whl", hash = "sha256:e76e8dfe6fe4e61ce3049e9d56c0d806d0d3edc28aa32117d1b17f387469c52e"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-macosx_10_10_universal2.whl", hash = "sha256:4c6acaca09cfcd59850e27bd138df9d01c0686c42a5412aa6a92141c15316b1e"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-manylinux_2_17_aarch64.whl", hash = "sha256:76898c1dadf8630a75a40b5a89ab38e326f1288dcfde3413cdfa7a58e149c987"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:2b47f8b1bd3af2fb25548b625ad9c3659da30fe83c06f462f357c754f49b71ae"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a2faad4b6362e7ff3ae43ef2d51dfce0a3bc32cf52469e88568c3f65cae377d5"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:830261fe08541f0fd2dd5035264df2b91012988f37aa1d80a0b4ee6404dc25ae"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:4be32c694c760f3281555089f7aed7d48ca7ea4094115a08b5fc895e17d7e62e"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-win32.whl", hash = "sha256:4605db5a5828205d7fa33a5de9e00723bd037709e74e15c028b9dcec2339b7bc"}, - {file = "grpcio_tools-1.53.0-cp38-cp38-win_amd64.whl", hash = "sha256:0229e6cd442915192b8f8ee2e7e1c8b9986c878bc4dd8be3539f3be35f1b8282"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-linux_armv7l.whl", hash = 
"sha256:ad0c20688a650e731e8328a7a08899c433a59bfc995a7afcf715b5ad9eca9e7b"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-macosx_10_10_universal2.whl", hash = "sha256:a8c3e30c531969c62a5a219be414277b269c1be9a76bcd6948571868894e19b2"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-manylinux_2_17_aarch64.whl", hash = "sha256:326c67b35be69409a88632e6145032d53b8b8141634e9cbcd27fa8f9015a112c"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:102b6d323d7cef7ac29683f949ec66885b417c06df6059f6a88d07c5556c2592"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:861f8634cca3ca5bb5336ba16cc78291dba3e7fcadedff195bfdeb433f2c29f2"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:c9a9e1da1868349eba401e9648eac19132700942c475adcc97b6938bf4bf0182"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:ccf7313e5bee13f2f86d12741489f3ed8c901d6b463dff2604191cd4ff518abb"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-win32.whl", hash = "sha256:65b77532bb8f6ab1bfbdd2ac0788626a6c05b227f4722d3bbc2c54258e49c3e5"}, - {file = "grpcio_tools-1.53.0-cp39-cp39-win_amd64.whl", hash = "sha256:7c0ede22796259e83aa1f108038513e86672b2892d3654f94415e3930b74b871"}, + {file = "grpcio-tools-1.48.2.tar.gz", hash = "sha256:8902a035708555cddbd61b5467cea127484362decc52de03f061a1a520fe90cd"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-linux_armv7l.whl", hash = "sha256:92acc3e10ba2b0dcb90a88ae9fe1cc0ffba6868545207e4ff20ca95284f8e3c9"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-macosx_10_10_x86_64.whl", hash = "sha256:e5bb396d63495667d4df42e506eed9d74fc9a51c99c173c04395fe7604c848f1"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-manylinux_2_17_aarch64.whl", hash = "sha256:84a84d601a238572d049d3108e04fe4c206536e81076d56e623bd525a1b38def"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl", hash = 
"sha256:70564521e86a0de35ea9ac6daecff10cb46860aec469af65869974807ce8e98b"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bdbbe63f6190187de5946891941629912ac8196701ed2253fa91624a397822ec"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:ae56f133b05b7e5d780ef7e032dd762adad7f3dc8f64adb43ff5bfabd659f435"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:f0feb4f2b777fa6377e977faa89c26359d4f31953de15e035505b92f41aa6906"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-win32.whl", hash = "sha256:80f450272316ca0924545f488c8492649ca3aeb7044d4bf59c426dcdee527f7c"}, + {file = "grpcio_tools-1.48.2-cp310-cp310-win_amd64.whl", hash = "sha256:21ff50e321736eba22210bf9b94e05391a9ac345f26e7df16333dc75d63e74fb"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-linux_armv7l.whl", hash = "sha256:d598ccde6338b2cfbb3124f34c95f03394209013f9b1ed4a5360a736853b1c27"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-macosx_10_10_x86_64.whl", hash = "sha256:a43d26714933f23de93ea0bf9c86c66a6ede709b8ca32e357f9e2181703e64ae"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-manylinux_2_17_aarch64.whl", hash = "sha256:55fdebc73fb580717656b1bafa4f8eca448726a7aa22726a6c0a7895d2f0f088"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:8588819b22d0de3aa1951e1991cc3e4b9aa105eecf6e3e24eb0a2fc8ab958b3e"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9771d4d317dca029dfaca7ec9282d8afe731c18bc536ece37fd39b8a974cc331"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-musllinux_1_1_i686.whl", hash = "sha256:d886a9e052a038642b3af5d18e6f2085d1656d9788e202dc23258cf3a751e7ca"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-musllinux_1_1_x86_64.whl", hash = "sha256:d77e8b1613876e0d8fd17709509d4ceba13492816426bd156f7e88a4c47e7158"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-win32.whl", hash = 
"sha256:dcaaecdd5e847de5c1d533ea91522bf56c9e6b2dc98cdc0d45f0a1c26e846ea2"}, + {file = "grpcio_tools-1.48.2-cp36-cp36m-win_amd64.whl", hash = "sha256:0119aabd9ceedfdf41b56b9fdc8284dd85a7f589d087f2694d743f346a368556"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-linux_armv7l.whl", hash = "sha256:189be2a9b672300ca6845d94016bdacc052fdbe9d1ae9e85344425efae2ff8ef"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-macosx_10_10_x86_64.whl", hash = "sha256:9443f5c30bac449237c3cf99da125f8d6e6c01e17972bc683ee73b75dea95573"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-manylinux_2_17_aarch64.whl", hash = "sha256:e0403e095b343431195db1305248b50019ad55d3dd310254431af87e14ef83a2"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5410d6b601d1404835e34466bd8aee37213489b36ee1aad2276366e265ff29d4"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:51be91b7c7056ff9ee48b1eccd4a2840b0126230803a5e09dfc082a5b16a91c1"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:516eedd5eb7af6326050bc2cfceb3a977b9cc1144f283c43cc4956905285c912"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:d18599ab572b2f15a8f3db49503272d1bb4fcabb4b4d1214ef03aca1816b20a0"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-win32.whl", hash = "sha256:d18ef2adc05a8ef9e58ac46357f6d4ce7e43e077c7eda0a4425773461f9d0e6e"}, + {file = "grpcio_tools-1.48.2-cp37-cp37m-win_amd64.whl", hash = "sha256:6d9753944e5a6b6b78b76ce9d2ae0fe3f748008c1849deb7fadcb64489d6553b"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-linux_armv7l.whl", hash = "sha256:3c8749dca04a8d302862ceeb1dfbdd071ee13b281395975f24405a347e5baa57"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-macosx_10_10_x86_64.whl", hash = "sha256:7307dd2408b82ea545ae63502ec03036b025f449568556ea9a056e06129a7a4e"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-manylinux_2_17_aarch64.whl", hash = 
"sha256:072234859f6069dc43a6be8ad6b7d682f4ba1dc2e2db2ebf5c75f62eee0f6dfb"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6cc298fbfe584de8876a85355efbcf796dfbcfac5948c9560f5df82e79336e2a"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f75973a42c710999acd419968bc79f00327e03e855bbe82c6529e003e49af660"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:f766050e491d0b3203b6b85638015f543816a2eb7d089fc04e86e00f6de0e31d"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:8e0d74403484eb77e8df2566a64b8b0b484b5c87903678c381634dd72f252d5e"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-win32.whl", hash = "sha256:cb75bac0cd43858cb759ef103fe68f8c540cb58b63dda127e710228fec3007b8"}, + {file = "grpcio_tools-1.48.2-cp38-cp38-win_amd64.whl", hash = "sha256:cabc8b0905cedbc3b2b7b2856334fa35cce3d4bc79ae241cacd8cca8940a5c85"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-linux_armv7l.whl", hash = "sha256:e712a6d00606ad19abdeae852a7e521d6f6d0dcea843708fecf3a38be16a851e"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-macosx_10_10_x86_64.whl", hash = "sha256:e7e7668f89fd598c5469bb58e16bfd12b511d9947ccc75aec94da31f62bc3758"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-manylinux_2_17_aarch64.whl", hash = "sha256:a415fbec67d4ff7efe88794cbe00cf548d0f0a5484cceffe0a0c89d47694c491"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d96e96ae7361aa51c9cd9c73b677b51f691f98df6086860fcc3c45852d96b0b0"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e20d7885a40e68a2bda92908acbabcdf3c14dd386c3845de73ba139e9df1f132"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:8a5614251c46da07549e24f417cf989710250385e9d80deeafc53a0ee7df6325"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-musllinux_1_1_x86_64.whl", 
hash = "sha256:ace0035766fe01a1b096aa050be9f0a9f98402317e7aeff8bfe55349be32a407"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-win32.whl", hash = "sha256:4fa4300b1be59b046492ed3c5fdb59760bc6433f44c08f50de900f9552ec7461"}, + {file = "grpcio_tools-1.48.2-cp39-cp39-win_amd64.whl", hash = "sha256:0fb6c1c1e56eb26b224adc028a4204b6ad0f8b292efa28067dff273bbc8b27c4"}, ] [package.dependencies] -grpcio = ">=1.53.0" -protobuf = ">=4.21.6,<5.0dev" +grpcio = ">=1.48.2" +protobuf = ">=3.12.0,<4.0dev" setuptools = "*" [[package]] @@ -2522,24 +2543,33 @@ test = ["coverage", "flake8", "freezegun (==0.3.15)", "mock (>=2.0.0)", "pylint" [[package]] name = "protobuf" -version = "4.23.2" -description = "" +version = "3.20.3" +description = "Protocol Buffers" optional = false python-versions = ">=3.7" files = [ - {file = "protobuf-4.23.2-cp310-abi3-win32.whl", hash = "sha256:384dd44cb4c43f2ccddd3645389a23ae61aeb8cfa15ca3a0f60e7c3ea09b28b3"}, - {file = "protobuf-4.23.2-cp310-abi3-win_amd64.whl", hash = "sha256:09310bce43353b46d73ba7e3bca78273b9bc50349509b9698e64d288c6372c2a"}, - {file = "protobuf-4.23.2-cp37-abi3-macosx_10_9_universal2.whl", hash = "sha256:b2cfab63a230b39ae603834718db74ac11e52bccaaf19bf20f5cce1a84cf76df"}, - {file = "protobuf-4.23.2-cp37-abi3-manylinux2014_aarch64.whl", hash = "sha256:c52cfcbfba8eb791255edd675c1fe6056f723bf832fa67f0442218f8817c076e"}, - {file = "protobuf-4.23.2-cp37-abi3-manylinux2014_x86_64.whl", hash = "sha256:86df87016d290143c7ce3be3ad52d055714ebaebb57cc659c387e76cfacd81aa"}, - {file = "protobuf-4.23.2-cp37-cp37m-win32.whl", hash = "sha256:281342ea5eb631c86697e1e048cb7e73b8a4e85f3299a128c116f05f5c668f8f"}, - {file = "protobuf-4.23.2-cp37-cp37m-win_amd64.whl", hash = "sha256:ce744938406de1e64b91410f473736e815f28c3b71201302612a68bf01517fea"}, - {file = "protobuf-4.23.2-cp38-cp38-win32.whl", hash = "sha256:6c081863c379bb1741be8f8193e893511312b1d7329b4a75445d1ea9955be69e"}, - {file = "protobuf-4.23.2-cp38-cp38-win_amd64.whl", hash = 
"sha256:25e3370eda26469b58b602e29dff069cfaae8eaa0ef4550039cc5ef8dc004511"}, - {file = "protobuf-4.23.2-cp39-cp39-win32.whl", hash = "sha256:efabbbbac1ab519a514579ba9ec52f006c28ae19d97915951f69fa70da2c9e91"}, - {file = "protobuf-4.23.2-cp39-cp39-win_amd64.whl", hash = "sha256:54a533b971288af3b9926e53850c7eb186886c0c84e61daa8444385a4720297f"}, - {file = "protobuf-4.23.2-py3-none-any.whl", hash = "sha256:8da6070310d634c99c0db7df48f10da495cc283fd9e9234877f0cd182d43ab7f"}, - {file = "protobuf-4.23.2.tar.gz", hash = "sha256:20874e7ca4436f683b64ebdbee2129a5a2c301579a67d1a7dda2cdf62fb7f5f7"}, + {file = "protobuf-3.20.3-cp310-cp310-manylinux2014_aarch64.whl", hash = "sha256:f4bd856d702e5b0d96a00ec6b307b0f51c1982c2bf9c0052cf9019e9a544ba99"}, + {file = "protobuf-3.20.3-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:9aae4406ea63d825636cc11ffb34ad3379335803216ee3a856787bcf5ccc751e"}, + {file = "protobuf-3.20.3-cp310-cp310-win32.whl", hash = "sha256:28545383d61f55b57cf4df63eebd9827754fd2dc25f80c5253f9184235db242c"}, + {file = "protobuf-3.20.3-cp310-cp310-win_amd64.whl", hash = "sha256:67a3598f0a2dcbc58d02dd1928544e7d88f764b47d4a286202913f0b2801c2e7"}, + {file = "protobuf-3.20.3-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:899dc660cd599d7352d6f10d83c95df430a38b410c1b66b407a6b29265d66469"}, + {file = "protobuf-3.20.3-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:e64857f395505ebf3d2569935506ae0dfc4a15cb80dc25261176c784662cdcc4"}, + {file = "protobuf-3.20.3-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:d9e4432ff660d67d775c66ac42a67cf2453c27cb4d738fc22cb53b5d84c135d4"}, + {file = "protobuf-3.20.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:74480f79a023f90dc6e18febbf7b8bac7508420f2006fabd512013c0c238f454"}, + {file = "protobuf-3.20.3-cp37-cp37m-win32.whl", hash = "sha256:b6cc7ba72a8850621bfec987cb72623e703b7fe2b9127a161ce61e61558ad905"}, + {file = "protobuf-3.20.3-cp37-cp37m-win_amd64.whl", 
hash = "sha256:8c0c984a1b8fef4086329ff8dd19ac77576b384079247c770f29cc8ce3afa06c"}, + {file = "protobuf-3.20.3-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:de78575669dddf6099a8a0f46a27e82a1783c557ccc38ee620ed8cc96d3be7d7"}, + {file = "protobuf-3.20.3-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:f4c42102bc82a51108e449cbb32b19b180022941c727bac0cfd50170341f16ee"}, + {file = "protobuf-3.20.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:44246bab5dd4b7fbd3c0c80b6f16686808fab0e4aca819ade6e8d294a29c7050"}, + {file = "protobuf-3.20.3-cp38-cp38-win32.whl", hash = "sha256:c02ce36ec760252242a33967d51c289fd0e1c0e6e5cc9397e2279177716add86"}, + {file = "protobuf-3.20.3-cp38-cp38-win_amd64.whl", hash = "sha256:447d43819997825d4e71bf5769d869b968ce96848b6479397e29fc24c4a5dfe9"}, + {file = "protobuf-3.20.3-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:398a9e0c3eaceb34ec1aee71894ca3299605fa8e761544934378bbc6c97de23b"}, + {file = "protobuf-3.20.3-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:bf01b5720be110540be4286e791db73f84a2b721072a3711efff6c324cdf074b"}, + {file = "protobuf-3.20.3-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:daa564862dd0d39c00f8086f88700fdbe8bc717e993a21e90711acfed02f2402"}, + {file = "protobuf-3.20.3-cp39-cp39-win32.whl", hash = "sha256:819559cafa1a373b7096a482b504ae8a857c89593cf3a25af743ac9ecbd23480"}, + {file = "protobuf-3.20.3-cp39-cp39-win_amd64.whl", hash = "sha256:03038ac1cfbc41aa21f6afcbcd357281d7521b4157926f30ebecc8d4ea59dcb7"}, + {file = "protobuf-3.20.3-py2.py3-none-any.whl", hash = "sha256:a7ca6d488aa8ff7f329d4c545b2dbad8ac31464f1d8b1c87ad1346717731e4db"}, + {file = "protobuf-3.20.3.tar.gz", hash = "sha256:2e3427429c9cffebf259491be0af70189607f365c2f41c7c3764af6f337105f2"}, ] [[package]] @@ -2771,6 +2801,7 @@ files = [ {file = "pymongo-4.5.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = 
"sha256:6422b6763b016f2ef2beedded0e546d6aa6ba87910f9244d86e0ac7690f75c96"}, {file = "pymongo-4.5.0-cp312-cp312-win32.whl", hash = "sha256:77cfff95c1fafd09e940b3fdcb7b65f11442662fad611d0e69b4dd5d17a81c60"}, {file = "pymongo-4.5.0-cp312-cp312-win_amd64.whl", hash = "sha256:e57d859b972c75ee44ea2ef4758f12821243e99de814030f69a3decb2aa86807"}, + {file = "pymongo-4.5.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:8443f3a8ab2d929efa761c6ebce39a6c1dca1c9ac186ebf11b62c8fe1aef53f4"}, {file = "pymongo-4.5.0-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:2b0176f9233a5927084c79ff80b51bd70bfd57e4f3d564f50f80238e797f0c8a"}, {file = "pymongo-4.5.0-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:89b3f2da57a27913d15d2a07d58482f33d0a5b28abd20b8e643ab4d625e36257"}, {file = "pymongo-4.5.0-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:5caee7bd08c3d36ec54617832b44985bd70c4cbd77c5b313de6f7fce0bb34f93"}, @@ -4323,4 +4354,4 @@ postgresql = ["psycopg2cffi"] [metadata] lock-version = "2.0" python-versions = "^3.10" -content-hash = "75528d93a802f01a02594d42fe33c96146fbf6c6e35edade1b6c86afce50f9e1" +content-hash = "3c6fc0cd19baf63c8f8cd1cc1bc474ad10b3f7d7d54756d881f2cb86448b9e7f" diff --git a/pyproject.toml b/pyproject.toml index 1ba7bf7cc..608e5d961 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -37,6 +37,7 @@ psycopg2cffi = {version = "^2.9.0", optional = true} loguru = "^0.7.0" elasticsearch = "8.8.2" pymongo = "^4.3.3" +dashvector = "1.0.5" [tool.poetry.scripts] start = "server.main:start" diff --git a/tests/datastore/providers/dashvector/test_dashvector_datastore.py b/tests/datastore/providers/dashvector/test_dashvector_datastore.py new file mode 100644 index 000000000..d48dfbfa0 --- /dev/null +++ b/tests/datastore/providers/dashvector/test_dashvector_datastore.py @@ -0,0 +1,187 @@ +from time import sleep +from typing import Dict, List + +import pytest +from datastore.providers.dashvector_datastore import DashVectorDataStore +from models.models import ( + 
DocumentChunk, + DocumentChunkMetadata, + QueryWithEmbedding, + DocumentMetadataFilter, + Source, +) + +DIM = 1536 + + +@pytest.fixture +def dashvector_datastore(): + return DashVectorDataStore() + + +@pytest.fixture +def document_chunks() -> Dict[str, List[DocumentChunk]]: + doc_id = "zerp" + doc_chunks = [] + + ids = ["abc_123", "def_456", "ghi_789"] + texts = [ + "lorem ipsum dolor sit amet", + "consectetur adipiscing elit", + "sed do eiusmod tempor incididunt", + ] + sources = [Source.email, Source.file, Source.chat] + source_ids = ["foo", "bar", "baz"] + urls = ["foo.com", "bar.net", "baz.org"] + created_ats = [ + "1929-10-28T09:30:00-05:00", + "2009-01-03T16:39:57-08:00", + "2021-01-21T10:00:00-02:00", + ] + authors = ["Max Mustermann", "John Doe", "Jane Doe"] + + for i, id in enumerate(ids): + chunk = DocumentChunk( + id=id, + text=texts[i], + metadata=DocumentChunkMetadata( + document_id=doc_id, + source=sources[i], + source_id=source_ids[i], + url=urls[i], + created_at=created_ats[i], + author=authors[i], + ), + embedding=[i] * DIM + ) + doc_chunks.append(chunk) + + return {doc_id: doc_chunks} + + +@pytest.mark.asyncio +async def test_upsert( + dashvector_datastore: DashVectorDataStore, + document_chunks: Dict[str, List[DocumentChunk]] +) -> None: + # clear docs + await dashvector_datastore.delete(delete_all=True) + + # upsert + doc_ids = await dashvector_datastore._upsert(document_chunks) + assert doc_ids == list(document_chunks.keys()) + + # the vector insert operation is async by design, we wait here a bit for the insertion to complete. 
+ sleep(1.0) + stats = dashvector_datastore._collection.stats() + + # assert total doc count + assert 3 == stats.output.total_doc_count + + +@pytest.mark.asyncio +async def test_query( + dashvector_datastore: DashVectorDataStore, + document_chunks: Dict[str, List[DocumentChunk]] +) -> None: + # upsert docs + await dashvector_datastore._upsert(document_chunks) + + # the vector insert operation is async by design, we wait here a bit for the insertion to complete. + sleep(0.5) + query = QueryWithEmbedding( + query="lorem", + top_k=1, + embedding=[0] * DIM + ) + result = await dashvector_datastore._query([query]) + assert 1 == len(result) + assert 1 == len(result[0].results) + assert "abc_123" == result[0].results[0].id + assert "lorem ipsum dolor sit amet" == result[0].results[0].text + + +@pytest.mark.asyncio +async def test_query_with_date_filter( + dashvector_datastore: DashVectorDataStore, + document_chunks: Dict[str, List[DocumentChunk]] +) -> None: + # upsert docs + await dashvector_datastore._upsert(document_chunks) + + # the vector insert operation is async by design, we wait here a bit for the insertion to complete. 
+ sleep(0.5) + query = QueryWithEmbedding( + query="lorem", + filter=DocumentMetadataFilter( + start_date="2009-01-03T16:39:57-08:00" + ), + top_k=3, + embedding=[0] * DIM + ) + result = await dashvector_datastore._query([query]) + assert len(result) == 1 + assert len(result[0].results) == 2 + assert {"def_456", "ghi_789"} == {doc.id for doc in result[0].results} + assert { + "consectetur adipiscing elit", + "sed do eiusmod tempor incididunt" + } == {doc.text for doc in result[0].results} + + +@pytest.mark.asyncio +async def test_delete_with_ids( + dashvector_datastore: DashVectorDataStore, + document_chunks: Dict[str, List[DocumentChunk]] +) -> None: + # upsert docs + await dashvector_datastore._upsert(document_chunks) + + # delete with id + await dashvector_datastore.delete(ids=["abc_123"]) + + # the vector insert/delete operation is async by design, we wait here a bit for the operation to complete. + sleep(0.5) + stats = dashvector_datastore._collection.stats() + assert 2 == stats.output.total_doc_count + + +@pytest.mark.asyncio +async def test_delete_all( + dashvector_datastore: DashVectorDataStore, + document_chunks: Dict[str, List[DocumentChunk]] +) -> None: + # upsert docs + await dashvector_datastore._upsert(document_chunks) + + # delete all docs + await dashvector_datastore.delete(delete_all=True) + + # the vector insert/delete operation is async by design, we wait here a bit for the operation to complete. 
+ sleep(1.0) + stats = dashvector_datastore._collection.stats() + assert 0 == stats.output.total_doc_count + + +@pytest.mark.asyncio +async def test_delete_with_filter( + dashvector_datastore: DashVectorDataStore, + document_chunks: Dict[str, List[DocumentChunk]] +) -> None: + # upsert docs + await dashvector_datastore._upsert(document_chunks) + sleep(0.5) + + # delete with filter + await dashvector_datastore.delete( + filter=DocumentMetadataFilter( + start_date="2009-01-03T16:39:57-08:00" + ) + ) + + # the vector insert/delete operation is async by design, we wait here a bit for the operation to complete. + sleep(0.5) + stats = dashvector_datastore._collection.stats() + assert 1 == stats.output.total_doc_count + + 
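The tests above wait a fixed `sleep(0.5)` or `sleep(1.0)` for DashVector's asynchronous inserts and deletes to become visible, which blocks the event loop and may be flaky on a slow backend. One possible alternative is to poll until the expected state is reached. The sketch below is only an illustration: `wait_until` and its `timeout` and `interval` parameters are hypothetical names that are not part of this PR or of the DashVector SDK.

```python
import asyncio
import time
from typing import Callable


async def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Hypothetical test helper. Returns True if the predicate succeeded in
    time, False otherwise. Uses asyncio.sleep so the event loop is not blocked.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(interval)
```

In a test this could replace the fixed sleep, for example `assert await wait_until(lambda: dashvector_datastore._collection.stats().output.total_doc_count == 3)`, assuming `stats()` stays a synchronous call as it is used in the tests above.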