From 9b0537291c88490dce9351a49d9b8ef0cafcbf40 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Germ=C3=A1n=20Grandas?= Date: Fri, 26 Jul 2024 09:42:30 -0500 Subject: [PATCH 1/8] Creating notebook to ingest CloudSQL database using kubernetes --- ...rag-data-ingest-with-kubernetes-docs.ipynb | 1809 +++++++++++++++++ 1 file changed, 1809 insertions(+) create mode 100644 applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb diff --git a/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb b/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb new file mode 100644 index 000000000..7dd40f32d --- /dev/null +++ b/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb @@ -0,0 +1,1809 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "ThxQrIhibaJ9" + }, + "source": [ + "# RAG-on-GKE Application\n", + "\n", + "This is a Python notebook for generating the vector embeddings based on [Kubernetes docs](https://github.com/dohsimpson/kubernetes-doc-pdf/) used by the RAG on GKE application. 
\n", + "For full information, please check out the GitHub documentation [here](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/applications/rag/README.md).\n", + "\n", + "\n", + "\n", + "- Clone the Kubernetes docs repo\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "k8d6_U2sbaJ_", + "executionInfo": { + "status": "ok", + "timestamp": 1721926267799, + "user_tz": 300, + "elapsed": 569, + "user": { + "displayName": "", + "userId": "" + } + }, + "outputId": "e15c65de-1382-4923-a3ee-15b3f3f21f86" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "fatal: destination path '/data/kubernetes-docs' already exists and is not an empty directory.\n" + ] + } + ], + "source": [ + "!mkdir /data/kubernetes-docs -p\n", + "!git clone https://github.com/dohsimpson/kubernetes-doc-pdf /data/kubernetes-docs\n" + ] + }, + { + "cell_type": "markdown", + "source": [ + "- Install the required packages" + ], + "metadata": { + "id": "iRtu4buBamab" + } + }, + { + "cell_type": "code", + "source": [ + "!pip install pgvector\n", + "!pip install langchain langchain-community sentence_transformers unstructured[pdf]\n", + "!pip install google cloud-sql-python-connector[pg8000] langchain-google-cloud-sql-pg" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "collapsed": true, + "id": "xRh2Gn1rcBJY", + "executionInfo": { + "status": "ok", + "timestamp": 1721926317024, + "user_tz": 300, + "elapsed": 35573, + "user": { + "displayName": "", + "userId": "" + } + }, + "outputId": "f0deb85d-1d5c-41d0-b6ff-e3ed86bd3042" + }, + "execution_count": 2, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Requirement already satisfied: pgvector in /usr/local/lib/python3.10/dist-packages (0.3.2)\n", + "Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from pgvector) 
(1.25.2)\n", + "Requirement already satisfied: langchain in /usr/local/lib/python3.10/dist-packages (0.2.11)\n", + "Requirement already satisfied: langchain-community in /usr/local/lib/python3.10/dist-packages (0.2.10)\n", + "Requirement already satisfied: sentence_transformers in /usr/local/lib/python3.10/dist-packages (3.0.1)\n", + "Requirement already satisfied: unstructured[pdf] in /usr/local/lib/python3.10/dist-packages (0.15.0)\n", + "Requirement already satisfied: PyYAML>=5.3 in /usr/local/lib/python3.10/dist-packages (from langchain) (6.0.1)\n", + "Requirement already satisfied: SQLAlchemy<3,>=1.4 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.0.31)\n", + "Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in /usr/local/lib/python3.10/dist-packages (from langchain) (3.9.5)\n", + "Requirement already satisfied: async-timeout<5.0.0,>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (4.0.3)\n", + "Requirement already satisfied: langchain-core<0.3.0,>=0.2.23 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.2.23)\n", + "Requirement already satisfied: langchain-text-splitters<0.3.0,>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.2.2)\n", + "Requirement already satisfied: langsmith<0.2.0,>=0.1.17 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.1.93)\n", + "Requirement already satisfied: numpy<2,>=1 in /usr/local/lib/python3.10/dist-packages (from langchain) (1.25.2)\n", + "Requirement already satisfied: pydantic<3,>=1 in /usr/local/lib/python3.10/dist-packages (from langchain) (1.10.17)\n", + "Requirement already satisfied: requests<3,>=2 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.31.0)\n", + "Requirement already satisfied: tenacity!=8.4.0,<9.0.0,>=8.1.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (8.4.2)\n", + "Requirement already satisfied: dataclasses-json<0.7,>=0.5.7 in /usr/local/lib/python3.10/dist-packages (from langchain-community) 
(0.6.7)\n", + "Requirement already satisfied: transformers<5.0.0,>=4.34.0 in /usr/local/lib/python3.10/dist-packages (from sentence_transformers) (4.41.2)\n", + "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from sentence_transformers) (4.66.4)\n", + "Requirement already satisfied: torch>=1.11.0 in /usr/local/lib/python3.10/dist-packages (from sentence_transformers) (2.3.0+cu121)\n", + "Requirement already satisfied: scikit-learn in /usr/local/lib/python3.10/dist-packages (from sentence_transformers) (1.2.2)\n", + "Requirement already satisfied: scipy in /usr/local/lib/python3.10/dist-packages (from sentence_transformers) (1.11.4)\n", + "Requirement already satisfied: huggingface-hub>=0.15.1 in /usr/local/lib/python3.10/dist-packages (from sentence_transformers) (0.23.4)\n", + "Requirement already satisfied: Pillow in /usr/local/lib/python3.10/dist-packages (from sentence_transformers) (10.4.0)\n", + "Requirement already satisfied: chardet in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (5.2.0)\n", + "Requirement already satisfied: filetype in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (1.2.0)\n", + "Requirement already satisfied: python-magic in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (0.4.27)\n", + "Requirement already satisfied: lxml in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (4.9.4)\n", + "Requirement already satisfied: nltk in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (3.8.1)\n", + "Requirement already satisfied: tabulate in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (0.9.0)\n", + "Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (4.12.3)\n", + "Requirement already satisfied: emoji in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (2.12.1)\n", + "Requirement already satisfied: python-iso639 in 
/usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (2024.4.27)\n", + "Requirement already satisfied: langdetect in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (1.0.9)\n", + "Requirement already satisfied: rapidfuzz in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (3.9.4)\n", + "Requirement already satisfied: backoff in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (2.2.1)\n", + "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (4.12.2)\n", + "Requirement already satisfied: unstructured-client in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (0.24.1)\n", + "Requirement already satisfied: wrapt in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (1.14.1)\n", + "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (5.9.5)\n", + "Requirement already satisfied: onnx in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (1.16.1)\n", + "Requirement already satisfied: pdf2image in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (1.17.0)\n", + "Requirement already satisfied: pdfminer.six in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (20231228)\n", + "Requirement already satisfied: pikepdf in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (9.1.0)\n", + "Requirement already satisfied: pillow-heif in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (0.17.0)\n", + "Requirement already satisfied: pypdf in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (4.3.1)\n", + "Requirement already satisfied: pytesseract in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (0.3.10)\n", + "Requirement already satisfied: google-cloud-vision in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (3.7.3)\n", + "Requirement already 
satisfied: effdet in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (0.4.1)\n", + "Requirement already satisfied: unstructured-inference==0.7.36 in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (0.7.36)\n", + "Requirement already satisfied: unstructured.pytesseract>=0.3.12 in /usr/local/lib/python3.10/dist-packages (from unstructured[pdf]) (0.3.12)\n", + "Requirement already satisfied: layoutparser in /usr/local/lib/python3.10/dist-packages (from unstructured-inference==0.7.36->unstructured[pdf]) (0.3.4)\n", + "Requirement already satisfied: python-multipart in /usr/local/lib/python3.10/dist-packages (from unstructured-inference==0.7.36->unstructured[pdf]) (0.0.9)\n", + "Requirement already satisfied: opencv-python!=4.7.0.68 in /usr/local/lib/python3.10/dist-packages (from unstructured-inference==0.7.36->unstructured[pdf]) (4.8.0.76)\n", + "Requirement already satisfied: onnxruntime>=1.17.0 in /usr/local/lib/python3.10/dist-packages (from unstructured-inference==0.7.36->unstructured[pdf]) (1.18.1)\n", + "Requirement already satisfied: matplotlib in /usr/local/lib/python3.10/dist-packages (from unstructured-inference==0.7.36->unstructured[pdf]) (3.7.1)\n", + "Requirement already satisfied: timm in /usr/local/lib/python3.10/dist-packages (from unstructured-inference==0.7.36->unstructured[pdf]) (1.0.7)\n", + "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.3.1)\n", + "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (23.2.0)\n", + "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.4.1)\n", + "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (6.0.5)\n", + "Requirement already satisfied: yarl<2.0,>=1.0 in 
/usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.9.4)\n", + "Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json<0.7,>=0.5.7->langchain-community) (3.21.3)\n", + "Requirement already satisfied: typing-inspect<1,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json<0.7,>=0.5.7->langchain-community) (0.9.0)\n", + "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.15.1->sentence_transformers) (3.15.4)\n", + "Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.15.1->sentence_transformers) (2023.6.0)\n", + "Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.15.1->sentence_transformers) (24.1)\n", + "Requirement already satisfied: jsonpatch<2.0,>=1.33 in /usr/local/lib/python3.10/dist-packages (from langchain-core<0.3.0,>=0.2.23->langchain) (1.33)\n", + "Requirement already satisfied: orjson<4.0.0,>=3.9.14 in /usr/local/lib/python3.10/dist-packages (from langsmith<0.2.0,>=0.1.17->langchain) (3.10.6)\n", + "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (3.3.2)\n", + "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (3.7)\n", + "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (2.0.7)\n", + "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (2024.6.2)\n", + "Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy<3,>=1.4->langchain) (3.0.3)\n", + "Requirement already satisfied: sympy in 
/usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (1.12.1)\n", + "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (3.3)\n", + "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (3.1.4)\n", + "Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (12.1.105)\n", + "Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (12.1.105)\n", + "Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (12.1.105)\n", + "Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (8.9.2.26)\n", + "Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (12.1.3.1)\n", + "Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (11.0.2.54)\n", + "Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (10.3.2.106)\n", + "Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (11.4.5.107)\n", + "Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (12.1.0.106)\n", + "Requirement already satisfied: nvidia-nccl-cu12==2.20.5 in /usr/local/lib/python3.10/dist-packages (from 
torch>=1.11.0->sentence_transformers) (2.20.5)\n", + "Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (12.1.105)\n", + "Requirement already satisfied: triton==2.3.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11.0->sentence_transformers) (2.3.0)\n", + "Requirement already satisfied: nvidia-nvjitlink-cu12 in /usr/local/lib/python3.10/dist-packages (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.11.0->sentence_transformers) (12.5.82)\n", + "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers<5.0.0,>=4.34.0->sentence_transformers) (2024.5.15)\n", + "Requirement already satisfied: tokenizers<0.20,>=0.19 in /usr/local/lib/python3.10/dist-packages (from transformers<5.0.0,>=4.34.0->sentence_transformers) (0.19.1)\n", + "Requirement already satisfied: safetensors>=0.4.1 in /usr/local/lib/python3.10/dist-packages (from transformers<5.0.0,>=4.34.0->sentence_transformers) (0.4.3)\n", + "Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-packages (from beautifulsoup4->unstructured[pdf]) (2.5)\n", + "Requirement already satisfied: torchvision in /usr/local/lib/python3.10/dist-packages (from effdet->unstructured[pdf]) (0.18.0+cu121)\n", + "Requirement already satisfied: pycocotools>=2.0.2 in /usr/local/lib/python3.10/dist-packages (from effdet->unstructured[pdf]) (2.0.8)\n", + "Requirement already satisfied: omegaconf>=2.0 in /usr/local/lib/python3.10/dist-packages (from effdet->unstructured[pdf]) (2.3.0)\n", + "Requirement already satisfied: google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1 in /usr/local/lib/python3.10/dist-packages (from google-cloud-vision->unstructured[pdf]) (2.16.2)\n", + "Requirement already satisfied: google-auth!=2.24.0,!=2.25.0,<3.0.0dev,>=2.14.1 in 
/usr/local/lib/python3.10/dist-packages (from google-cloud-vision->unstructured[pdf]) (2.32.0)\n", + "Requirement already satisfied: proto-plus<2.0.0dev,>=1.22.3 in /usr/local/lib/python3.10/dist-packages (from google-cloud-vision->unstructured[pdf]) (1.24.0)\n", + "Requirement already satisfied: protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<6.0.0dev,>=3.20.2 in /usr/local/lib/python3.10/dist-packages (from google-cloud-vision->unstructured[pdf]) (3.20.3)\n", + "Requirement already satisfied: six in /usr/local/lib/python3.10/dist-packages (from langdetect->unstructured[pdf]) (1.16.0)\n", + "Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from nltk->unstructured[pdf]) (8.1.7)\n", + "Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from nltk->unstructured[pdf]) (1.4.2)\n", + "Requirement already satisfied: cryptography>=36.0.0 in /usr/local/lib/python3.10/dist-packages (from pdfminer.six->unstructured[pdf]) (42.0.8)\n", + "Requirement already satisfied: Deprecated in /usr/local/lib/python3.10/dist-packages (from pikepdf->unstructured[pdf]) (1.2.14)\n", + "Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from scikit-learn->sentence_transformers) (3.5.0)\n", + "Requirement already satisfied: deepdiff>=6.0 in /usr/local/lib/python3.10/dist-packages (from unstructured-client->unstructured[pdf]) (7.0.1)\n", + "Requirement already satisfied: httpx>=0.27.0 in /usr/local/lib/python3.10/dist-packages (from unstructured-client->unstructured[pdf]) (0.27.0)\n", + "Requirement already satisfied: jsonpath-python>=1.0.6 in /usr/local/lib/python3.10/dist-packages (from unstructured-client->unstructured[pdf]) (1.0.6)\n", + "Requirement already satisfied: mypy-extensions>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from unstructured-client->unstructured[pdf]) (1.0.0)\n", + "Requirement already satisfied: nest-asyncio>=1.6.0 in 
/usr/local/lib/python3.10/dist-packages (from unstructured-client->unstructured[pdf]) (1.6.0)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from unstructured-client->unstructured[pdf]) (2.8.2)\n", + "Requirement already satisfied: requests-toolbelt>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from unstructured-client->unstructured[pdf]) (1.0.0)\n", + "Requirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.10/dist-packages (from cryptography>=36.0.0->pdfminer.six->unstructured[pdf]) (1.16.0)\n", + "Requirement already satisfied: ordered-set<4.2.0,>=4.1.0 in /usr/local/lib/python3.10/dist-packages (from deepdiff>=6.0->unstructured-client->unstructured[pdf]) (4.1.0)\n", + "Requirement already satisfied: googleapis-common-protos<2.0.dev0,>=1.56.2 in /usr/local/lib/python3.10/dist-packages (from google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-vision->unstructured[pdf]) (1.63.2)\n", + "Requirement already satisfied: grpcio<2.0dev,>=1.33.2 in /usr/local/lib/python3.10/dist-packages (from google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-vision->unstructured[pdf]) (1.64.1)\n", + "Requirement already satisfied: grpcio-status<2.0.dev0,>=1.33.2 in /usr/local/lib/python3.10/dist-packages (from google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-vision->unstructured[pdf]) (1.48.2)\n", + "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from google-auth!=2.24.0,!=2.25.0,<3.0.0dev,>=2.14.1->google-cloud-vision->unstructured[pdf]) (5.3.3)\n", + "Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from 
google-auth!=2.24.0,!=2.25.0,<3.0.0dev,>=2.14.1->google-cloud-vision->unstructured[pdf]) (0.4.0)\n", + "Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from google-auth!=2.24.0,!=2.25.0,<3.0.0dev,>=2.14.1->google-cloud-vision->unstructured[pdf]) (4.9)\n", + "Requirement already satisfied: anyio in /usr/local/lib/python3.10/dist-packages (from httpx>=0.27.0->unstructured-client->unstructured[pdf]) (3.7.1)\n", + "Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx>=0.27.0->unstructured-client->unstructured[pdf]) (1.0.5)\n", + "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx>=0.27.0->unstructured-client->unstructured[pdf]) (1.3.1)\n", + "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx>=0.27.0->unstructured-client->unstructured[pdf]) (0.14.0)\n", + "Requirement already satisfied: jsonpointer>=1.9 in /usr/local/lib/python3.10/dist-packages (from jsonpatch<2.0,>=1.33->langchain-core<0.3.0,>=0.2.23->langchain) (3.0.0)\n", + "Requirement already satisfied: antlr4-python3-runtime==4.9.* in /usr/local/lib/python3.10/dist-packages (from omegaconf>=2.0->effdet->unstructured[pdf]) (4.9.3)\n", + "Requirement already satisfied: coloredlogs in /usr/local/lib/python3.10/dist-packages (from onnxruntime>=1.17.0->unstructured-inference==0.7.36->unstructured[pdf]) (15.0.1)\n", + "Requirement already satisfied: flatbuffers in /usr/local/lib/python3.10/dist-packages (from onnxruntime>=1.17.0->unstructured-inference==0.7.36->unstructured[pdf]) (24.3.25)\n", + "Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->unstructured-inference==0.7.36->unstructured[pdf]) (1.2.1)\n", + "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib->unstructured-inference==0.7.36->unstructured[pdf]) 
(0.12.1)\n", + "Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->unstructured-inference==0.7.36->unstructured[pdf]) (4.53.0)\n", + "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->unstructured-inference==0.7.36->unstructured[pdf]) (1.4.5)\n", + "Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->unstructured-inference==0.7.36->unstructured[pdf]) (3.1.2)\n", + "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.11.0->sentence_transformers) (2.1.5)\n", + "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from layoutparser->unstructured-inference==0.7.36->unstructured[pdf]) (2.0.3)\n", + "Requirement already satisfied: iopath in /usr/local/lib/python3.10/dist-packages (from layoutparser->unstructured-inference==0.7.36->unstructured[pdf]) (0.1.10)\n", + "Requirement already satisfied: pdfplumber in /usr/local/lib/python3.10/dist-packages (from layoutparser->unstructured-inference==0.7.36->unstructured[pdf]) (0.11.2)\n", + "Requirement already satisfied: mpmath<1.4.0,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.11.0->sentence_transformers) (1.3.0)\n", + "Requirement already satisfied: pycparser in /usr/local/lib/python3.10/dist-packages (from cffi>=1.12->cryptography>=36.0.0->pdfminer.six->unstructured[pdf]) (2.22)\n", + "Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from pyasn1-modules>=0.2.1->google-auth!=2.24.0,!=2.25.0,<3.0.0dev,>=2.14.1->google-cloud-vision->unstructured[pdf]) (0.6.0)\n", + "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio->httpx>=0.27.0->unstructured-client->unstructured[pdf]) (1.2.1)\n", + "Requirement already satisfied: humanfriendly>=9.1 in 
/usr/local/lib/python3.10/dist-packages (from coloredlogs->onnxruntime>=1.17.0->unstructured-inference==0.7.36->unstructured[pdf]) (10.0)\n", + "Requirement already satisfied: portalocker in /usr/local/lib/python3.10/dist-packages (from iopath->layoutparser->unstructured-inference==0.7.36->unstructured[pdf]) (2.10.1)\n", + "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->layoutparser->unstructured-inference==0.7.36->unstructured[pdf]) (2023.4)\n", + "Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas->layoutparser->unstructured-inference==0.7.36->unstructured[pdf]) (2024.1)\n", + "Requirement already satisfied: pypdfium2>=4.18.0 in /usr/local/lib/python3.10/dist-packages (from pdfplumber->layoutparser->unstructured-inference==0.7.36->unstructured[pdf]) (4.30.0)\n", + "Requirement already satisfied: google in /usr/local/lib/python3.10/dist-packages (2.0.3)\n", + "Requirement already satisfied: cloud-sql-python-connector[pg8000] in /usr/local/lib/python3.10/dist-packages (1.11.0)\n", + "Requirement already satisfied: langchain-google-cloud-sql-pg in /usr/local/lib/python3.10/dist-packages (0.6.1)\n", + "Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/dist-packages (from google) (4.12.3)\n", + "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from cloud-sql-python-connector[pg8000]) (3.9.5)\n", + "Requirement already satisfied: cryptography>=42.0.0 in /usr/local/lib/python3.10/dist-packages (from cloud-sql-python-connector[pg8000]) (42.0.8)\n", + "Requirement already satisfied: Requests in /usr/local/lib/python3.10/dist-packages (from cloud-sql-python-connector[pg8000]) (2.31.0)\n", + "Requirement already satisfied: google-auth>=2.28.0 in /usr/local/lib/python3.10/dist-packages (from cloud-sql-python-connector[pg8000]) (2.32.0)\n", + "Requirement already satisfied: pg8000>=1.31.1 in 
/usr/local/lib/python3.10/dist-packages (from cloud-sql-python-connector[pg8000]) (1.31.2)\n", + "Requirement already satisfied: langchain-core<1.0.0,>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from langchain-google-cloud-sql-pg) (0.2.23)\n", + "Requirement already satisfied: langchain-community<0.3.0,>=0.0.18 in /usr/local/lib/python3.10/dist-packages (from langchain-google-cloud-sql-pg) (0.2.10)\n", + "Requirement already satisfied: numpy<2.0.0,>=1.24.4 in /usr/local/lib/python3.10/dist-packages (from langchain-google-cloud-sql-pg) (1.25.2)\n", + "Requirement already satisfied: pgvector<1.0.0,>=0.2.5 in /usr/local/lib/python3.10/dist-packages (from langchain-google-cloud-sql-pg) (0.3.2)\n", + "Requirement already satisfied: SQLAlchemy[asyncio]<3.0.0,>=2.0.25 in /usr/local/lib/python3.10/dist-packages (from langchain-google-cloud-sql-pg) (2.0.31)\n", + "Requirement already satisfied: asyncpg>=0.29.0 in /usr/local/lib/python3.10/dist-packages (from cloud-sql-python-connector[pg8000]) (0.29.0)\n", + "Requirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.10/dist-packages (from cryptography>=42.0.0->cloud-sql-python-connector[pg8000]) (1.16.0)\n", + "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from google-auth>=2.28.0->cloud-sql-python-connector[pg8000]) (5.3.3)\n", + "Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from google-auth>=2.28.0->cloud-sql-python-connector[pg8000]) (0.4.0)\n", + "Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from google-auth>=2.28.0->cloud-sql-python-connector[pg8000]) (4.9)\n", + "Requirement already satisfied: PyYAML>=5.3 in /usr/local/lib/python3.10/dist-packages (from langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (6.0.1)\n", + "Requirement already satisfied: dataclasses-json<0.7,>=0.5.7 in /usr/local/lib/python3.10/dist-packages (from 
langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (0.6.7)\n", + "Requirement already satisfied: langchain<0.3.0,>=0.2.9 in /usr/local/lib/python3.10/dist-packages (from langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (0.2.11)\n", + "Requirement already satisfied: langsmith<0.2.0,>=0.1.0 in /usr/local/lib/python3.10/dist-packages (from langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (0.1.93)\n", + "Requirement already satisfied: tenacity!=8.4.0,<9.0.0,>=8.1.0 in /usr/local/lib/python3.10/dist-packages (from langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (8.4.2)\n", + "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->cloud-sql-python-connector[pg8000]) (1.3.1)\n", + "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->cloud-sql-python-connector[pg8000]) (23.2.0)\n", + "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->cloud-sql-python-connector[pg8000]) (1.4.1)\n", + "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->cloud-sql-python-connector[pg8000]) (6.0.5)\n", + "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->cloud-sql-python-connector[pg8000]) (1.9.4)\n", + "Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->cloud-sql-python-connector[pg8000]) (4.0.3)\n", + "Requirement already satisfied: jsonpatch<2.0,>=1.33 in /usr/local/lib/python3.10/dist-packages (from langchain-core<1.0.0,>=0.1.1->langchain-google-cloud-sql-pg) (1.33)\n", + "Requirement already satisfied: packaging<25,>=23.2 in /usr/local/lib/python3.10/dist-packages (from langchain-core<1.0.0,>=0.1.1->langchain-google-cloud-sql-pg) (24.1)\n", + "Requirement already satisfied: pydantic<3,>=1 in 
/usr/local/lib/python3.10/dist-packages (from langchain-core<1.0.0,>=0.1.1->langchain-google-cloud-sql-pg) (1.10.17)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pg8000>=1.31.1->cloud-sql-python-connector[pg8000]) (2.8.2)\n", + "Requirement already satisfied: scramp>=1.4.5 in /usr/local/lib/python3.10/dist-packages (from pg8000>=1.31.1->cloud-sql-python-connector[pg8000]) (1.4.5)\n", + "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from Requests->cloud-sql-python-connector[pg8000]) (3.3.2)\n", + "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from Requests->cloud-sql-python-connector[pg8000]) (3.7)\n", + "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from Requests->cloud-sql-python-connector[pg8000]) (2.0.7)\n", + "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from Requests->cloud-sql-python-connector[pg8000]) (2024.6.2)\n", + "Requirement already satisfied: typing-extensions>=4.6.0 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy[asyncio]<3.0.0,>=2.0.25->langchain-google-cloud-sql-pg) (4.12.2)\n", + "Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy[asyncio]<3.0.0,>=2.0.25->langchain-google-cloud-sql-pg) (3.0.3)\n", + "Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-packages (from beautifulsoup4->google) (2.5)\n", + "Requirement already satisfied: pycparser in /usr/local/lib/python3.10/dist-packages (from cffi>=1.12->cryptography>=42.0.0->cloud-sql-python-connector[pg8000]) (2.22)\n", + "Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json<0.7,>=0.5.7->langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (3.21.3)\n", + "Requirement 
already satisfied: typing-inspect<1,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json<0.7,>=0.5.7->langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (0.9.0)\n", + "Requirement already satisfied: jsonpointer>=1.9 in /usr/local/lib/python3.10/dist-packages (from jsonpatch<2.0,>=1.33->langchain-core<1.0.0,>=0.1.1->langchain-google-cloud-sql-pg) (3.0.0)\n", + "Requirement already satisfied: langchain-text-splitters<0.3.0,>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from langchain<0.3.0,>=0.2.9->langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (0.2.2)\n", + "Requirement already satisfied: orjson<4.0.0,>=3.9.14 in /usr/local/lib/python3.10/dist-packages (from langsmith<0.2.0,>=0.1.0->langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (3.10.6)\n", + "Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from pyasn1-modules>=0.2.1->google-auth>=2.28.0->cloud-sql-python-connector[pg8000]) (0.6.0)\n", + "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pg8000>=1.31.1->cloud-sql-python-connector[pg8000]) (1.16.0)\n", + "Requirement already satisfied: asn1crypto>=1.5.1 in /usr/local/lib/python3.10/dist-packages (from scramp>=1.4.5->pg8000>=1.31.1->cloud-sql-python-connector[pg8000]) (1.5.1)\n", + "Requirement already satisfied: mypy-extensions>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (1.0.0)\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + " - Import required functions and libraries" + ], + "metadata": { + "id": "yZybYPPvaqcS" + } + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "FWqsMMdQbaKA", + "executionInfo": { + "status": "ok", + "timestamp": 1721926369825, + "user_tz": 300, + "elapsed": 1322, + "user": 
{ + "displayName": "", + "userId": "" + } + } + }, + "outputs": [], + "source": [ + "# Import base libraries\n", + "import os\n", + "import uuid\n", + "\n", + "from langchain_community.document_loaders import DirectoryLoader\n", + "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", + "from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings\n", + "\n", + "from langchain_google_cloud_sql_pg import PostgresEngine, PostgresVectorStore\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "f2FszLoLbaKB" + }, + "source": [ + "## Creating the Database Connection\n", + "\n", + "Let's now set up a connection to your Cloud SQL database:" + ] + }, + { + "cell_type": "code", + "source": [ + "%env ENVIRONMENT=development\n", + "%env PROJECT_ID=globant-gke-ai-resources\n", + "%env CLOUDSQL_INSTANCE_REGION=us-west1\n", + "%env CLOUDSQL_INSTANCE=rag-application-test\n", + "%env EMBEDDINGS_TABLE_NAME=kubernetes_docs\n", + "%env DB_USERNAME=main-user\n", + "%env DB_PASS=YOUR_DB_PASSWORD" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "DegY7bswdlSB", + "executionInfo": { + "status": "ok", + "timestamp": 1721926389134, + "user_tz": 300, + "elapsed": 338, + "user": { + "displayName": "", + "userId": "" + } + }, + "outputId": "ca5aa526-bace-469d-8808-162d9e934be2" + }, + "execution_count": 4, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "env: ENVIRONMENT=development\n", + "env: PROJECT_ID=globant-gke-ai-resources\n", + "env: CLOUDSQL_INSTANCE_REGION=us-west1\n", + "env: CLOUDSQL_INSTANCE=rag-application-test\n", + "env: EMBEDDINGS_TABLE_NAME=kubernetes_docs\n", + "env: DB_USERNAME=main-user\n", + "env: DB_PASS=YOUR_DB_PASSWORD\n" + ] + } + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "id": "rvK19kzwbaKB", + "executionInfo": { + "status": "ok", + "timestamp": 1721926402495, + "user_tz": 300, + "elapsed": 457, + "user": { + "displayName": 
"", + "userId": "" + } + } + }, + "outputs": [], + "source": [ + "ENVIRONMENT = os.environ.get(\"ENVIRONMENT\")\n", + "\n", + "GCP_PROJECT_ID = os.environ.get(\"PROJECT_ID\")\n", + "GCP_CLOUD_SQL_REGION = os.environ.get(\"CLOUDSQL_INSTANCE_REGION\")\n", + "GCP_CLOUD_SQL_INSTANCE = os.environ.get(\"CLOUDSQL_INSTANCE\")\n", + "\n", + "DB_NAME = os.environ.get(\"DB_NAME\", \"pgvector-database\")\n", + "VECTOR_EMBEDDINGS_TABLE_NAME = os.environ.get(\"EMBEDDINGS_TABLE_NAME\", \"\")\n", + "\n", + "try:\n", + " db_username_file = open(\"/etc/secret-volume/username\", \"r\")\n", + " DB_USER = db_username_file.read()\n", + " db_username_file.close()\n", + "\n", + " db_password_file = open(\"/etc/secret-volume/password\", \"r\")\n", + " DB_PASS = db_password_file.read()\n", + " db_password_file.close()\n", + "except:\n", + " DB_USER = os.environ.get(\"DB_USERNAME\", \"postgres\")\n", + " DB_PASS = os.environ.get(\"DB_PASS\", \"postgres\")\n", + "\n", + "\n", + "# Create Cloud SQL Postgres Engine\n", + "pg_engine = PostgresEngine.from_instance(\n", + " project_id=GCP_PROJECT_ID,\n", + " instance=GCP_CLOUD_SQL_INSTANCE,\n", + " region=GCP_CLOUD_SQL_REGION,\n", + " database=DB_NAME,\n", + " user=DB_USER,\n", + " password=DB_PASS,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "B7ti0MqBbaKB" + }, + "source": [ + "Next we'll setup some parameters for the dataset processing steps:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "_MydulMdbaKC", + "executionInfo": { + "status": "ok", + "timestamp": 1721926424908, + "user_tz": 300, + "elapsed": 3, + "user": { + "displayName": "", + "userId": "" + } + } + }, + "outputs": [], + "source": [ + "SENTENCE_TRANSFORMER_MODEL = \"intfloat/multilingual-e5-small\" # Transformer to use for converting text chunks to vector embeddings\n", + "\n", + "# the dataset has been pre-dowloaded to the GCS bucket as part of the notebook in the cell above. 
The loader reads it from the path below.\n", + "SHARED_DATASET_BASE_PATH = \"/data/kubernetes-docs/\"\n", + "\n", + "BATCH_SIZE = 100\n", + "CHUNK_SIZE = 1000 # size, in characters, of each text chunk converted to a vector embedding\n", + "CHUNK_OVERLAP = 10\n", + "VECTOR_DIMENSION = 384 # Embeddings size" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ov-myFJybaKC" + }, + "source": [ + "## Initialize Vector Store Table\n", + "\n", + "We are ready to begin. Let's first create the table that will store the vector embeddings:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "id": "0EzU4YhrbaKC", + "executionInfo": { + "status": "ok", + "timestamp": 1721926432388, + "user_tz": 300, + "elapsed": 2350, + "user": { + "displayName": "", + "userId": "" + } + } + }, + "outputs": [], + "source": [ + "pg_engine.init_vectorstore_table(\n", + "    VECTOR_EMBEDDINGS_TABLE_NAME,\n", + "    vector_size=VECTOR_DIMENSION,\n", + "    overwrite_existing=True, # Setting this to True recreates the table if it already exists.\n", + ")" + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Initialize Vector Store" + ], + "metadata": { + "id": "aIAiofJTj4Fh" + } + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "id": "oCbOnnBIbaKD", + "executionInfo": { + "status": "ok", + "timestamp": 1721926463546, + "user_tz": 300, + "elapsed": 14376, + "user": { + "displayName": "", + "userId": "" + } + }, + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "ed2702cf-ce04-4711-eaa7-5e644b5290ec" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "/usr/local/lib/python3.10/dist-packages/langchain_core/_api/deprecation.py:139: LangChainDeprecationWarning: The class `HuggingFaceEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 0.3.0. An updated version of the class exists in the langchain-huggingface package and should be used instead. 
To use it run `pip install -U langchain-huggingface` and import as `from langchain_huggingface import HuggingFaceEmbeddings`.\n", + " warn_deprecated(\n", + "/usr/local/lib/python3.10/dist-packages/sentence_transformers/cross_encoder/CrossEncoder.py:11: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)\n", + " from tqdm.autonotebook import tqdm, trange\n", + "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:89: UserWarning: \n", + "The secret `HF_TOKEN` does not exist in your Colab secrets.\n", + "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n", + "You will be able to reuse this secret in all of your notebooks.\n", + "Please note that authentication is recommended but still optional to access public models or datasets.\n", + " warnings.warn(\n", + "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. 
If you want to force a new download, use `force_download=True`.\n", + " warnings.warn(\n" + ] + } + ], + "source": [ + "embeddings_service = HuggingFaceEmbeddings(model_name=SENTENCE_TRANSFORMER_MODEL)\n", + "vector_store = PostgresVectorStore.create_sync(\n", + "    engine=pg_engine,\n", + "    embedding_service=embeddings_service,\n", + "    table_name=VECTOR_EMBEDDINGS_TABLE_NAME,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fYoXfeYPbaKD" + }, + "source": [ + "## Ingest PDF docs into CloudSQL DB\n", + "\n", + "### Load and Split the Kubernetes docs" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "OzXCfwNAbaKD", + "executionInfo": { + "status": "ok", + "timestamp": 1721927166341, + "user_tz": 300, + "elapsed": 696702, + "user": { + "displayName": "", + "userId": "" + } + }, + "outputId": "d573387c-66df-423f-a000-750334de97a0" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "100%|██████████| 6/6 [11:36<00:00, 116.07s/it]\n" + ] + } + ], + "source": [ + "loader = DirectoryLoader(os.path.join(SHARED_DATASET_BASE_PATH, \"PDFs\"), glob=\"*.pdf\", show_progress=True)\n", + "documents = loader.load()" + ] + }, + { + "cell_type": "code", + "source": [ + "splitter = RecursiveCharacterTextSplitter(\n", + "    chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP, length_function=len\n", + ")\n", + "\n", + "splits = splitter.split_documents(documents)" + ], + "metadata": { + "id": "O7eBZG7wiWBa", + "executionInfo": { + "status": "ok", + "timestamp": 1721927196274, + "user_tz": 300, + "elapsed": 629, + "user": { + "displayName": "", + "userId": "" + } + } + }, + "execution_count": 10, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Add the splits to the vector store" + ], + "metadata": { + "id": "UwCj9x5Jl5iq" + } + }, + { + "cell_type": "code", + "source": [ + "ids = [str(uuid.uuid4()) for _ in 
range(len(splits))]\n", + "vector_store.add_documents(splits, ids)" + ], + "metadata": { + "collapsed": true, + "id": "Wqd3cKgntEYw", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "executionInfo": { + "status": "ok", + "timestamp": 1721929629134, + "user_tz": 300, + "elapsed": 2429339, + "user": { + "displayName": "", + "userId": "" + } + }, + "outputId": "f9329c53-49fa-488c-caf1-ae74e1db128c" + }, + "execution_count": 11, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "['3db34d89-aca6-4152-a2f5-09e26d932652',\n", + " '49e44105-7700-4f22-a82b-859bfdb52a6e',\n", + " 'f4413c9d-0a33-412e-a840-9eb0ef320bd1',\n", + " 'e390d513-74db-4f86-9383-083220440296',\n", + " '7bb7e45c-2f1e-403e-974d-ce3dd037e2d2',\n", + " '57f1f180-3ecb-4218-a17f-b8bfcbcbdc6c',\n", + " '8837f666-2568-40a8-b140-ed0632aa9086',\n", + " '4ef75735-dc5e-48e9-8f04-95d7c0a06e31',\n", + " 'c4ffa497-042a-452c-b3c0-3b1080893ea4',\n", + " '0fb1cc93-0753-4f34-96c4-0435dc0d06f8',\n", + " 'ddc41036-f30a-422b-98d4-48290a2e65c3',\n", + " '6a79fb07-7bdc-4b16-9fef-e3548b374395',\n", + " '5bca673d-d3c7-4f50-ae26-c10cd59ca3b4',\n", + " '1ff27c15-7e38-4f3d-b0bc-1cf873b35068',\n", + " 'aefd3ab2-118e-4143-9e4f-2c6a7fe70ded',\n", + " '4505f156-8813-463b-9283-f2e7835f1d82',\n", + " '94a88882-bd06-4f3d-bbde-2eef5522273d',\n", + " '7f199fbb-df58-45c6-a94f-4311bf2162b0',\n", + " '28fbeed7-22dd-407c-8c51-67ddbed64297',\n", + " '5a72d95d-fe3b-429f-b5df-5bab3d539e8f',\n", + " '8faccd3d-29bd-48c4-871f-161864e32bfb',\n", + " '4fc9ca3c-55ca-452d-bfee-4f7bbea33a65',\n", + " '038e2bb2-1548-4d3f-96cf-c2e28eb37271',\n", + " '87741718-2d75-47fc-962a-39b8864f7313',\n", + " '32ce4d84-2824-4829-96f0-62a74c041dc6',\n", + " '7b93e12a-6ac7-49aa-84e2-37db7b9e2acc',\n", + " '0692a730-8916-4e56-82d1-534755ab99df',\n", + " '8a828405-ceb6-4f39-9c58-468f308a7e3f',\n", + " 'c42b5921-f928-431a-bf57-81b57c913445',\n", + " 'c17d77a5-deba-40d4-9f79-16c72a80de53',\n", + " 
'e0aebbc7-dced-402e-95e9-5c5bfe718c4e',\n", + " 'd1b28e92-2c7f-47c0-b7bb-40cd3b419930',\n", + " '44ede58e-1991-4cd0-9279-dbf3e797b3cd',\n", + " '5910d756-b908-41f2-8429-1f0383f7aed9',\n", + " '0ad94869-5182-4e55-822e-b3b4ba505d98',\n", + " '9367fb31-a452-49a7-9a5f-1f0b06bf3022',\n", + " 'e2b4d3d2-36b5-47ad-be8b-a204ae862dfd',\n", + " '8a934962-b0ee-4285-b3a1-533e031f8c15',\n", + " '7e9d4fda-9d03-4143-9c8a-6e45d5ef2ed1',\n", + " 'ca03803a-2cdb-4c46-8b03-45e5b2449112',\n", + " '0837a075-c16d-4f4d-bdca-8087dae465ec',\n", + " '86468887-1619-499b-90ed-f8e15beee562',\n", + " '5f0ebbd3-8929-4e3f-9450-0b17427422b5',\n", + " '6b4062fe-a002-4181-a940-0a2b13cb4fb3',\n", + " 'ab4c5b3c-966f-4507-8086-55a322eec131',\n", + " 'ff59064f-d7c6-458d-bd30-de6b9093c656',\n", + " '17e540dc-0bb4-4dea-aee1-4e078dcfec05',\n", + " '8aded9a5-5ec6-4645-82db-51bf5a5db309',\n", + " '17b29a9d-a86f-48ff-b461-ba100386bfb4',\n", + " '73aa6ca9-dacf-4ea5-b1c5-554967a07e75',\n", + " 'fd4f946a-f55f-477f-9e7a-56ca78759b1b',\n", + " 'eb3bce43-d6cc-462b-ab3d-fbb33271b031',\n", + " '2bb8a0d9-e581-4c77-8875-d9e12fd8fba9',\n", + " 'bd7c4184-8f33-42cd-8aa3-1216a406b71d',\n", + " '730a1686-fd87-4161-a5e7-e0bb82e6dee5',\n", + " '67a91b4a-07ed-4ceb-aad9-5b12b78021ec',\n", + " '1219fccd-5f0f-40e3-9a52-0604f73342a1',\n", + " 'c3ca859a-37d2-47ce-a08a-53633cb4d497',\n", + " '2d56acba-00f9-4fac-8558-2b2b609afdf6',\n", + " '78afbdbe-9f04-4440-b378-533df8603029',\n", + " 'd8d1ed69-19c8-4698-9426-4139fb66ff9d',\n", + " '7ac99682-6645-4b52-b672-9c1c85565c67',\n", + " 'f0e6ee78-ee30-45f0-9ac3-60d520ac1343',\n", + " 'da8b823e-3517-4feb-b6ef-104a7fc26e6b',\n", + " 'b08b6790-0ad9-434a-8c99-ff2a6ce2c09f',\n", + " 'c6cba2ac-2e31-4d27-910c-8660e8c39f0f',\n", + " '569217b7-6b87-4a83-ac4a-02d54cfa99f4',\n", + " 'f8591d9a-8e5f-4db4-add4-a97c7d3ca5b7',\n", + " 'b7b784d1-4179-46ad-96d8-37276263f9cf',\n", + " 'f1510f08-aa0e-4175-9931-5fef77fbdd35',\n", + " 'c576bc1e-d95c-4db7-a86e-9b539e78ae48',\n", + " 
'9b04c922-6535-4a26-b892-fa7c3f4208ff',\n", + " 'e40b5a25-59ca-428c-9f3b-021b15d229c3',\n", + " 'b24de065-6133-4559-8afc-ea169ae15786',\n", + " '9d87a8b3-9035-4a4a-98c0-48163b433a7d',\n", + " '73a245f2-7179-4155-bebf-963e8a0e8656',\n", + " '8ba8f0ba-1447-414f-a4ad-422559c0cb38',\n", + " '65ebbd99-59b3-4b81-8cd9-fce67e2e3c47',\n", + " '1f8db38a-b8f6-4177-a57d-1d73011644e4',\n", + " '09a63b30-30ab-40d8-a890-35ce0528812c',\n", + " 'b1c4fc19-dce4-453b-b3a6-661f4c3bdb06',\n", + " '6b5fbfe8-6891-4283-8ca2-574cee94bbea',\n", + " '64f5b87b-0ce9-4f75-9dd5-ec5b6f473dff',\n", + " '5dc27691-1881-4416-883a-80e17d6c73fc',\n", + " 'c0d06714-7d49-4bd0-98fb-981e6e5b9951',\n", + " '6acf7061-6d43-461f-92ae-a729f5bdff9c',\n", + " '224385b8-865e-4c08-9a8f-4853613235e3',\n", + " 'fa3b1eb3-29ae-4099-8f65-2a6d8221cc78',\n", + " '2ecb74ce-5e98-44e4-b96d-24b3f70b5f49',\n", + " 'e0b562f6-b906-4089-a6dc-5813e0bec8d4',\n", + " 'd139da86-1d67-41a2-8214-ca7502e8b896',\n", + " 'fbec69e0-820f-4daf-a9e8-06a820a91fe4',\n", + " 'f7988cf7-cac6-46ef-9386-21e406f74676',\n", + " '26bb80f6-5987-4aaa-94b4-297a650854f7',\n", + " '52476354-dddd-4220-9b13-500ae4611017',\n", + " 'b2f0fb85-ae02-42b4-9667-c7fb1c198ff3',\n", + " '2ff7c9bc-a5c0-423c-ba01-8205610bfc62',\n", + " '0f49526f-3585-4fd3-b62d-cdbf5c8ef46f',\n", + " 'cd632a33-3c05-435e-bad1-648a62ced981',\n", + " '9103a607-5872-4958-9739-9c279ffa4461',\n", + " '0b9662f9-1753-4059-a894-a562c85b9c75',\n", + " '92bd5f22-1e03-4080-9ee6-389536aec01a',\n", + " 'e76c0041-f38e-4535-a896-8dd791d64bf8',\n", + " 'd43f9425-9278-4e0a-a0ce-ab1ba56ded97',\n", + " '465f50ce-b24d-4933-9929-81e7a467d3bb',\n", + " 'fc681054-0bab-4989-b554-42ae442d87e8',\n", + " '5edcc791-eab8-444b-b50b-ce3973c0b852',\n", + " 'ec1789fe-bd63-4bd1-98ae-5ec4ab1cc48d',\n", + " 'd58c5575-cc05-4aac-90f6-92b8560e5a0f',\n", + " '7015850e-e559-4c03-b1be-7ef76e74c6a5',\n", + " '23b9f141-ebe8-4653-a607-df1baf83328a',\n", + " 'f43218f2-8987-4665-a79f-250c02fb508c',\n", + " 
'152c4470-ccdc-447e-98c7-cebc9b6b2707',\n", + " '51bfb5b2-7abb-4cbc-87e6-c65287f7f018',\n", + " '8774abc3-9298-41b3-8e3a-2e122874946f',\n", + " '07ff031a-e28f-45d3-9384-26c1c3c97b02',\n", + " '4ed8c3d9-9c4f-4d43-9fbb-f08cecd476f8',\n", + " '5bb0f859-69cf-4f9f-b403-89a58044c051',\n", + " 'c1972152-405f-4535-8fcc-068e90b2f708',\n", + " '27d396bd-1aeb-46f7-a929-9fcd6ec6760d',\n", + " '86cecf51-4666-4db8-8936-6acc21c32353',\n", + " '72557adf-54be-4477-9771-1018742b650e',\n", + " 'ceb039b2-ef40-4799-a3d6-c8b33917d40b',\n", + " '9e10be19-96d2-4515-9265-c1a006635d70',\n", + " '2ae35a5b-3a22-49ed-baec-bc9e172ecb77',\n", + " '24b58a38-255b-496c-9842-55ac187d42e5',\n", + " '4bd912a3-40eb-4f75-8d2f-66211302b5e7',\n", + " '975be429-a1e6-419c-bcda-d09d3d54b486',\n", + " '362e1724-54cd-44f1-9648-7a3117f7a212',\n", + " '77e56dd8-8808-4983-9eb6-38cb2e7a713e',\n", + " '13ccc2d0-1102-4325-af83-70b1e6d3cce1',\n", + " '29fed437-d1c6-4ba1-8480-9b9e0dcf56b4',\n", + " 'fcbb2ddf-5106-4589-8863-6e248e1af2a3',\n", + " '8db56e12-db9e-4ab7-86d5-5c10ddf2334f',\n", + " '69681213-c61a-4e55-b8ab-ae16a8dca470',\n", + " 'd800bd75-f41f-40bf-9e8d-7c351945c379',\n", + " '77170ba9-1b98-486d-868a-7aa4db8d9282',\n", + " '83bf900e-fd16-48ba-bef3-4baf0c22038e',\n", + " '32336641-f959-42c2-8bf9-f607d96dac87',\n", + " '8bd5af49-4a0e-4577-9244-14b838df477a',\n", + " 'a3b72fe5-02ce-4eb1-9530-580459989b02',\n", + " '426f9bd7-f169-407a-abc0-812645716a1e',\n", + " '21606785-b3cd-4cb5-8d8e-5f1a79deebcd',\n", + " '8af618a2-66c1-4723-9e2a-291602f5527a',\n", + " 'c7e19b4a-ed3b-4043-8916-4301ffcd9303',\n", + " '910887ae-d5bf-4838-a01d-483314af8515',\n", + " 'd0136d14-a72e-48c3-b13b-810d17dc608f',\n", + " 'ac3ec456-315c-4412-8faa-69de6c996db3',\n", + " '34fcc8d4-a757-42ae-bfc1-e4afe47388c3',\n", + " '3a1b2b45-f637-42b0-ae24-aed5191ad27b',\n", + " 'f49de7d5-d881-49de-93d8-47198c420727',\n", + " '434d40d1-b66b-489c-a341-295f9362b42f',\n", + " 'd3255b1c-5122-4bab-b700-5068f90e78a7',\n", + " 
'b4270786-fb2c-44c1-9367-2ca88243473b',\n", + " 'fb140488-03e1-470b-ab4b-7665c13cbb38',\n", + " 'ac1e6ef3-2843-4b56-922f-5c0ba51c33cb',\n", + " '95f95110-0c08-4a83-a2f3-d5eaf71419fc',\n", + " '84324652-89d2-49a9-82f0-9cfe15d711eb',\n", + " 'c27e8fce-546e-42ff-b958-f733f2c68264',\n", + " '30f0074c-5ecd-4023-9733-1b0ebfebfe41',\n", + " '4a63f423-09b4-4e45-9cfe-4590fe76160b',\n", + " 'bb3284e9-1f51-45f0-bb1e-82a02b83b551',\n", + " 'b94a4920-8e45-4bf2-b243-c8f711b9135d',\n", + " '0b5972ec-74bb-466a-998f-1ddaf007f870',\n", + " '663eff39-ba81-4eef-90f6-b4786409823c',\n", + " 'be20c233-a8a7-4f76-8ece-568ed6d36f8a',\n", + " '1f6d44a6-4379-489a-aba6-d66fda63a4c6',\n", + " '18c0f89f-6742-47ed-9b01-8e92647d3062',\n", + " 'b1f88ed7-f6f7-4142-8c82-b1f9a38dc135',\n", + " '65c78a38-3df8-452f-977b-ea21b6483bc5',\n", + " '84e08cd5-5d6f-4307-9016-c3fabcc49d7e',\n", + " 'd5c00eda-eb0c-48cf-8bec-479f15b3be6c',\n", + " '6472803c-d545-47e8-99af-5c1230494767',\n", + " '92dfd378-1170-427c-9227-915f0b051d0d',\n", + " '71bf82b4-9b02-4a6a-85e3-8c6655de5a2e',\n", + " '91d96761-a835-43c9-a961-98c16f1a0351',\n", + " '8127eff4-bb13-42a4-a7ac-cbef799c1239',\n", + " '76f2c27d-4748-4172-82d5-16ede162c879',\n", + " '035e9ec6-e624-4ae8-b74f-3c73ce67f7d3',\n", + " '84dfe6f1-f629-49de-9cc9-bcfa563fa420',\n", + " 'd452b1bb-2b9e-4c19-a032-612a546d0cbd',\n", + " '6835d974-ea34-4d7c-8641-c1859fc2e3c0',\n", + " '253f3ebc-8919-4fa2-adb9-da2a6d38129a',\n", + " 'e9670b83-d4b9-4b87-b456-0c1c96525e0a',\n", + " '067ab811-4991-4e10-a557-028212c93495',\n", + " '07fd09d3-c36b-481b-a61a-7171b8d6056f',\n", + " 'db1edbb9-24a6-41d1-9181-1ed0f93b98b7',\n", + " '30648368-f131-4979-9643-06ac50e88f02',\n", + " 'e95afca4-cdea-4d17-b966-f33810cb536d',\n", + " 'c69f9364-8231-46ec-ac15-5cada18ee337',\n", + " 'a18d04bd-0b51-49d1-9419-145aea24bc92',\n", + " 'd362722e-18a0-45cf-9aad-1cc799a98c32',\n", + " 'ba79a6cc-5ac1-48d5-b97b-049fe4ebe4f8',\n", + " '881fd8fc-9de9-4282-ae25-c96706fd8f1d',\n", + " 
'5aad5297-e825-44c3-a50b-0e797d2b0cc2',\n", + " '86685d71-4266-47aa-84de-8ebc389d6904',\n", + " '55948065-339d-4820-8531-bd57f812630e',\n", + " '5d002921-075d-4e30-abb9-e2ade8dc8468',\n", + " '8ea2dce9-fc53-4e86-ac3a-7d5cf15bd21d',\n", + " 'd1dd81bf-09b3-4709-b8a8-8fa19782479f',\n", + " '717c96cf-d77f-40dd-b5b7-c9dc987838a0',\n", + " '117fe04d-961a-4e21-903a-e5418dc4e19a',\n", + " '42f18c1e-754a-4a28-8a0b-c629d012401f',\n", + " 'd322d4a0-47df-4f8f-a29d-80e5e719f2ed',\n", + " '8241bdfc-158f-427b-8e99-c08d2760e271',\n", + " '60c6fa1c-06aa-47d9-a951-2636bec5a403',\n", + " '2a759188-2952-44dc-a163-649132d895fd',\n", + " '1050bca6-452c-40d9-a5e0-e360ccff175c',\n", + " 'a2fa6ae5-81f1-4b45-9a32-9128eade7c15',\n", + " '0f574c62-928e-403c-90b2-cd3f2c939bdc',\n", + " 'be562a1b-ab04-4e5d-9557-293347e43c12',\n", + " '3c25634b-661a-435e-983d-3e5ecb41bfa3',\n", + " '09a0eb3f-c84a-47d9-bd75-bd61cdd3cea0',\n", + " 'ee8d23cc-ae7f-4d3d-a336-643ed69eadb9',\n", + " '226e028c-e309-4cc0-9682-bc05b1b1ed4a',\n", + " 'ff752de1-e908-4863-bf76-216b98b373fe',\n", + " '9c7dd00f-0ecb-46c7-9ced-4a2ed67f5f69',\n", + " '73e8d100-627b-445d-bc0a-561bd8ce5cd3',\n", + " 'da9565a5-6639-4019-8d3a-05f6a5d95f88',\n", + " 'cdc8f011-cb59-4464-a4da-8d5167583ccc',\n", + " '8f244e5d-9d30-48d8-bdde-029dd63370ae',\n", + " 'dada795c-64fb-4c20-9ad2-11f5fc5ef88b',\n", + " '5a4f6fce-de98-4755-84e5-5d589376c240',\n", + " '5851c4b2-3b0d-4161-9764-fcbc499d1f9c',\n", + " '48243b41-4579-42cc-9a8f-62d2f16598fd',\n", + " 'ef6e1c77-5432-42f6-a982-6055416ef761',\n", + " 'c7235f4b-b1fc-4563-9ab7-f9bf2c92924d',\n", + " '5404338b-b8dd-4fb4-a085-7d5a5ffa52f9',\n", + " 'aa600bdb-aa66-4f1f-bbc6-2b64b13219e5',\n", + " 'ab6a89f0-6702-4b3d-a56b-c78ce6dbc0e9',\n", + " '42748883-1b19-4d77-95ec-2e8fe13585a2',\n", + " 'a5cfc2cc-a924-4e01-9485-8533a07f21f9',\n", + " '97f33ebf-286c-423a-a9bc-8162be8a18cb',\n", + " '8174caa3-a248-40e1-abf5-8170116905f6',\n", + " '4ed49607-5bbe-480c-8e3c-3008f0911127',\n", + " 
'0bdad8bb-673c-4aa4-99b7-fe90c22c2642',\n", + " '74841779-a1f1-48d8-8286-e6d7ca0eb795',\n", + " 'bf4aae38-7941-47e7-9b68-8da3aaad6d0a',\n", + " 'dd578797-7b3d-41c1-abab-1c38c7acb0d8',\n", + " '0803ccdf-36b3-432d-9cfa-57641aaa442d',\n", + " 'ef8bcd82-1d43-4bed-a757-262693d8dc75',\n", + " '91ee51c8-ae2c-4cfa-b84e-63f9f81027f1',\n", + " 'a9965172-9c44-4224-983a-da55feda760d',\n", + " 'c9fd9608-c056-44fb-9e75-d0f2e1770e43',\n", + " 'ec350aa2-daab-4c95-9973-bcbfe48fac8f',\n", + " 'd92dd3af-2bcf-461a-8bf6-d1b062d49921',\n", + " 'bba04268-814f-497f-acb7-ddb4bc603d42',\n", + " '57a4ea03-9f46-4f61-933b-2cbf637347ee',\n", + " 'a8ae6835-30e3-4665-a5eb-24d08fef18e0',\n", + " '77168818-b019-4fe0-997c-8d24c9ce2270',\n", + " 'cb0cc795-4de2-4ad7-8b88-f1d1f7a6e92c',\n", + " 'e0165924-abd3-47c2-85be-309a8dcd22d9',\n", + " 'c854be7c-dccd-45e6-b7b2-89e67f154cc8',\n", + " '139b719d-8b10-49d3-8114-0d46d2d6f989',\n", + " '570d24d2-622c-40a9-b574-f196fecbbd33',\n", + " 'e66b599d-1e57-4970-8a1a-318ef4bfc520',\n", + " 'a3ed9dd7-1719-4e3f-b6a3-4f5f703d7a09',\n", + " '35859690-2350-4b9a-929c-3700f47df8f0',\n", + " '2ae03851-44c4-42aa-bf5b-a39ec6ea6723',\n", + " 'd3ed80ac-8dd2-42f7-99e2-3edd7eb9b146',\n", + " '8b49bd54-6534-4046-ae4b-3d1e61020315',\n", + " '88f52c68-b7d9-4671-a9a3-d1eb3e20383f',\n", + " '476cee18-87f4-4b66-866b-0dee263fdea1',\n", + " '7a188330-6d7f-452f-965f-7ba804c44548',\n", + " '6de4d082-d3f5-4547-964d-ca38e4acf7bd',\n", + " '1b8b5dd3-cb23-41f8-bb0f-ade1d8e3d780',\n", + " 'f6d35795-89f7-4311-b5dd-6302d78850ab',\n", + " '4667d952-f112-46cb-9de1-336becadbdab',\n", + " 'af65d0d5-985b-4155-8f1e-98c207a51c83',\n", + " '2096ab58-401c-49a2-8bdb-59e37f8e6178',\n", + " 'd2730620-017a-4b1d-b414-66412da91021',\n", + " '2013bf46-bdad-4507-adf4-14934962c268',\n", + " 'a1f36607-cf78-4a20-be77-efe9215bf466',\n", + " '500926fc-4961-4e65-96ef-672eb90ecc2d',\n", + " '551647c0-52b0-4bfc-8481-f8a4d11a8eae',\n", + " '3a14cdb9-ab74-4ebd-a327-687b0a40d923',\n", + " 
'68f9b570-3406-4f6b-b3b2-890fc6c1ea9c',\n", + " '3111c6b6-ac8d-4c4c-af46-774e455a4d69',\n", + " '0e25ffdf-b3f4-49ee-81cb-7bdc8e51360a',\n", + " '2efd2caf-3169-4cb9-aac4-9468674ad8c3',\n", + " '31e9412e-296e-4124-9563-4357ecad6397',\n", + " 'c94eab1e-d6aa-436a-8375-92abde8c4ccb',\n", + " '9095fcd5-de45-4246-8676-dcf0ae654b4c',\n", + " 'd5fe20d3-c122-4d89-b5f9-410d16091bcf',\n", + " '44c0f47e-2fee-4828-8079-8e8018497707',\n", + " 'e7c3b40d-ca2f-4bb4-bbc9-a7df19a9e9bf',\n", + " 'c630a568-907b-48b4-9578-63413fd21371',\n", + " 'b2cc2687-1def-4479-b1ff-722f26a50f6a',\n", + " '41e18e9d-f200-4be1-9a91-10d0dfab481b',\n", + " '623760fb-e805-45d3-92bc-d25371e3ea08',\n", + " '02a926c0-9f3c-4a53-bb7a-40339cacafda',\n", + " 'ed61d459-7cc2-4d6a-ad9b-e81a9121355b',\n", + " 'f7afe408-4eb2-4f55-be78-57b1925a95d7',\n", + " 'cdccc379-58df-4609-9c06-0e92ee13658d',\n", + " '3b9af532-fe53-4f0c-b872-2d6d5af8243e',\n", + " '3f8bd8be-6167-444a-b3ed-081bb7d2d00b',\n", + " 'd8e248d8-f471-4411-9bad-bcc2b04d654e',\n", + " '5ee5ac01-44cd-4e8f-946a-33dd8ebabead',\n", + " '673122c6-3de7-426d-ac63-b74c303c25d7',\n", + " 'e74a0950-a71a-4382-a3b7-2210d898d209',\n", + " 'fe91e5c6-89a6-49b8-9289-df8b905bef66',\n", + " 'f684fbfc-3615-4c02-8f88-70eb29591716',\n", + " '567791a3-3234-4645-b51e-b1b37f357631',\n", + " '1711b6e1-6a1f-45dc-a935-61465dcf9b88',\n", + " '57e255d6-57d5-46ed-a739-0c16e14e9b82',\n", + " 'e5d3ce27-62a2-4f41-a9c5-3188b8f0a4cf',\n", + " '8b6d279e-b1e6-4a77-a051-a0cf0454943a',\n", + " '73f2a6d7-1548-44f8-a0fc-0279986388b6',\n", + " 'cea502b0-42cf-4ca9-b2f4-76da27e74008',\n", + " 'c74e4345-e7d2-44cb-b7f8-0a49d30554c9',\n", + " '4ad94b6f-d86c-4f11-92ff-874d86292a7a',\n", + " 'de3ad667-e1d7-4997-94f8-6fdc87d00068',\n", + " 'fad76f30-ef88-4ff0-a4a6-e70da8e985d7',\n", + " 'cad2a64b-8b5a-42cd-933b-4c5cc2aa8c01',\n", + " '8ca165ed-e146-41ac-8e46-2efcbb1bfde5',\n", + " '0c795a67-2df5-4d03-a618-6e556e73c3b3',\n", + " 'bf276283-744f-43f6-9f9f-a9bbe2554870',\n", + " 
'8701e162-2001-43a5-85cd-2823ab659a65',\n", + " '0e00dd81-bdf9-43c0-9ddc-15cd024d5bf3',\n", + " '714f8489-184d-4705-9ebc-f79870b344e0',\n", + " '5e9fcb14-1055-42de-b8bf-93f4c1531136',\n", + " '5d97e18d-fb84-47f6-b45d-401cd2549592',\n", + " '0742514e-ce3d-4777-a387-66da370a453c',\n", + " '92e9f0d8-5854-4116-b92f-7ad88654e6b5',\n", + " 'ff734a8c-47e1-4fdd-bab3-2090d52cc7d0',\n", + " '6bb58dc4-10b3-4f13-95dc-2bc00a689dcf',\n", + " 'a664a630-1df1-4ebb-b28c-92a5fd911056',\n", + " '2115a439-ba6e-47ee-931d-d869262bb61c',\n", + " '029a9ce4-5c82-4cb0-8df5-0fd799c25135',\n", + " 'a0aa26eb-071f-457c-8690-af85e2c7d28c',\n", + " '273690b2-9a7b-4bf4-a48f-40f8c033abea',\n", + " '4f76152f-2757-4209-910c-65799b1256f6',\n", + " '36e1f311-7df0-49ee-9947-449633455cbd',\n", + " '3843c760-5e6c-48d0-8531-32cc86fcdb2f',\n", + " 'a8fe7ff3-4343-40b8-8a64-a2e4c56700fc',\n", + " '0c9976e3-257b-4dac-9b8e-3fcd505b8674',\n", + " '479c1963-4842-4a58-9875-34c97f480a28',\n", + " '4a221fde-51ec-49c0-bef9-27e03c754374',\n", + " '65672c69-e117-45ea-ac13-12740a4ba6f1',\n", + " '58f3ae00-ba7c-47cf-886a-b85a63392a12',\n", + " '0282547d-229b-4d6b-9111-2f6160218f2f',\n", + " '68eac83c-2f58-498b-ba5b-24f1bc3168e4',\n", + " '35ac648c-11ed-42ea-99db-838c27db26cb',\n", + " 'd0bf8079-0e30-454c-afb3-f2213545c654',\n", + " 'db01a9f0-7769-4098-8073-fce0c12f5625',\n", + " '43168152-b75d-432e-a80a-4c36684ac1c7',\n", + " '96e4cdb8-fd25-42f5-873d-3ba7c0921e10',\n", + " 'f4a7a915-adfd-4410-b2e6-96440aacfa2b',\n", + " '0c43b7c8-e0fd-429a-a1cc-24b669cd3cf0',\n", + " '86f933d8-a6d9-43de-8f78-1d5b21cf8a2a',\n", + " '7e46ff22-77dd-43e4-b107-d3d7f223752c',\n", + " 'fd3e9c37-1869-417a-a985-42b6a73ad90c',\n", + " '51be7d75-f086-4998-97f7-787e3d820aad',\n", + " '07eb4c04-3008-4dfc-92ff-208fb109566a',\n", + " '9728cb92-131a-4a20-8853-4431f1acb6f8',\n", + " '87fd6fcb-df62-4984-8f77-205f193b7f9c',\n", + " '9000e823-a05d-4795-ac6b-36e09eedf3a8',\n", + " '32e400b6-2db1-4dc9-9037-769e073c66cf',\n", + " 
'cf242948-3ac5-414f-9ee8-270720686540',\n", + " '831e0987-32f8-4957-aefc-b1a333748025',\n", + " '8ebcb64c-d730-4ba1-8f38-5e8db8b5076f',\n", + " 'd99fa1b6-fc58-46d7-a8b1-62e5628a5be7',\n", + " '97ec9a97-e8f4-43ac-8b44-f5ab5e68ac52',\n", + " '57962607-be4f-48a4-9156-1e5b461d5f28',\n", + " 'd5ec4136-ef24-4014-a803-6b0a358f1e9a',\n", + " '36e5351c-0e0b-4de8-b129-0693b3369265',\n", + " '7c74d449-ee03-4ace-aa52-5a1cfa3f53dd',\n", + " '90e5c0e1-b50f-45b0-8a9e-f528756bdbf5',\n", + " 'a3f714f4-6ecc-4c27-a360-2322f0444023',\n", + " 'ffaacc3e-dabb-4fc0-acea-20c4f87e8930',\n", + " '62348591-61e2-4b54-bb90-9d84dc721b79',\n", + " '1dc6e427-720b-4464-b210-cd22f8c8ac80',\n", + " '66222d6b-e707-4eef-8ab1-e4195346a6ad',\n", + " '8074010c-fd43-45d6-be36-54dae4b17401',\n", + " '250283a6-f5dd-437f-92f2-15f82afe6748',\n", + " '1134d59a-140e-4105-92eb-0f3a14651211',\n", + " '1b39180d-dc58-49c0-8f3d-b40aaf65f58f',\n", + " '4aa9ef6d-8851-4851-a279-5b23057ff043',\n", + " '99bf1403-44ec-4ec2-837f-733f8818e35a',\n", + " '40762f8a-2ef4-4554-a9fb-5256e1bde1d3',\n", + " '708482b3-a2d3-42d6-b6d1-57fa5418d6e1',\n", + " 'f980d946-c818-4e76-85f4-fc2243f53dfe',\n", + " '1e2692f8-3417-467b-ab43-fa807dfd1ec4',\n", + " 'f4a0a083-80ea-4cc2-b556-44af1d379fc5',\n", + " '0a2ab8f3-1c4f-477c-935a-27e8a42b6b20',\n", + " '240aa451-681d-43b6-8f2d-f418ce15d42e',\n", + " 'fb42ef24-02ae-4c4b-8031-3d63acee2b1f',\n", + " 'ccd0d91c-e15d-437e-803c-1e0d17faa0c6',\n", + " '9c80ee0b-9d41-468a-a342-55ee4f94dc2e',\n", + " 'a447e844-d2ea-4463-ae89-2b22ad58c359',\n", + " '60fe79cb-0096-4ffb-89c0-ddcbf1b854a5',\n", + " '5a4fc59c-631e-4854-a340-ce54ec2b2196',\n", + " '0b0c25c9-07fd-45de-b48d-a3d387fe9737',\n", + " 'a08f4e60-f3dc-49cd-85ba-7c0d71b498b1',\n", + " '966bbd96-7409-4e45-9361-bd9f9c2e9d00',\n", + " '65c38010-e855-44be-8d2d-25528731d6b5',\n", + " 'a8ed3c92-e2db-4c7a-997f-53f13dadf9ea',\n", + " '0babdf0d-7269-4f7a-a8b2-c0028a122688',\n", + " '92014bf4-383b-4831-8318-d95a8ad15d95',\n", + " 
'e6d5c901-5ebb-4491-93b3-8481afc7d07e',\n", + " 'e8694ba1-5500-4a75-afdc-7762f78728c0',\n", + " '4030b103-a29b-45be-9b74-946ad30c0b9a',\n", + " 'ab96a8a6-f332-4cf8-8a59-f315606f15ac',\n", + " '00c3b07d-33b2-401b-a52e-11198937a743',\n", + " '367babca-9588-41e9-a41d-c63c3ff6ad2e',\n", + " '0b4d6dd3-0042-42ac-be71-80d4eaf239db',\n", + " 'db880a24-3a77-431e-ae65-b6a1a73c3052',\n", + " '78307532-afb3-4a38-bea7-6137dfb40a96',\n", + " '641d6d57-6a7a-498b-82f1-37e6e10a983d',\n", + " '61c12b8d-c552-4a61-a391-8e194796d63f',\n", + " '6488c41a-96cf-4242-b507-3ccc8fc09399',\n", + " '9562212c-d353-47f8-88c0-6d2ba3799434',\n", + " '05f46044-88c0-4e3c-868f-10dd81f55728',\n", + " '8f178780-b530-4432-b023-94641ae9e9b2',\n", + " 'c002a3e8-ec80-459c-99c9-390028d649f1',\n", + " 'b9f53a43-db6e-4871-85af-bb344cec62e6',\n", + " '3fb343a7-c607-4ffe-8d3b-d5f14818598c',\n", + " 'f3a098c7-7981-4f97-86ae-07794d1f8930',\n", + " '96b41930-ead2-4470-90f8-967b69e698d0',\n", + " '339bb1a2-82f7-4fad-b211-de6f82b6d4b8',\n", + " '0990e464-32a5-4bf5-9d26-13eda9bb2b26',\n", + " 'c13ccca4-79a7-4af7-b47a-0d67272aa83a',\n", + " 'b66d3ac6-d6a2-4045-9b38-77abf1c303c2',\n", + " 'e8c013e0-a96b-4caf-8734-4d4cf21c7508',\n", + " 'af44cf42-a0b3-4b62-82f6-c9b07fac4517',\n", + " '74c619a7-505b-418c-a03d-125b3c25a380',\n", + " '9d303736-c614-4418-ab07-cc217ff17a10',\n", + " '10f50143-6889-44f5-a105-534f66309a63',\n", + " '3ca48707-01af-4291-b90d-1d0347dcff8c',\n", + " '46b0f976-cfcb-4345-b783-aca7911ef23c',\n", + " '39de5639-70d1-495a-848d-e46ea76888d8',\n", + " 'e179bb04-651d-4275-ab9a-705f5a0e3be6',\n", + " '3895c487-b408-4282-832f-8df4d6384bb7',\n", + " '6c0af2f8-91fe-46e8-9c48-eed3523f86d4',\n", + " '897988f8-4847-4b89-bb74-de34771c43d3',\n", + " '0a2196c1-ce2d-4f97-bc79-e4a7f578ad91',\n", + " '3361a1ef-c7b1-472f-a13e-163b70d294e6',\n", + " '8915f6d7-55b1-43ef-a195-bc9ec487d8a1',\n", + " 'e0573d76-d54e-4615-82f3-0fd524fffa47',\n", + " 'ed7dbd3a-9688-4cf8-b3e0-53dcf221d021',\n", + " 
'0c7ef6b6-77e7-447a-ad6f-1c25a8b64c96',\n", + " 'f744affa-b4ab-45dd-b248-92c74aa255ca',\n", + " '3433a2a0-c1dc-4224-95a4-858b6e417a35',\n", + " 'ce7f7daf-2c12-4969-86fe-50e452215c86',\n", + " 'df9f110b-6790-48c6-ad08-b01803c4ce22',\n", + " '252f4c98-56ce-4d5a-a947-0a24e5bff501',\n", + " 'ee09e3a8-8e16-4b7b-ab52-47c0c925cfe1',\n", + " 'd3447382-6c98-42a9-9246-e346ec7b60e4',\n", + " '9375b9bf-1a0d-4625-9962-fa45abf3bb01',\n", + " '6570680b-f8a3-42d0-93b9-31d53b1fadc4',\n", + " 'a93c74cf-c74d-4c3b-8d33-d35611508618',\n", + " '0a38602b-2428-48b6-841e-c539e17a2c95',\n", + " '6620319a-9314-4b45-bfbc-a448c16a53a9',\n", + " '8d6b7b01-053a-47f2-aed8-d6ef80a9bd40',\n", + " '840b3d7e-c232-4869-a701-71d0b16b2b5d',\n", + " '14445964-fa45-45c9-88ee-f58c930ee02d',\n", + " '935ab3ff-07d0-48d2-8551-24769caaa579',\n", + " '2fe39b5b-a876-4849-bdd9-edaf3538d9d1',\n", + " '6e315690-9c64-4cf1-ab6d-24f63764ce91',\n", + " 'd834139c-5ba9-4d53-93e0-a0fd307f030b',\n", + " '34522c38-9104-4a01-bd37-61f393024813',\n", + " 'cbdee4fb-6b5f-4d73-b922-5bbddececb9c',\n", + " 'f6efffd5-5c15-4316-964f-f6b862a17719',\n", + " '951ab049-fe3d-4e08-91df-cea71b0beee4',\n", + " 'f00f3e24-5b63-4091-a5da-a308c192e915',\n", + " '923ae62d-202a-441a-9689-3f98b691736a',\n", + " '78c40b05-fb7b-4c43-95d9-7d3c391d632d',\n", + " '1f535f8a-62bb-41b5-8385-1ab1d530c48c',\n", + " '6e695c9a-c0a4-4de7-bac4-a08c7b28c502',\n", + " 'd94a76c6-14fe-4eca-a384-a2b90bdf027e',\n", + " '8dec22a0-3395-4193-bb4a-a59991d9b30b',\n", + " '96c2c786-9f88-40bd-97ad-b3444640dc03',\n", + " '0172ef8f-c9a9-4dd7-8973-ae3a330887af',\n", + " 'ee6c04f6-4564-4482-bec4-8032f5319ddd',\n", + " 'fbec4552-7773-4b7b-ab81-fa95477df94e',\n", + " 'bfa44e7c-4a2f-4e68-b80b-e4be84aaa3a0',\n", + " '1f6f7610-35e5-41d0-9352-aa68a94115b4',\n", + " 'c7622834-d01b-497f-866f-1da3a9002668',\n", + " 'ed3e899e-8303-453b-870a-a6bb84913a06',\n", + " '13898033-08c4-45fc-9c11-2d3c58f84107',\n", + " '4b48e7db-adc5-4a98-af9d-192bb78fb755',\n", + " 
'b0d43a42-db9b-4846-ae14-09b846d42789',\n", + " '2ad46e57-8eb5-454f-926e-f3a0c019dc07',\n", + " '17453555-0af2-4f4d-bf2b-fd6e10d5f4c1',\n", + " 'd0fa48aa-887a-487f-a065-f95b433dbf1a',\n", + " '53b5f6b0-0930-4097-bee9-6b606bb4aa7f',\n", + " '2d75bee8-119e-4d51-8979-bd22f4e71438',\n", + " '6217b23b-e68f-40b7-847e-2708b496668b',\n", + " '4919a33b-5269-48e1-a33c-de2a814e4897',\n", + " '2e8c377e-90a6-416d-b3ec-f6711481003d',\n", + " '2a5b2374-18a3-46ac-94a7-9896c04b6d4c',\n", + " 'f00907c0-6d0b-4607-93ae-1ff09eeb50d8',\n", + " 'be053782-ed92-434a-a27e-f57722501b68',\n", + " '8caaeb4a-b194-4954-bc5f-8cab53bd5e17',\n", + " 'db981be3-5c60-4892-90f1-bdafc67e98c9',\n", + " '435344c0-cd84-4274-9719-899458d03ca1',\n", + " 'bdcd6406-e6df-4e37-a639-78f38a28c028',\n", + " '2a2f0144-9b59-43a9-bca6-f31091c20c36',\n", + " 'e8c8bd47-484e-459c-ae3c-79db2d2827d3',\n", + " '7ba5a8cd-2e13-448a-baab-04f5e6f01ba8',\n", + " '63346bc8-b6a3-4af9-add6-5f37cf719dc7',\n", + " '609ce304-2be2-449c-8d70-66a089cf07c2',\n", + " '9527e45c-bde3-4293-b166-932e234e2bc5',\n", + " '28167f57-73c8-4c31-813a-79bd57a4b8fe',\n", + " '39193d1d-99fa-42c0-90d1-e707febfc62e',\n", + " '7641fa21-3f96-4394-8b91-9c90472e022d',\n", + " 'edee3dba-4a9b-41b3-b2e7-379864c5436f',\n", + " 'f08f2e25-5dea-416f-92ee-0f583a694343',\n", + " '7b8e8f60-8760-4719-8230-f81f3b76ebcc',\n", + " 'f323b1b1-02e8-486f-91e5-f83e1eadee98',\n", + " '07a0ff51-fef2-45da-8190-61062eba547e',\n", + " '0ee1a280-c657-4f45-ad88-9a8338347200',\n", + " 'c476f072-f5a9-405b-8d76-a6121c0a8fcf',\n", + " '5579ef24-77b0-4440-a2fb-a227934ad349',\n", + " 'f491fb7a-252c-4980-a839-d88a4104fbf9',\n", + " 'a1fa5038-dfe7-4fa8-8033-af6c99d16045',\n", + " '10e4d4d9-d22b-47df-8233-ae24c5e055f0',\n", + " '55130b7c-386d-4226-8ae7-0f1d2aeeef79',\n", + " 'd0002d82-b3e6-4545-897e-2b3aa7e25535',\n", + " '99e43b60-34a2-46c5-9151-a16f440ca696',\n", + " '5481e12e-7605-4ec8-9421-dbe7c823499b',\n", + " '3c09e837-6984-4fc7-85f2-f28ed9df6bf0',\n", + " 
'c6e1192c-9751-451a-8736-f6916e1cdcf3',\n", + " 'de8c96f7-571a-4b05-acd6-edcf9730bfa3',\n", + " 'a4a4bd19-09dc-4969-804b-6cb4d3f3adec',\n", + " '0109886b-c8d0-41e7-8281-21e2e7f5f5ef',\n", + " 'fa32f043-64c3-40c3-99c0-519866a1d38f',\n", + " '177bcc1a-49d3-46e5-8f29-d51ad4591389',\n", + " '6c4cba78-eacc-4215-9956-1427d9495bfc',\n", + " 'ce9977f5-20a9-4694-a83c-428cab0744b1',\n", + " 'd5abdf9d-f0d6-4b1f-9233-d759ebc4a914',\n", + " 'ef2b181f-63c9-4035-a7fc-4b64172a9f14',\n", + " 'edff1eeb-521f-4286-9d11-fdfb40e2937e',\n", + " 'e158328f-1841-4c28-a27d-934cbb88db67',\n", + " '549027c4-544e-4e00-8425-44717563a267',\n", + " 'e8a63d37-8a39-4228-914a-0c5f945abaca',\n", + " 'd336bada-0ca3-4946-a193-82704133003c',\n", + " 'cb60e7c8-11c6-4809-866b-53c0b81879b2',\n", + " '5b17a50f-4bbd-4980-80a1-126d63a660dc',\n", + " 'a5d79469-d699-45a7-b9d6-378b45aae32d',\n", + " '75958b4c-aa0d-4ffa-8ec4-5054004c638e',\n", + " 'af2d8d74-607e-4ab2-ba7f-40af37c91742',\n", + " 'de79f705-a48a-462a-9fa7-29a0dc6f2ad7',\n", + " '8fade724-5ee3-4128-93ad-d054ac16a5c0',\n", + " '40af3eee-6308-4b35-bf3c-0d8eb59369fd',\n", + " '8589981a-4e18-4da4-b351-5348bfb5d151',\n", + " 'e188a21b-6f48-4201-a615-a6e1b8b66077',\n", + " '591af725-1a69-4424-92ba-ed9b3f364b1c',\n", + " 'af6ba993-f29d-4bb3-bb13-f5b708defdb7',\n", + " 'de02cff2-883c-471a-917c-5c188530c9f7',\n", + " '407ebe8c-7589-448e-9cc2-863537882ab0',\n", + " 'f82a4be9-e603-49ad-ae88-9f6fd5e23497',\n", + " '2fe3c4cd-d2a4-457b-855f-f33114a9ed3f',\n", + " '2e7c8ec1-fc25-4f5c-a5bf-08747dbe4d12',\n", + " '867dc5c6-0a69-4333-b24b-377fb4e6a207',\n", + " '14fa3e6b-0045-40ae-bb1b-4a4217891ccc',\n", + " 'e6daf0b5-1f54-48f5-a1cc-54a3a512c0c1',\n", + " '5c675d3d-29d9-4680-9122-28ef67309b98',\n", + " '8a69328e-c110-435c-b501-313d64184c19',\n", + " 'a5457b52-29ad-4a86-90eb-33034e4a5dc0',\n", + " '651c9843-36d9-4d78-803b-06a4cef7b2de',\n", + " '8353afc0-3b6f-4897-86c4-2284d38d51c4',\n", + " 'dc32fb26-f12c-4a9d-831b-6cb130c6ac32',\n", + " 
'836fe7b5-8a20-49a1-b7c9-7dec95cdd007',\n", + " 'f2ad653c-7f1a-483a-90d6-a8207d55ae45',\n", + " '734cd609-c1dd-4748-aa27-fae565ab8654',\n", + " '604dee59-7cb6-4ad8-a24e-195ca78e0dc4',\n", + " 'b2091d85-b736-40fb-8399-4a1c445cdac1',\n", + " '110de47e-15a1-4514-b2fc-5785f0b2572e',\n", + " '4ce2bb29-0244-4af0-940d-181207a9aa9e',\n", + " 'e8ef2ec1-7e39-40c3-8503-d13f18258a99',\n", + " '586555ff-33b1-4cc6-ae96-a48d03846cc4',\n", + " '56973462-c91f-48f6-9cc9-40f2edda21ed',\n", + " 'd6654378-fd75-4d46-bf42-6a2241edb2fe',\n", + " 'd9203ced-32ef-4621-82d0-017288feef62',\n", + " 'a0cf8ea4-9224-4aba-8ebc-ebd6d34ba469',\n", + " '43f994e2-9bec-45cd-90e0-46580f734450',\n", + " 'f2c5c0eb-8c94-43aa-b43e-453b2e7ac668',\n", + " '5309f954-59e9-42f2-8c1d-372f3ebd5ed7',\n", + " '63b0cb14-02ae-443e-826f-4cddbc47b6ec',\n", + " '2d7ae5e0-8758-4aed-af79-129d7ea4fe06',\n", + " 'c9f69582-163c-462d-9c27-a00f5fd29e43',\n", + " '303b2c1d-6281-4b38-9199-4ecfcea5ff68',\n", + " 'f791ce46-9784-4c41-b6bd-548bbbe5ec13',\n", + " '6d7c108f-f2f4-4137-8b1d-fc90efb65dce',\n", + " '534a8cb3-9158-404a-a717-aa99b9e65d25',\n", + " '1c6fa8b3-b0c9-49e1-b22a-75f0222493d1',\n", + " '6514bad0-bc1a-459f-b5f6-5528208680d9',\n", + " '07df8e33-5bff-4754-9ff6-157e8fe27225',\n", + " 'd0b087be-5349-45dd-8f53-7c7ca949e3f5',\n", + " 'cb035968-a933-48d4-b036-3179e3a32ed8',\n", + " '35a76ad5-b512-46cb-aa9f-e737f9df64b6',\n", + " '123f5626-5be6-4ec7-894b-efe7850c7c3d',\n", + " '7eb9c81b-686b-4616-b216-b00c85e6063b',\n", + " 'dc1748a4-3b1f-44a6-9824-a7b179bd39fd',\n", + " '272356b0-65ad-49b2-9c75-5f3c9f77aed0',\n", + " '17fb8e50-bc24-4a0e-9c1b-720ecdaac8ac',\n", + " 'eb94e4bb-8983-4637-aed4-579f47c8c1de',\n", + " '60e731ad-f6b1-4322-823a-423ae6a777ce',\n", + " 'f0a119c8-b8b7-4dee-bcc1-7a7cefe9fb60',\n", + " '1cb86cfc-5bac-47ab-ac8e-2a2542c61a66',\n", + " '9e0c24cd-3988-4762-9790-dac138dd7d09',\n", + " 'd0abbc7d-dca4-431a-9bc0-c6e256089d4f',\n", + " 'bcbf494b-f022-4855-bbd5-52fae9018497',\n", + " 
'fb0e65c4-b652-4151-b414-f8cb1794fc3a',\n", + " '48f28f62-b446-4ec0-89cd-cc6079bdeb45',\n", + " '4db25f19-4ea9-44a0-aebd-69d048dfafaf',\n", + " 'ffeba7e4-d03b-4b56-9a99-ed024fa1785e',\n", + " 'e9634d3a-d591-462e-8016-bbf72c7245f1',\n", + " 'eba2d25f-be37-4548-975b-db88ff963f75',\n", + " '8e284ba8-4d54-42ba-9116-4cfd41e09686',\n", + " '7865290c-bb9c-4cac-86e2-fefd6f403adb',\n", + " '0a91777f-ead5-4745-ad10-bfb1a87bae58',\n", + " '266bc935-8aae-49b9-b217-08edf5b6fa22',\n", + " '8246a64a-370a-4be8-a81c-45adfdb43237',\n", + " '6bd5d4ad-6198-45fe-8004-cdb2d9153afc',\n", + " '9813e05e-6340-4008-9635-5113fe8df2cb',\n", + " 'f409b403-4a3a-43a4-a779-aaee2c836421',\n", + " 'd5d954c0-9c12-456a-b556-349a57a7dab2',\n", + " 'aea501cf-77a6-431a-b64f-ae73ca78af2f',\n", + " '1ccf6bf2-890b-465e-b61d-a023b39e2c97',\n", + " '56fa6680-0605-4aee-931c-d9b71430f73b',\n", + " '5edad202-6ce4-45dd-94a6-2762420d8770',\n", + " '45082524-a802-495c-a9a8-31ccdbaa7eff',\n", + " 'aa0f11bb-0505-41a4-9799-a38db0d7ce3a',\n", + " 'c9b656bd-76e6-4df0-ae1e-d0d002f1eccb',\n", + " 'a6252a8a-5307-4fca-b0f9-cee47b04d8b8',\n", + " 'c5d78728-7318-45af-8126-f1a568cdbec1',\n", + " '533c9336-2b19-4a96-94c9-2d07cf207d55',\n", + " '6db414b0-8edb-4df3-8c08-b904c274b6b3',\n", + " 'a78c7b9e-de35-4099-be41-30393167de48',\n", + " '3ef64341-d588-48e6-822d-d7624418d2f7',\n", + " '3207665d-f221-4a81-8b8d-febfc2c91c11',\n", + " '8314875a-8589-4142-b0a7-dbabf3379dc7',\n", + " '78ce1102-93f3-43b9-9f4f-2b4bb812b55e',\n", + " 'bffc1f0f-c04a-4cd8-83ab-cde04b29163c',\n", + " 'd996a4a6-a14e-4301-874c-aaa94ddfe9fa',\n", + " '99307f98-4d2b-4ccb-bbf3-ca5c8b312a04',\n", + " '409ac5eb-2198-47e6-98a1-6e9b2e878f14',\n", + " '473841c7-a810-4398-ba21-b51aab6f8412',\n", + " 'e188b107-95f2-4db5-bc86-422485b89b92',\n", + " '2116f4ed-f964-4907-a570-d8ed2e905164',\n", + " 'bcaa926d-d135-46e1-b683-6b32fba235be',\n", + " '8492e4cd-5754-49aa-ac76-f7787b030b98',\n", + " '2246a77a-e642-4a3e-9535-a5333436f968',\n", + " 
'418cc633-1559-4ee7-8ecb-ede9ad66699c',\n", + " 'd71507a4-ab0b-46ed-a582-32ebf5534c69',\n", + " '93bceadd-36ad-499f-be07-3a16d154defe',\n", + " '09f89805-d7f2-4114-b824-375cbe5fe38d',\n", + " '22b2c5ec-b4f1-4dba-91e0-f86487a83f95',\n", + " '43df5b8c-be71-4553-bdf9-7b39d939ff9b',\n", + " '7f43c884-fc31-4c35-bff0-2cf5667657c6',\n", + " 'ca77cbc2-82a5-40b4-aabf-696b966637f3',\n", + " 'e8b00175-dd74-4737-9bc4-bf12179ae0be',\n", + " 'd524407a-5261-4305-b46a-1f45335ad9fb',\n", + " '444b6ed4-ffe1-4296-b368-b8b0badf97ca',\n", + " '1637b569-f7e0-437e-9d08-0091905215cf',\n", + " '380c3afc-8e27-4cda-ac11-9d795ba5d395',\n", + " '8518634f-dc92-48c4-a9e0-d16ad532c0fd',\n", + " '711fd6b0-69b5-4150-939a-30367f3535ab',\n", + " '99edce98-1c92-46e9-a33b-51a066d084c5',\n", + " '36be52f8-bfbb-4f34-88b1-725bdf57fbbc',\n", + " '2c2dbcce-36d5-4fc0-819d-a95f061c5ad5',\n", + " '77e3adea-258c-4bac-b172-ea974b3f7ca7',\n", + " 'a51c1801-30cc-4050-b99d-e3ca500b2fdc',\n", + " 'b0fb6b35-2e95-4e54-b698-1b7ef78cfd0d',\n", + " 'bd9a3fab-5e15-4d45-98b4-a0d79182ed12',\n", + " '9f62707d-6a99-49e1-884a-d29c04a63040',\n", + " '9cfc6809-3baf-40b6-be59-f08d1cfcfc71',\n", + " 'b61d9d14-84fa-4c2d-94c4-69d896666442',\n", + " '03a63bd3-c2a1-4664-a210-a9b909371168',\n", + " 'a8b1fde9-2605-486e-a96c-2bd4cfae6e73',\n", + " 'e71a6004-1de5-4fbc-8c3f-5fcf086f78d3',\n", + " '05a5bff5-cdba-4845-9ef5-c7cd66b51aaa',\n", + " 'bbec367c-e89f-4e34-a3dd-99c395ae3c8e',\n", + " '1090f3e9-76e9-4df6-b190-a47ec43ac4c8',\n", + " '7e5543af-9105-4012-bf09-70aa93b8e450',\n", + " 'fcc9be21-3c70-4bce-99ca-feb31ce44ed7',\n", + " '55e4e7da-f672-4751-9815-6e44b515166d',\n", + " '0d85cf02-fa80-45fd-8770-6fb76b793223',\n", + " '2ab8f36b-3024-4d48-b522-8e090cf9b25a',\n", + " 'ec7f007a-99c0-4fab-b6b3-270978db46de',\n", + " '2cb504dd-1553-4b97-89b4-1e818f6c7f3f',\n", + " 'f7e4a5ac-6a7b-421b-a448-095adaaa1f86',\n", + " '28e1ba96-6be9-4c2c-b861-67724c0ec16b',\n", + " '0c870217-0ec6-4d28-9a36-a62f7b48c6a8',\n", + " 
'd2137698-2e92-4ae7-872b-cbf2c3eb46d1',\n", + " 'db52c2bf-1800-442e-a90f-8460f427b03f',\n", + " 'f01bbba6-7ec5-4361-9a21-887b87f1d4b8',\n", + " '7481442e-fc63-47c1-b719-67b9a9c5459e',\n", + " '1201f799-2b78-42c9-a3b1-8b4b7f1e182c',\n", + " '20cf9003-92a5-40b7-9923-820c2eca4a18',\n", + " '822c5e93-4d71-4e4d-94cc-3877d7c37429',\n", + " 'd45db8b4-cdb9-43b4-9cb4-90c4ae8438d2',\n", + " '8370d2f5-b65c-4db2-b41c-a7526c905afb',\n", + " 'c92b0288-dc50-4f01-a40c-aa7adfe381cd',\n", + " '69b5a3d8-d2a6-471f-b39d-8a49fd777f9d',\n", + " '08f3cced-a484-4913-8178-9b1fd231dabb',\n", + " '51a1f27a-aec7-42c0-b8d4-0b51ddb90857',\n", + " 'ccdd4aed-aaa7-4cb5-bc53-2a075fbb3b7a',\n", + " '6aa8d4d8-54e0-4550-9458-0ced4eff3f0d',\n", + " '7fc8acc6-23ef-461b-83ee-0d6022c2308e',\n", + " '7368e364-db22-4243-9bcb-380559fdbe10',\n", + " 'f1e3aa5a-0640-4d2a-b83e-c2399746c7e7',\n", + " 'c5645587-38dc-4d12-941d-a6fed0eab93c',\n", + " 'b92a94d3-6bf6-41e5-9586-49268c6ff351',\n", + " '8645b32f-9209-4500-a122-61384d0e38f9',\n", + " '77f6084b-69d8-4b66-8609-e49f55fcccce',\n", + " 'ccb2aaef-4f52-4c24-8bd5-c43309e877e6',\n", + " 'b3b29cee-8d20-4b2e-98ed-2af9b4512b29',\n", + " 'cda036ca-ee20-4ea0-b8ab-0b0f2240260c',\n", + " 'b078dcca-250c-4023-9852-9fe2bd6846e7',\n", + " 'ea41eade-982b-46cf-8794-c77ded34b509',\n", + " '88815e62-1193-4e45-8ae9-7837421e1804',\n", + " '89d12db4-412f-4f13-9d3e-e2879c977e0d',\n", + " 'a5163ef1-dcd3-4eca-8ef1-0fa464d8cb8a',\n", + " '21e0f0cf-0ff3-44fd-9f81-bf0ee977d476',\n", + " '4bb717c3-b757-4389-a33e-adaa536074df',\n", + " '004609cb-4c8e-467e-b24f-e587976b1b0c',\n", + " '85938080-03b8-4bdd-9cec-38e720c8e738',\n", + " '53f5e221-b15d-4359-9ea7-cd68ab16a9fc',\n", + " 'cf2e5012-eeb4-4cbb-ae50-bb54dc3454cc',\n", + " 'a207403c-0f03-4352-846d-1ba51c56531c',\n", + " '28b4e4f5-fbb5-4fc1-8c47-c7cd8edbe751',\n", + " '45ccca48-4d1a-431f-966c-7a2fbf9256dc',\n", + " 'b6863937-8b94-4458-9cdd-d54f558a62cd',\n", + " '33a2ce3d-b82a-44fe-b0c7-743b0198604a',\n", + " 
'7cafc4f2-a554-4d46-b6a2-a9bfe4ff7ae1',\n", + " 'f474339c-2ccb-46ab-984b-2779c8fed454',\n", + " 'decad8e5-f856-4f3b-b58c-67748783b249',\n", + " '9b1e9da4-464b-4fcc-a20b-164e42781344',\n", + " '32eea073-cf81-4171-a60a-979acbd5d263',\n", + " '32785d92-6c82-411c-b2c5-af9ee4904888',\n", + " '48a5ea19-1dee-4159-a0bf-c3c57222cc40',\n", + " '523cdacb-8d75-4d6b-b57e-0a27c965f2c9',\n", + " 'a6610cb2-8ba8-4b0a-b3e6-b297dddb47b5',\n", + " '4a107a6b-0ec6-4bf4-ad1e-9b3b9a326a25',\n", + " '568f48c8-8e8a-4437-89e0-6b49af1df0cd',\n", + " '0b0ac28a-15a3-46dd-b884-66d96d91f9ed',\n", + " '5d0ad098-04b9-4f79-9daf-e131f7e1b154',\n", + " '0f5e1784-a255-4798-88e7-91c698c27ee0',\n", + " 'e97550e6-ba98-4a5a-abd4-e97711190193',\n", + " '378d92e5-0ef3-4a57-b62f-5fde633140f5',\n", + " 'ee5a8131-b78b-4360-89de-2cd4b25ce5d7',\n", + " '7e16be82-ef77-4c1e-8af2-3a48d1797366',\n", + " 'dd5b2b43-7d30-4f4d-b6f1-e73dc7c731f2',\n", + " 'd41919f8-81b2-4d7c-b8e3-b7749d63cfd3',\n", + " '57b8efee-83d7-4860-b40b-930c5e6798aa',\n", + " '05e4d705-eb58-47f0-9e48-141497ffc526',\n", + " '49ec8570-61f8-4272-9ae2-cf8878b34a93',\n", + " '3898396d-d73d-4047-852f-3566ecee40a6',\n", + " 'f4d15c6a-7b57-41fb-8beb-b6ee5717cd1d',\n", + " '70585b58-3c6a-4e9a-971f-504043ba720c',\n", + " '972e3e39-29c9-409d-afb3-25217c20ccfd',\n", + " 'bcefb624-2175-4a13-a122-404c958ae0d5',\n", + " 'ae2e4e52-c6bb-4c86-a9d1-b8d7cdd29fbe',\n", + " '7a45b2e0-0871-45cf-96cd-c2ab55c10202',\n", + " '38d4ebbd-776f-482d-b539-0d933c3e7a99',\n", + " '3c634532-546f-4f3c-9a9e-623dbd4e69a8',\n", + " '0010bb1c-ac95-49b2-a845-d44c4876b819',\n", + " '6405012a-6355-4222-91a0-87ce7345a351',\n", + " '9be920aa-0cb5-4b10-8d61-fa33a7f3f7db',\n", + " '07ccb80a-5031-4368-89e4-a858f5a5174d',\n", + " 'a4a93dc9-c740-485b-8b01-ca1085b7c228',\n", + " 'ab62fcd8-e5c6-4dc3-8eef-843f9f0e3909',\n", + " 'a84f9048-178b-4db6-8b54-27d16dacaf04',\n", + " '39c05b59-cda4-4bfb-afba-a721b4cad6a5',\n", + " 'c138f838-01c3-421a-a821-a9cb2e96eea9',\n", + " 
'6a970da6-8813-424d-81aa-ee527a02903e',\n", + " '8d370fc4-7c12-42f1-9fc3-d9129a4ea59d',\n", + " '2aab0a42-f95d-4d63-94d8-4386877d0046',\n", + " '4a18faa8-fccb-45a3-8b3e-2f13ee3d0083',\n", + " '165e9a23-e2f0-4e7e-a28f-2228346207dd',\n", + " '853dd232-c1f5-4c35-a509-2db5c809e382',\n", + " '2db4f47a-d106-4cbc-b1b7-c4adab6ca1ed',\n", + " '29d1aa59-2820-4a3c-b1f4-bc56169d5b50',\n", + " '90f76edc-f6bf-4028-ab66-50f9c17ae47f',\n", + " 'eb8a2c07-e0d7-4048-b740-2ceaa2f9b5a9',\n", + " 'c7505507-8b25-490b-829a-702c9f82f305',\n", + " '7e939ebb-d36d-43ab-9897-fe54534bf026',\n", + " '38ca04a8-1eca-454f-8a66-7c7a9c28a7ad',\n", + " '0615a2b1-2e76-4c5d-aa09-2f9b966ce8a3',\n", + " '3e755e5f-b934-4b4d-902e-9b8ae03ec588',\n", + " '51dbdc39-d69f-4106-a5b2-dca22babbf91',\n", + " 'f8aafff9-462c-4a9c-85cf-2b8a585fafa6',\n", + " '3a9fd7ba-b9c1-4f33-b89f-9a03b8f9a98a',\n", + " 'b2459bdb-b70f-4c72-a936-95ece90e86a9',\n", + " 'd06b55c8-6e6d-490a-afb6-6206752915cd',\n", + " '192de348-9859-40f4-812b-ec32f6379c49',\n", + " '8ed4b9c1-5457-4300-8fec-bea3c09187f5',\n", + " '09d0d4dd-6434-4f0f-8af1-d5c7baab8469',\n", + " '4e1c9e9f-414c-483a-9f13-296669c3a360',\n", + " '98daacb4-8033-4458-99b8-a18584578792',\n", + " 'e8d41bf9-dd7c-4765-b811-ea392a7da4c3',\n", + " '2f3c16a3-4372-4661-b87e-133494fc072c',\n", + " '8d75560b-18ff-4683-90c0-f505eeaa562e',\n", + " '91aea2d1-31db-4ec9-a022-88b5012a3672',\n", + " 'aa404a92-e43e-44f7-87c2-e6f80f6d9551',\n", + " 'e151c235-701d-47f9-9cb5-c2a3d0452154',\n", + " '5ce9aaf5-434f-445d-9d1f-0900702bf86d',\n", + " '97bdc360-686a-419a-a012-e2aa16516c17',\n", + " 'bbcf4b79-81a3-4a0e-a64a-3b06c3f09005',\n", + " 'e49b18b2-e2bb-4810-bbd9-4aa8c84c649f',\n", + " '3fb6daca-71d0-4906-bf32-d50bd7d6c1cf',\n", + " '5ae8536e-32e7-4298-80f7-78056a74684c',\n", + " 'add5dec4-e152-460d-8f7a-51e88eb84889',\n", + " '6e63d188-ba6f-400a-aa90-bdcd95c8c911',\n", + " '55ad86d0-19ef-4470-a75a-7953e829f290',\n", + " '8607aac5-1d38-4285-be3a-e3f1c55f5218',\n", + " 
'920ad7a5-5530-4689-ad80-5b5d1c0d3bee',\n", + " '73247e00-baea-46e0-925c-7aa0df95c0c8',\n", + " '82105b7f-fe14-44a0-a420-49ad26ef9644',\n", + " 'a4312169-102a-4cce-8594-78099a8efd0e',\n", + " 'f620b96b-224f-409d-bf01-314a82c88700',\n", + " '98ead4c8-1750-485c-bade-f7f5147d820d',\n", + " '12843a05-da09-4fc4-83bb-d94227f9004c',\n", + " '5b7b6567-9b45-4108-9a67-7f76013e4acb',\n", + " '5ed16166-7ae1-4f67-8406-b9dd66e590ba',\n", + " '28182f80-387a-45ab-bffa-2672c4a54da8',\n", + " 'e87a9656-cc24-4f5d-b514-9a59db059b00',\n", + " '251f434f-3d8f-466a-8dac-5daaa3ce86c8',\n", + " '2eea2a0d-cd73-4c83-8ce4-d43f8a8e9d99',\n", + " '9ee90875-4b00-49ef-bf45-e6fa6df899b2',\n", + " 'a4e52989-4ecd-4625-8245-92e554ea0b12',\n", + " '2244622b-7ca2-46bf-9020-ca02ce399706',\n", + " '8522329d-8c75-42ea-a8f6-5e1d66e0865d',\n", + " '58605254-45c5-4df5-80d0-ec34e483525a',\n", + " '4a72b0cb-954c-4024-898a-e054c29c3169',\n", + " 'cbcbf1ec-d4b3-4db8-a6c8-268655cec416',\n", + " '255c79ac-df3e-482a-a1b6-567327aafa7f',\n", + " 'ced4f7d6-3996-4b89-bf67-4ec8df039def',\n", + " '5d9db48e-bb78-4bf6-83d2-9d384f6b7db8',\n", + " '69f8f32c-4afe-4c4f-a26f-63e90bbe0d10',\n", + " '16c0603e-1170-4bfc-aeb0-c8e677c4b494',\n", + " '8258b9f7-fd26-480e-ac8c-7026b7758eb9',\n", + " 'a2233bcc-649b-4727-956f-1144159ea035',\n", + " 'a666d9b6-e543-4921-84ee-bcbd1670bf8c',\n", + " '1c15e716-c022-40be-9bba-e7c16edef47c',\n", + " '5771b83b-c46d-40d5-af1d-0154595524b6',\n", + " '23395d70-147f-4141-a4c3-63a6641c2ffc',\n", + " '2360e523-323b-4f00-880f-8053b3a07440',\n", + " 'f1545c04-7fa9-4d59-a2ca-a2c93a49b2c9',\n", + " 'dc00bc04-6bb0-41cb-9a7a-197a54497fea',\n", + " '8cd9b1f1-eebb-496b-86fe-2eb746b3ff76',\n", + " '3dd2a571-20dc-42ca-b9c2-a426f5934190',\n", + " '39cdade7-f84e-45d7-9530-5959889f789a',\n", + " 'e64d3b82-eef9-4355-a1c1-498723d67c49',\n", + " 'c3465aae-8963-444e-af74-1ad80eb625ad',\n", + " '15e60084-da91-4812-9200-cdfaece1b298',\n", + " '046b79e8-481e-4370-9305-1545df4cc565',\n", + " 
'8fd577a1-a7f7-43ae-9cb9-62d6f677a28d',\n", + " 'f30ce6c5-0acc-4d44-8ee7-4c1b191f8c86',\n", + " 'b05b0e9e-563b-4f50-a808-bc90ad52a166',\n", + " 'f8464c0d-6f99-4963-8153-6b9cac7d21ca',\n", + " 'e25af122-389d-4e77-9131-1e5aef465395',\n", + " 'b9b021ce-7fb2-4ff7-8203-dc651c85db95',\n", + " '319a6745-ff38-46b3-90a5-2fea27fa2a76',\n", + " 'a2859d3b-d6f2-4565-b3fb-a4a1d2a409c9',\n", + " '9d23440d-3be9-4cd2-9d57-4a3da03c888e',\n", + " '856fa271-61d9-4978-a3ff-af0d5835fa48',\n", + " 'ffb73318-ca14-47c4-9b03-61a3d06b2f2e',\n", + " 'b1235f96-fe0a-4d4e-bc67-9ca82fbdfac3',\n", + " 'bf4c3dab-3e6e-4567-8720-2e70a0018a91',\n", + " '912341ab-a6cc-4fca-b186-7d1fdd3d56a7',\n", + " '843db8c4-ff4e-4ff3-9b38-547f20fa2aef',\n", + " '343f7d46-034d-47a2-95e0-18b06201eb98',\n", + " 'fb224588-8e67-4977-9e3e-ee950dec5ed3',\n", + " 'a386cf47-cdf7-49e7-9ef4-fb109c6e97f7',\n", + " '46432b13-2e66-4cf3-9c5c-a66305ee10ca',\n", + " '4c9b36b6-33b6-46d8-bb66-b3f71a59617c',\n", + " '712e96de-8f23-4fc9-9b56-f1589dcefa88',\n", + " 'cdeadb98-3b86-4e45-91bb-992c9ebb24d9',\n", + " '49a78b6a-56b6-493b-954d-c1fbac8ada71',\n", + " 'fdc6a3b9-6fe5-4172-91ab-de4f3b045df4',\n", + " '131e5b08-5cd9-49fa-ba0b-20cdc018c3c3',\n", + " '18631f28-8818-43ee-9979-62890e50fb58',\n", + " '9e044dcd-2c92-4709-896b-9816c33354f1',\n", + " '12ab8374-c57a-4452-bc6c-f229439831ec',\n", + " '162daf1a-f6c2-4feb-9327-14eb870fc776',\n", + " 'bbb258bc-504b-4777-9bc9-5f9e3506582a',\n", + " 'fd43df59-8cb6-41d8-9a68-91955b570529',\n", + " '1574dbe1-b923-4b4e-a71b-47dd8ff47da5',\n", + " '9bb9fd9e-038c-4acd-8f14-aead4815885c',\n", + " '3d2f4f69-0b46-46c5-b210-95fb5d9b4dff',\n", + " 'fb3f1ef9-6dd2-487d-bad8-1eebfaedab18',\n", + " '9fb09802-6917-4257-a57b-1fb74f71b339',\n", + " '5c58cda6-8d99-445e-8220-431017d04287',\n", + " '6a791b76-a65b-411a-a11e-7737b905b819',\n", + " '197983e8-040e-40ad-8b10-7efc3000ad9d',\n", + " '3aea0477-23dd-4be7-a2bd-e5f83a5c4490',\n", + " 'e741546b-241e-4247-8584-10183639e568',\n", + " 
'6f512b08-224a-4bb9-b19c-609dffd6f790',\n", + " 'e8f43de9-3a55-434d-9fd7-72867a04ef41',\n", + " 'eaa4a446-f507-4104-91f5-f502f4c8d6b9',\n", + " 'a0a008a9-c6a0-484e-bd60-e4d1ffe9b697',\n", + " 'd61c72f6-d641-48d4-aed7-131ef0d18fb5',\n", + " 'f3152d0c-38aa-40a8-96eb-fa1fe1184212',\n", + " '34ea0fa9-6f10-4a9d-accd-875c0155dd35',\n", + " '9e3e8d37-69bd-4360-98d5-833b7a6cc4a9',\n", + " '58a0f55f-48bb-4e00-b832-d0bc425fa1fd',\n", + " 'f317975c-aefa-4811-ba60-7fbe321382e6',\n", + " '67fc5859-b2f4-429b-8e50-70ed51430de8',\n", + " '6f4d41bc-db57-48a9-9c58-df99f2897176',\n", + " '2a63cea9-b432-45c2-90d3-ae510bd3a4d9',\n", + " 'f420a822-eeba-4ec9-b87c-85d7768c6375',\n", + " 'aa288c01-49c7-437a-835b-0c5e1e7bc83f',\n", + " 'aca30793-e5d6-4eed-8259-016411536b98',\n", + " '0ab0e35c-e2f7-4cd7-b504-6769b6c9f0d5',\n", + " 'da9327ed-944f-4e29-990c-20ef874a2f25',\n", + " '7d36b56a-6904-469d-8816-28ff32b9066d',\n", + " '8bfc5d10-e3b6-4b8c-94da-e584bc708932',\n", + " 'ae21ac9e-2610-4d2f-b33b-b5db50a7eac2',\n", + " '814af52f-3e94-4c0b-9d7a-3defc4cb3ec3',\n", + " '76e0897f-d9aa-4c95-b605-c88ff53b8a36',\n", + " '16c11966-9183-4de8-a356-13f5099b5e25',\n", + " '18cb2291-60d7-4bf8-a081-d5f10f975d9c',\n", + " 'c2c7a83d-6d63-4ec0-bb4f-068db483f91a',\n", + " '3e4bb122-e768-4b21-96db-10b18c37fa0b',\n", + " '5bd4046f-d334-4b79-86c0-0e6f43ba2d90',\n", + " '7c7da70a-1fdc-43fa-8db0-090475e2736a',\n", + " 'a41f22c2-897b-43a3-a284-62f5e797cbe6',\n", + " 'f79a045e-652a-44ce-a35f-331c355670e9',\n", + " 'fdb744fd-3a07-442b-9d68-d84a8ec0d38e',\n", + " '9b78397b-1e63-408b-bcb1-791b88cf75a5',\n", + " 'aa879a60-d5ab-4772-924a-e3eb31333edb',\n", + " '39e38b69-426f-46a5-8c37-c275f66d9fe1',\n", + " 'b00e270c-d88b-458c-b890-157498c1b701',\n", + " '0d0dea84-3cb6-4739-8436-dcfe3139a975',\n", + " '2b0e10f3-c4a2-469f-aa2a-b61b5cf8bea2',\n", + " 'c1771627-11eb-4f6d-8516-d35cadb2ef23',\n", + " '9d40f7e0-b1b2-4661-9c47-37bf2e828c02',\n", + " 'eb2d7654-f59c-4590-a39b-3596db51df12',\n", + " 
'db7bd1cb-7d77-4024-ad1c-8523c4042da2',\n", + " '9ccc72fe-6f21-4a70-9d3b-ac7ed03d7629',\n", + " 'e87868fd-d2b3-4d66-941b-2c0ea80d4533',\n", + " '474155af-d4f1-4e3a-9783-394ebda5d3fe',\n", + " 'a712a14c-a297-4dd1-a45e-25274a29e4d3',\n", + " '8b68522b-5231-438d-a8c4-6c3be5070be0',\n", + " '790bfe5b-df15-403c-b715-20650e8965e0',\n", + " '04744b90-bb60-4ffb-8433-e296096d4f01',\n", + " '5256e04a-d6b5-47c9-97d1-e6c94c98b5b0',\n", + " '8f5fd6bd-1ef7-49ee-968e-7fa77ec943e0',\n", + " 'dcbb3de6-023e-4b17-bfa5-3cd52c40c336',\n", + " '7ec2d335-1f47-457d-8936-ba87f753ca8b',\n", + " 'f5e695dd-df45-49c5-9496-8c6ae693bae3',\n", + " '812900c6-f264-4039-b993-ac2408c6bf45',\n", + " '9e964cc9-23b4-4835-b578-658a1d2b6675',\n", + " '8087bc80-a4a4-4431-9f8c-fb0a2d85fdd2',\n", + " '437c8797-c02d-4230-8e2d-10d93b0e7009',\n", + " '93342c70-4617-443d-8a64-130f1ab15d43',\n", + " '1cedaef0-51c1-4978-a72a-33c45461293b',\n", + " 'd559ef6a-5ce1-4168-b7eb-70e02fe41b28',\n", + " '95e57f0a-1339-4f09-a7c0-9bea196b0d4d',\n", + " '418e6b70-173e-4f18-860b-8e05114f36f3',\n", + " '67c953c7-1d96-4335-85ba-e5f6f746c74e',\n", + " '900e24a1-dae5-4e6a-a2bf-0a643ab290dd',\n", + " '4bf9392d-bb58-4dd5-83d3-3a23a4ab10fe',\n", + " '16dfd282-917f-430e-9a0c-636196e8bddf',\n", + " '1249db1f-6dbe-44df-bc73-a05c712fec40',\n", + " 'c8dff0ca-2ef3-4bfe-82dd-de81f61aefd0',\n", + " '739f16f6-98f7-4db1-a021-073fe9c74a0a',\n", + " '82b15ebf-2646-4b85-b10f-03161eeecff1',\n", + " 'aa7f5864-1c76-464d-95c7-1e0a0e4a36e0',\n", + " '69c6db4b-001a-469a-849a-90198cd7fc5d',\n", + " 'da9e2f52-42bb-45b3-9403-05a43236e21b',\n", + " '94b12a55-f54c-4eeb-a061-4bd04975bc21',\n", + " 'f6611719-cb8a-46e7-ab23-9f605fd1d454',\n", + " '71480110-9da1-4646-87e8-4118e9f630f4',\n", + " 'f125cc7a-501f-4dd6-9fdb-baecf37f9c18',\n", + " 'd8c21263-317d-4b62-839c-43d8949da733',\n", + " '1529a8df-c158-4dc6-912b-34ea83168e6f',\n", + " '8f26d0e7-f7ba-42f7-99dc-73a2e9848b7f',\n", + " '096866d4-a484-4bd3-b96e-2b0df526887e',\n", + " 
'cb40cf58-5bad-498c-898e-e8f866ac3697',\n", + " '36c254a8-6f40-421f-8959-44f2a7a9960f',\n", + " 'f01ee66c-7aea-4955-af64-8a0796e9e6ed',\n", + " 'c71b1f61-73d3-46e2-a238-b6c4f1a10c51',\n", + " 'b4754108-49b0-4307-8658-834b07e1c73b',\n", + " 'ce3683a6-72a7-4502-a4d3-3655b2b30b8b',\n", + " '4c62bfe1-0ca9-47e4-917f-fba3b963aac0',\n", + " '8f144364-7e23-4fdb-8fbf-214a829c87df',\n", + " '6475e40e-b4f6-4e94-9fde-8ac094395b6e',\n", + " '76e280ad-e1e8-438f-8dcb-490fad2e35d8',\n", + " '378684af-e9ab-463c-8dee-5a521c388703',\n", + " '606206fd-f86b-4e18-b488-1e37c758093a',\n", + " '80c0e6fa-ef50-40d1-b645-5ff55b1319c3',\n", + " '9e1f0f9e-fb9f-4f3c-8fe7-6981e65fa9bd',\n", + " '57df1ace-17d6-4fdf-98ee-0742afd2bf88',\n", + " 'db35bbdd-3d86-4d67-ad52-b91f7b49bfe4',\n", + " 'c4f85853-7941-4c8b-9341-f6b97040091a',\n", + " 'f06c1be0-e8b4-4cd2-ad06-cbe386f7b954',\n", + " '47b46f90-0da8-4788-a2e2-4adaf7d80716',\n", + " '8c49dc2f-af0a-420b-ba79-9fc35ab6cea6',\n", + " 'f6a51aa5-705f-4071-a410-a84bedc5435d',\n", + " '1d06cb79-4e65-4540-8f2d-7b6cadab3740',\n", + " '1883b7f6-67d3-41f3-86ae-e6c9ba3e5a2d',\n", + " '4adf169e-f1da-4f0c-8a9f-d5923b023d99',\n", + " '929b6b6a-5fbb-414a-a786-8da289bf3f0c',\n", + " 'db99e4a7-8b70-441d-8d58-ebcc3a332831',\n", + " '67952d65-fdfe-4b23-931d-64cfd966b7d8',\n", + " ...]" + ] + }, + "metadata": {}, + "execution_count": 11 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Trying the Vector Storage" + ], + "metadata": { + "id": "7Vnj-Z0pIYDM" + } + }, + { + "cell_type": "code", + "source": [ + "query = \"Hello, what's kubernetes\"\n", + "query_vector = embeddings_service.embed_query(query)\n", + "docs = vector_store.similarity_search_by_vector(query_vector, k=4)\n", + "\n", + "for i, document in enumerate(docs):\n", + " print(f\"Result #{i+1}\")\n", + " print(document.page_content)\n", + " print(\"-\" * 100)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "tdITIoEAIcfr", + "executionInfo": { + "status": 
"ok", + "timestamp": 1721932055248, + "user_tz": 300, + "elapsed": 314, + "user": { + "displayName": "", + "userId": "" + } + }, + "outputId": "102198aa-ca71-4a4c-a5e7-58afd1a2884f" + }, + "execution_count": 15, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Result #1\n", + "Overview\n", + "\n", + "Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.\n", + "\n", + "This page is an overview of Kubernetes.\n", + "\n", + "Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.\n", + "\n", + "The name Kubernetes originates from Greek, meaning helmsman or pilot. K8s as an abbreviation results from counting the eight letters between the \"K\" and the \"s\". Google open- sourced the Kubernetes project in 2014. 
Kubernetes combines over 15 years of Google's experience running production workloads at scale with best-of-breed ideas and practices from the community.\n", + "----------------------------------------------------------------------------------------------------\n", + "Result #2\n", + "bWRSeEgvbUNOS2JKYjFRQm1HCkkwYitEUEdaTktXTU0xMzhIQXdoV0tkNjVoVHdYOWl4V3Z HMkh4TG1WQzg0L1BHT0tWQW9FNkpsYWFHdTlQVmkKdjlOSjVaZlZrcXdCd0hKbzZXdk9xV lA3SVFjZmg3d0drWm89Ci0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo= signerName: kubernetes.io/kube-apiserver-client expirationSeconds: 86400 # one day usages: - client auth EOF\n", + "----------------------------------------------------------------------------------------------------\n", + "Result #3\n", + "- --config=/etc/kubernetes/my-scheduler/my-scheduler-config.yaml image: gcr.io/my-gcp-project/my-kube-scheduler:1.0\n", + "----------------------------------------------------------------------------------------------------\n", + "Result #4\n", + "metadata (ObjectMeta)\n", + "\n", + "Standard object's metadata. More info: https://git.k8s.io/community/contributors/devel/ sig-architecture/api-conventions.md#metadata\n", + "\n", + "\n", + "\n", + "spec (JobSpec)\n", + "\n", + "Specification of the desired behavior of a job. More info: https://git.k8s.io/community/ contributors/devel/sig-architecture/api-conventions.md#spec-and-status\n", + "\n", + "\n", + "\n", + "status (JobStatus)\n", + "\n", + "Current status of a job. More info: https://git.k8s.io/community/contributors/devel/sig- architecture/api-conventions.md#spec-and-status\n", + "\n", + "JobSpec\n", + "\n", + "JobSpec describes how the job execution will look like.\n", + "\n", + "Replicas\n", + "\n", + "\n", + "\n", + "template (PodTemplateSpec), required\n", + "\n", + "Describes the pod that will be created when executing a job. The only allowed template.spec.restartPolicy values are \"Never\" or \"OnFailure\". 
More info: https:// kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/\n", + "\n", + "\n", + "\n", + "parallelism (int32)\n", + "----------------------------------------------------------------------------------------------------\n" + ] + } + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.9.13" + }, + "colab": { + "provenance": [], + "name": "rag-data-ingest-with-kubernetes-docs.ipynb" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file From 015d3ff6634d4f4a8e9ea0a06124d0df007f1b6f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Germ=C3=A1n=20Grandas?= Date: Wed, 18 Sep 2024 13:57:02 +0000 Subject: [PATCH 2/8] Running rag e2e test with kubernetes docs. - Updating test_rag.py so the test can validate answers from the kubernetes documentation. - Updating cloudbuild.yaml to ingest the database with the kubernetes documentation. --- applications/rag/tests/test_rag.py | 271 +++++++++++++---------------- cloudbuild.yaml | 10 +- 2 files changed, 123 insertions(+), 158 deletions(-) diff --git a/applications/rag/tests/test_rag.py b/applications/rag/tests/test_rag.py index d7da3a0e2..5ed2f4b3c 100644 --- a/applications/rag/tests/test_rag.py +++ b/applications/rag/tests/test_rag.py @@ -3,160 +3,125 @@ import requests def test_prompts(prompt_url): - testcases = [ - { - "prompt": "List the cast of Squid Game", - "expected_context": "This is a TV Show in called Squid Game added at September 17, 2021 whose director is and with cast: Lee Jung-jae, Park Hae-soo, Wi Ha-jun, Oh Young-soo, Jung Ho-yeon, Heo Sung-tae, Kim Joo-ryoung, Tripathi Anupam, You Seong-joo, Lee You-mi released at 2021. Its rating is: TV-MA. Its duration is 1 Season. Its description is Hundreds of cash-strapped players accept a strange invitation to compete in children's games. 
Inside, a tempting prize awaits — with deadly high stakes..", - "expected_substrings": ["Lee Jung-jae", "Park Hae-soo", "Wi Ha-jun", "Oh Young-soo", "Jung Ho-yeon", "Heo Sung-tae", "Kim Joo-ryoung", "Tripathi Anupam", "You Seong-joo", "Lee You-mi"], - }, - { - "prompt": "When was Squid Game released?", - "expected_context": "This is a TV Show in called Squid Game added at September 17, 2021 whose director is and with cast: Lee Jung-jae, Park Hae-soo, Wi Ha-jun, Oh Young-soo, Jung Ho-yeon, Heo Sung-tae, Kim Joo-ryoung, Tripathi Anupam, You Seong-joo, Lee You-mi released at 2021. Its rating is: TV-MA. Its duration is 1 Season. Its description is Hundreds of cash-strapped players accept a strange invitation to compete in children's games. Inside, a tempting prize awaits — with deadly high stakes..", - "expected_substrings": ["September 17, 2021"], - }, - { - "prompt": "What is the rating of Squid Game?", - "expected_context": "This is a TV Show in called Squid Game added at September 17, 2021 whose director is and with cast: Lee Jung-jae, Park Hae-soo, Wi Ha-jun, Oh Young-soo, Jung Ho-yeon, Heo Sung-tae, Kim Joo-ryoung, Tripathi Anupam, You Seong-joo, Lee You-mi released at 2021. Its rating is: TV-MA. Its duration is 1 Season. Its description is Hundreds of cash-strapped players accept a strange invitation to compete in children's games. Inside, a tempting prize awaits — with deadly high stakes..", - "expected_substrings": ["TV-MA"], - }, - { - "prompt": "List the cast of Avatar: The Last Airbender", - "expected_context": "This is a TV Show in United States called Avatar: The Last Airbender added at May 15, 2020 whose director is and with cast: Zach Tyler, Mae Whitman, Jack De Sena, Dee Bradley Baker, Dante Basco, Jessie Flower, Mako Iwamatsu released at 2007. Its rating is: TV-Y7. Its duration is 3 Seasons. 
Its description is Siblings Katara and Sokka wake young Aang from a long hibernation and learn he's an Avatar, whose air-bending powers can defeat the evil Fire Nation..", - "expected_substrings": ["Zach Tyler", "Mae Whitman", "Jack De Sena", "Dee Bradley Baker", "Dante Basco", "Jessie Flower", "Mako Iwamatsu"], - }, - { - "prompt": "When was Avatar: The Last Airbender added on Netflix?", - "expected_context": "This is a TV Show in United States called Avatar: The Last Airbender added at May 15, 2020 whose director is and with cast: Zach Tyler, Mae Whitman, Jack De Sena, Dee Bradley Baker, Dante Basco, Jessie Flower, Mako Iwamatsu released at 2007. Its rating is: TV-Y7. Its duration is 3 Seasons. Its description is Siblings Katara and Sokka wake young Aang from a long hibernation and learn he's an Avatar, whose air-bending powers can defeat the evil Fire Nation..", - "expected_substrings": ["May 15, 2020"], - }, - { - "prompt": "What is the rating of Avatar: The Last Airbender?", - "expected_context": "This is a TV Show in United States called Avatar: The Last Airbender added at May 15, 2020 whose director is and with cast: Zach Tyler, Mae Whitman, Jack De Sena, Dee Bradley Baker, Dante Basco, Jessie Flower, Mako Iwamatsu released at 2007. Its rating is: TV-Y7. Its duration is 3 Seasons. 
Its description is Siblings Katara and Sokka wake young Aang from a long hibernation and learn he's an Avatar, whose air-bending powers can defeat the evil Fire Nation..", - "expected_substrings": ["TV-Y7"], - }, - ] - - for testcase in testcases: - prompt = testcase["prompt"] - expected_context = testcase["expected_context"] - expected_substrings = testcase["expected_substrings"] - - print(f"Testing prompt: {prompt}") - data = {"prompt": prompt} - json_payload = json.dumps(data) - - headers = {'Content-Type': 'application/json'} - response = requests.post(prompt_url, data=json_payload, headers=headers) - response.raise_for_status() - - response = response.json() - context = response['response']['context'] - text = response['response']['text'] - user_prompt = response['response']['user_prompt'] - - print(f"Reply: {text}") - - assert user_prompt == prompt, f"unexpected user prompt: {user_prompt} != {prompt}" - assert context == expected_context, f"unexpected context: {context} != {expected_context}" - - for substring in expected_substrings: - assert substring in text, f"substring {substring} not in response:\n {text}" + try: + testcases = [ + { + "prompt": "What's kubernetes?", + }, + { + "prompt": "How do I create a kubernetes cluster?", + }, + { + "prompt": "What's kubectl?", + } + ] + + for testcase in testcases: + prompt = testcase["prompt"] + + print(f"Testing prompt: {prompt}") + data = {"prompt": prompt} + json_payload = json.dumps(data) + + headers = {'Content-Type': 'application/json'} + response = requests.post(prompt_url, data=json_payload, headers=headers) + response.raise_for_status() + + response = response.json() + print(response) + text = response['response'].get('text') + + + print(f"Reply: {text}") + + assert response is not None, f"No response found: {response}" + assert text is not None, "No text found" + except Exception as err: + print(err) + raise err def test_prompts_nlp(prompt_url): - testcases = [ - { - "prompt": "List the cast of Squid Game", - 
"nlpFilterLevel": "0", - "expected_context": "This is a TV Show in called Squid Game added at September 17, 2021 whose director is and with cast: Lee Jung-jae, Park Hae-soo, Wi Ha-jun, Oh Young-soo, Jung Ho-yeon, Heo Sung-tae, Kim Joo-ryoung, Tripathi Anupam, You Seong-joo, Lee You-mi released at 2021. Its rating is: TV-MA. Its duration is 1 Season. Its description is Hundreds of cash-strapped players accept a strange invitation to compete in children's games. Inside, a tempting prize awaits — with deadly high stakes..", - "expected_substrings": ["Lee Jung-jae", "Park Hae-soo", "Wi Ha-jun", "Oh Young-soo", "Jung Ho-yeon", "Heo Sung-tae", "Kim Joo-ryoung", "Tripathi Anupam", "You Seong-joo", "Lee You-mi"], - }, - { - "prompt": "Which movie has a building that blows up?", - "nlpFilterLevel": "0", - "expected_context": "This is a Movie in United States called Oklahoma City added at April 19, 2017 whose director is Barak Goodman and with cast: released at 2017. Its rating is: TV-14. Its duration is 102 min. Its description is Domestic terrorist Timothy McVeigh's bomb attack on an Oklahoma City federal building in 1995 is chronicled in this examination of the shocking event..", - "expected_substrings": ["The movie \"Oklahoma City\" released in 2017, directed by Barak Goodman, and starring a cast, depicts the bombing of an Oklahoma City federal building in 1995. The building that blows up in the movie is the Alfred P. Murrah Federal Building."], - }, - { - "prompt": "Which movie has a building that blows up?", - "nlpFilterLevel": "50", - "expected_context": "This is a Movie in United States called Oklahoma City added at April 19, 2017 whose director is Barak Goodman and with cast: released at 2017. Its rating is: TV-14. Its duration is 102 min. 
Its description is Domestic terrorist Timothy McVeigh's bomb attack on an Oklahoma City federal building in 1995 is chronicled in this examination of the shocking event..", - "expected_substrings": ["The response is deemed inappropriate for display."], - }, - { - "prompt": "List the cast of Squid Game", - "nlpFilterLevel": "100", - "expected_context": "This is a TV Show in called Squid Game added at September 17, 2021 whose director is and with cast: Lee Jung-jae, Park Hae-soo, Wi Ha-jun, Oh Young-soo, Jung Ho-yeon, Heo Sung-tae, Kim Joo-ryoung, Tripathi Anupam, You Seong-joo, Lee You-mi released at 2021. Its rating is: TV-MA. Its duration is 1 Season. Its description is Hundreds of cash-strapped players accept a strange invitation to compete in children's games. Inside, a tempting prize awaits — with deadly high stakes..", - "expected_substrings": ["The response is deemed inappropriate for display."], - } - ] - - for testcase in testcases: - prompt = testcase["prompt"] - nlpFilterLevel = testcase["nlpFilterLevel"] - expected_context = testcase["expected_context"] - expected_substrings = testcase["expected_substrings"] - - print(f"Testing prompt: {prompt}") - data = {"prompt": prompt, "nlpFilterLevel": nlpFilterLevel} - json_payload = json.dumps(data) - - headers = {'Content-Type': 'application/json'} - response = requests.post(prompt_url, data=json_payload, headers=headers) - response.raise_for_status() - - response = response.json() - context = response['response']['context'] - text = response['response']['text'] - user_prompt = response['response']['user_prompt'] - - print(f"Reply: {text}") - - assert user_prompt == prompt, f"unexpected user prompt: {user_prompt} != {prompt}" - assert context == expected_context, f"unexpected context: {context} != {expected_context}" - - for substring in expected_substrings: - assert substring in text, f"substring {substring} not in response:\n {text}" + try: + testcases = [ + { + "prompt": "What's kubernetes?", + 
"nlpFilterLevel": "0", + }, + { + "prompt": "What's kubernetes?", + "nlpFilterLevel": "100", + }, + { + "prompt": "How create a kubernetes cluster?", + "nlpFilterLevel": "0", + }, + { + "prompt": "What's kubectl?", + "nlpFilterLevel": "50", + } + ] + + for testcase in testcases: + prompt = testcase["prompt"] + nlpFilterLevel = testcase["nlpFilterLevel"] + + print(f"Testing prompt: {prompt}") + data = {"prompt": prompt, "nlpFilterLevel": nlpFilterLevel} + json_payload = json.dumps(data) + + headers = {'Content-Type': 'application/json'} + response = requests.post(prompt_url, data=json_payload, headers=headers) + response.raise_for_status() + + response = response.json() + + text = response['response']['text'] + + + print(f"Reply: {text}") + + assert response != None, f"Not response found: {response}" + assert text != None, f"Not text" + except Exception as err: + print(err) + raise err def test_prompts_dlp(prompt_url): - testcases = [ - { - "prompt": "who worked with Robert De Niro and name one film they collaborated?", - "inspectTemplate": "projects/gke-ai-eco-dev/locations/global/inspectTemplates/DO-NOT-DELETE-e2e-test-inspect-template", - "deidentifyTemplate": "projects/gke-ai-eco-dev/locations/global/deidentifyTemplates/DO-NOT-DELETE-e2e-test-de-identify-template", - "expected_context": "This is a Movie in United States called GoodFellas added at January 1, 2021 whose director is Martin Scorsese and with cast: Robert De Niro, Ray Liotta, Joe Pesci, Lorraine Bracco, Paul Sorvino, Frank Sivero, Tony Darrow, Mike Starr, Frank Vincent, Chuck Low released at 1990. Its rating is: R. Its duration is 145 min. Its description is Former mobster Henry Hill recounts his colorful yet violent rise and fall in a New York crime family – a high-rolling dream turned paranoid nightmare..", - "expected_substrings": ["[PERSON_NAME] has worked with many talented actors and directors throughout his career. 
One film he collaborated with [PERSON_NAME] is \"GoodFellas,\" which was released in 1990. In this movie, [PERSON_NAME] played the role of [PERSON_NAME], a former mobster who recounts his rise and fall in a New York crime family."], - }, - ] - - for testcase in testcases: - prompt = testcase["prompt"] - inspectTemplate = testcase["inspectTemplate"] - deidentifyTemplate = testcase["deidentifyTemplate"] - expected_context = testcase["expected_context"] - expected_substrings = testcase["expected_substrings"] - - print(f"Testing prompt: {prompt}") - data = {"prompt": prompt, "inspectTemplate": inspectTemplate, "deidentifyTemplate": deidentifyTemplate} - json_payload = json.dumps(data) - - headers = {'Content-Type': 'application/json'} - response = requests.post(prompt_url, data=json_payload, headers=headers) - response.raise_for_status() - - response = response.json() - context = response['response']['context'] - text = response['response']['text'] - user_prompt = response['response']['user_prompt'] - - print(f"Reply: {text}") - - assert user_prompt == prompt, f"unexpected user prompt: {user_prompt} != {prompt}" - assert context == expected_context, f"unexpected context: {context} != {expected_context}" - - for substring in expected_substrings: - assert substring in text, f"substring {substring} not in response:\n {text}" - -prompt_url = sys.argv[1] -test_prompts(prompt_url) -test_prompts_nlp(prompt_url) -test_prompts_dlp(prompt_url) + try: + testcases = [ + { + "prompt": "What's kubernetes?", + "inspectTemplate": "projects/globant-gke-ai-resources/locations/us-central1/inspectTemplates/gke-rag-application-inspect-template", #"projects/gke-ai-eco-dev/locations/global/inspectTemplates/DO-NOT-DELETE-e2e-test-inspect-template", + "deidentifyTemplate": "projects/globant-gke-ai-resources/locations/us-central1/deidentifyTemplates/gke-rag-application-deidentify-template" 
#"projects/gke-ai-eco-dev/locations/global/deidentifyTemplates/DO-NOT-DELETE-e2e-test-de-identify-template", + }, + ] + + for testcase in testcases: + prompt = testcase["prompt"] + inspectTemplate = testcase["inspectTemplate"] + deidentifyTemplate = testcase["deidentifyTemplate"] + + print(f"Testing prompt: {prompt}") + data = {"prompt": prompt, "inspectTemplate": inspectTemplate, "deidentifyTemplate": deidentifyTemplate} + json_payload = json.dumps(data) + + headers = {'Content-Type': 'application/json'} + response = requests.post(prompt_url, data=json_payload, headers=headers) + response.raise_for_status() + + response = response.json() + text = response['response']['text'] + + + print(f"Reply: {text}") + + assert response is not None, f"No response found: {response}" + assert text is not None, "No text found" + except Exception as err: + print(err) + raise err + +if __name__ == "__main__": + prompt_url = sys.argv[1] + test_prompts(prompt_url) + test_prompts_nlp(prompt_url) + test_prompts_dlp(prompt_url) \ No newline at end of file diff --git a/cloudbuild.yaml b/cloudbuild.yaml index d3c22f362..59d99b7a0 100644 --- a/cloudbuild.yaml +++ b/cloudbuild.yaml @@ -259,15 +259,15 @@ steps: echo "pass" > /workspace/rag_frontend_result.txt cd /workspace/ - sed -i "s//$$KAGGLE_USERNAME/g" ./applications/rag/example_notebooks/rag-kaggle-ray-sql-interactive.ipynb - sed -i "s//$$KAGGLE_KEY/g" ./applications/rag/example_notebooks/rag-kaggle-ray-sql-interactive.ipynb - gsutil cp ./applications/rag/example_notebooks/rag-kaggle-ray-sql-interactive.ipynb gs://gke-aieco-rag-$SHORT_SHA-$_BUILD_ID/ + sed -i "s//$$KAGGLE_USERNAME/g" ./applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb + sed -i "s//$$KAGGLE_KEY/g" ./applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb + gsutil cp ./applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb gs://gke-aieco-rag-$SHORT_SHA-$_BUILD_ID/ kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID 
$(kubectl get pod -l app=jupyterhub,component=hub -n rag-$SHORT_SHA-$_BUILD_ID -o jsonpath="{.items[0].metadata.name}") -- jupyterhub token admin --log-level=CRITICAL | xargs python3 ./applications/rag/notebook_starter.py # Wait for jupyterhub to trigger notebook pod startup sleep 5s kubectl wait --for=condition=Ready pod/jupyter-admin -n rag-$SHORT_SHA-$_BUILD_ID --timeout=500s - kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- jupyter nbconvert --to script /data/rag-kaggle-ray-sql-interactive.ipynb - kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- ipython /data/rag-kaggle-ray-sql-interactive.py + kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- jupyter nbconvert --to script /data/rag-data-ingest-with-kubernetes-docs.ipynb + kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- ipython /data/rag-data-ingest-with-kubernetes-docs.py python3 ./applications/rag/tests/test_rag.py "http://127.0.0.1:8081/prompt" echo "pass" > /workspace/rag_prompt_result.txt From d05e198ab2a7d9bc7894096b861f7a4dd0e4f76e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Germ=C3=A1n=20Grandas?= Date: Wed, 18 Sep 2024 14:46:31 +0000 Subject: [PATCH 3/8] Reverting change --- cloudbuild.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/cloudbuild.yaml b/cloudbuild.yaml index 70ed92d13..3e56f9085 100644 --- a/cloudbuild.yaml +++ b/cloudbuild.yaml @@ -384,6 +384,7 @@ substitutions: _USER_NAME: github _AUTOPILOT_CLUSTER: "false" _BUILD_ID: ${BUILD_ID:0:8} +logsBucket: gs://ai-on-gke-build-logs options: substitutionOption: 'ALLOW_LOOSE' machineType: 'E2_HIGHCPU_8' From 67fa74bbe5dd9ea18dbf8815dc411a8193eecbb2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Germ=C3=A1n=20Grandas?= Date: Wed, 18 Sep 2024 10:43:35 -0500 Subject: [PATCH 4/8] Fixing issue converting notebook to script. 
--- cloudbuild.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/cloudbuild.yaml b/cloudbuild.yaml index 3e56f9085..897273f10 100644 --- a/cloudbuild.yaml +++ b/cloudbuild.yaml @@ -266,7 +266,7 @@ steps: # Wait for jupyterhub to trigger notebook pod startup sleep 5s kubectl wait --for=condition=Ready pod/jupyter-admin -n rag-$SHORT_SHA-$_BUILD_ID --timeout=500s - kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- jupyter nbconvert --to script /data/rag-data-ingest-with-kubernetes-docs.ipynb + kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- jupyter nbconvert --to script /data/rag-data-ingest-with-kubernetes-docs.ipynb --to python kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- ipython /data/rag-data-ingest-with-kubernetes-docs.py python3 ./applications/rag/tests/test_rag.py "http://127.0.0.1:8081/prompt" @@ -395,4 +395,4 @@ availableSecrets: - versionName: projects/gke-ai-eco-dev/secrets/cloudbuild-kaggle-username/versions/latest env: 'KAGGLE_USERNAME' - versionName: projects/gke-ai-eco-dev/secrets/cloudbuild-kaggle-key/versions/latest - env: 'KAGGLE_KEY' \ No newline at end of file + env: 'KAGGLE_KEY' From 819773979729768d58d663100fcf24334767da12 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Germ=C3=A1n=20Grandas?= Date: Wed, 18 Sep 2024 11:29:38 -0500 Subject: [PATCH 5/8] Update cloudbuild.yaml to fix generation of script. 
--- cloudbuild.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cloudbuild.yaml b/cloudbuild.yaml index 897273f10..feb900484 100644 --- a/cloudbuild.yaml +++ b/cloudbuild.yaml @@ -266,7 +266,7 @@ steps: # Wait for jupyterhub to trigger notebook pod startup sleep 5s kubectl wait --for=condition=Ready pod/jupyter-admin -n rag-$SHORT_SHA-$_BUILD_ID --timeout=500s - kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- jupyter nbconvert --to script /data/rag-data-ingest-with-kubernetes-docs.ipynb --to python + kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- jupyter nbconvert --to python /data/rag-data-ingest-with-kubernetes-docs.ipynb kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- ipython /data/rag-data-ingest-with-kubernetes-docs.py python3 ./applications/rag/tests/test_rag.py "http://127.0.0.1:8081/prompt" From f54e7ed84e9e61756bc2fdb0b4952f24de8a470e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Germ=C3=A1n=20Grandas?= Date: Wed, 18 Sep 2024 18:52:03 +0000 Subject: [PATCH 6/8] updating notebook variables --- ...rag-data-ingest-with-kubernetes-docs.ipynb | 309 ++++++++---------- 1 file changed, 131 insertions(+), 178 deletions(-) diff --git a/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb b/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb index 7dd40f32d..9cbd84497 100644 --- a/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb +++ b/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb @@ -24,23 +24,23 @@ "colab": { "base_uri": "https://localhost:8080/" }, - "id": "k8d6_U2sbaJ_", "executionInfo": { + "elapsed": 569, "status": "ok", "timestamp": 1721926267799, - "user_tz": 300, - "elapsed": 569, "user": { "displayName": "", "userId": "" - } + }, + "user_tz": 300 }, + "id": "k8d6_U2sbaJ_", "outputId": "e15c65de-1382-4923-a3ee-15b3f3f21f86" }, "outputs": [ { - 
"output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ "fatal: destination path '/data/kubernetes-docs' already exists and is not an empty directory.\n" ] @@ -53,43 +53,38 @@ }, { "cell_type": "markdown", - "source": [ - "- Install the required packages" - ], "metadata": { "id": "iRtu4buBamab" - } + }, + "source": [ + "- Install the required packages" + ] }, { "cell_type": "code", - "source": [ - "!pip install pgvector\n", - "!pip install langchain langchain-community sentence_transformers unstructured[pdf]\n", - "!pip install google cloud-sql-python-connector[pg8000] langchain-google-cloud-sql-pg" - ], + "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "collapsed": true, - "id": "xRh2Gn1rcBJY", "executionInfo": { + "elapsed": 35573, "status": "ok", "timestamp": 1721926317024, - "user_tz": 300, - "elapsed": 35573, "user": { "displayName": "", "userId": "" - } + }, + "user_tz": 300 }, + "id": "xRh2Gn1rcBJY", "outputId": "f0deb85d-1d5c-41d0-b6ff-e3ed86bd3042" }, - "execution_count": 2, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ "Requirement already satisfied: pgvector in /usr/local/lib/python3.10/dist-packages (0.3.2)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from pgvector) (1.25.2)\n", @@ -294,32 +289,37 @@ "Requirement already satisfied: mypy-extensions>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain-community<0.3.0,>=0.0.18->langchain-google-cloud-sql-pg) (1.0.0)\n" ] } + ], + "source": [ + "!pip install pgvector\n", + "!pip install langchain langchain-community sentence_transformers unstructured[pdf]\n", + "!pip install google cloud-sql-python-connector[pg8000] langchain-google-cloud-sql-pg" ] }, { "cell_type": "markdown", - "source": [ - " - Import required functions and libraries" - ], "metadata": { "id": "yZybYPPvaqcS" - } + }, 
+ "source": [ + " - Import required functions and libraries" + ] }, { "cell_type": "code", "execution_count": 3, "metadata": { - "id": "FWqsMMdQbaKA", "executionInfo": { + "elapsed": 1322, "status": "ok", "timestamp": 1721926369825, - "user_tz": 300, - "elapsed": 1322, "user": { "displayName": "", "userId": "" - } - } + }, + "user_tz": 300 + }, + "id": "FWqsMMdQbaKA" }, "outputs": [], "source": [ @@ -345,90 +345,43 @@ "Let's now set up a connection to your CloudSQL database:" ] }, - { - "cell_type": "code", - "source": [ - "%env ENVIRONMENT=development\n", - "%env PROJECT_ID=globant-gke-ai-resources\n", - "%env CLOUDSQL_INSTANCE_REGION=us-west1\n", - "%env CLOUDSQL_INSTANCE=rag-application-test\n", - "%env EMBEDDINGS_TABLE_NAME=kubernetes_docs\n", - "%env DB_USERNAME=main-user\n", - "%env DB_PASS=gSo{I@YMyd8]&\\34" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "DegY7bswdlSB", - "executionInfo": { - "status": "ok", - "timestamp": 1721926389134, - "user_tz": 300, - "elapsed": 338, - "user": { - "displayName": "", - "userId": "" - } - }, - "outputId": "ca5aa526-bace-469d-8808-162d9e934be2" - }, - "execution_count": 4, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "env: ENVIRONMENT=development\n", - "env: PROJECT_ID=globant-gke-ai-resources\n", - "env: CLOUDSQL_INSTANCE_REGION=us-west1\n", - "env: CLOUDSQL_INSTANCE=rag-application-test\n", - "env: EMBEDDINGS_TABLE_NAME=kubernetes_docs\n", - "env: DB_USERNAME=main-user\n", - "env: DB_PASS=gSo{I@YMyd8]&\\34\n" - ] - } - ] - }, { "cell_type": "code", "execution_count": 5, "metadata": { - "id": "rvK19kzwbaKB", "executionInfo": { + "elapsed": 457, "status": "ok", "timestamp": 1721926402495, - "user_tz": 300, - "elapsed": 457, "user": { "displayName": "", "userId": "" - } - } + }, + "user_tz": 300 + }, + "id": "rvK19kzwbaKB" }, "outputs": [], "source": [ - "ENVIRONMENT = os.environ.get(\"ENVIRONMENT\")\n", - "\n", - "GCP_PROJECT_ID = 
os.environ.get(\"PROJECT_ID\")\n", - "GCP_CLOUD_SQL_REGION = os.environ.get(\"CLOUDSQL_INSTANCE_REGION\")\n", - "GCP_CLOUD_SQL_INSTANCE = os.environ.get(\"CLOUDSQL_INSTANCE\")\n", + "# initialize parameters\n", + "INSTANCE_CONNECTION_NAME = os.environ.get(\"CLOUDSQL_INSTANCE_CONNECTION_NAME\", \"\")\n", + "print(f\"Your instance connection name is: {INSTANCE_CONNECTION_NAME}\")\n", + "cloud_variables = INSTANCE_CONNECTION_NAME.split(\":\")\n", "\n", - "DB_NAME = os.environ.get(\"DB_NAME\", \"pgvector-database\")\n", - "VECTOR_EMBEDDINGS_TABLE_NAME = os.environ.get(\"EMBEDDINGS_TABLE_NAME\", \"\")\n", + "GCP_PROJECT_ID = os.environ.get(\"GCP_PROJECT_ID\", cloud_variables[0])\n", + "GCP_CLOUD_SQL_REGION = os.environ.get(\"CLOUDSQL_INSTANCE_REGION\", cloud_variables[1])\n", + "GCP_CLOUD_SQL_INSTANCE = os.environ.get(\"CLOUDSQL_INSTANCE\", cloud_variables[2])\n", "\n", - "try:\n", - " db_username_file = open(\"/etc/secret-volume/username\", \"r\")\n", - " DB_USER = db_username_file.read()\n", - " db_username_file.close()\n", + "DB_NAME = os.environ.get(\"INSTANCE_CONNECTION_NAME\", \"pgvector-database\")\n", + "VECTOR_EMBEDDINGS_TABLE_NAME = os.environ.get(\"EMBEDDINGS_TABLE_NAME\", \"rag_vector_embeddings\")\n", "\n", - " db_password_file = open(\"/etc/secret-volume/password\", \"r\")\n", - " DB_PASS = db_password_file.read()\n", - " db_password_file.close()\n", - "except:\n", - " DB_USER = os.environ.get(\"DB_USERNAME\", \"postgres\")\n", - " DB_PASS = os.environ.get(\"DB_PASS\", \"postgres\")\n", + "db_username_file = open(\"/etc/secret-volume/username\", \"r\")\n", + "DB_USER = db_username_file.read()\n", + "db_username_file.close()\n", "\n", + "db_password_file = open(\"/etc/secret-volume/password\", \"r\")\n", + "DB_PASS = db_password_file.read()\n", + "db_password_file.close()\n", "\n", "# Create Cloud SQL Postgres Engine\n", "pg_engine = PostgresEngine.from_instance(\n", @@ -454,17 +407,17 @@ "cell_type": "code", "execution_count": 6, "metadata": { - "id": 
"_MydulMdbaKC", "executionInfo": { + "elapsed": 3, "status": "ok", "timestamp": 1721926424908, - "user_tz": 300, - "elapsed": 3, "user": { "displayName": "", "userId": "" - } - } + }, + "user_tz": 300 + }, + "id": "_MydulMdbaKC" }, "outputs": [], "source": [ @@ -494,17 +447,17 @@ "cell_type": "code", "execution_count": 7, "metadata": { - "id": "0EzU4YhrbaKC", "executionInfo": { + "elapsed": 2350, "status": "ok", "timestamp": 1721926432388, - "user_tz": 300, - "elapsed": 2350, "user": { "displayName": "", "userId": "" - } - } + }, + "user_tz": 300 + }, + "id": "0EzU4YhrbaKC" }, "outputs": [], "source": [ @@ -517,37 +470,37 @@ }, { "cell_type": "markdown", - "source": [ - "# Initialize Vector Store" - ], "metadata": { "id": "aIAiofJTj4Fh" - } + }, + "source": [ + "# Initialize Vector Store" + ] }, { "cell_type": "code", "execution_count": 8, "metadata": { - "id": "oCbOnnBIbaKD", + "colab": { + "base_uri": "https://localhost:8080/" + }, "executionInfo": { + "elapsed": 14376, "status": "ok", "timestamp": 1721926463546, - "user_tz": 300, - "elapsed": 14376, "user": { "displayName": "", "userId": "" - } - }, - "colab": { - "base_uri": "https://localhost:8080/" + }, + "user_tz": 300 }, + "id": "oCbOnnBIbaKD", "outputId": "ed2702cf-ce04-4711-eaa7-5e644b5290ec" }, "outputs": [ { - "output_type": "stream", "name": "stderr", + "output_type": "stream", "text": [ "/usr/local/lib/python3.10/dist-packages/langchain_core/_api/deprecation.py:139: LangChainDeprecationWarning: The class `HuggingFaceEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 0.3.0. An updated version of the class exists in the langchain-huggingface package and should be used instead. 
To use it run `pip install -U langchain-huggingface` and import as `from langchain_huggingface import HuggingFaceEmbeddings`.\n", " warn_deprecated(\n", @@ -591,23 +544,23 @@ "colab": { "base_uri": "https://localhost:8080/" }, - "id": "OzXCfwNAbaKD", "executionInfo": { + "elapsed": 696702, "status": "ok", "timestamp": 1721927166341, - "user_tz": 300, - "elapsed": 696702, "user": { "displayName": "", "userId": "" - } + }, + "user_tz": 300 }, + "id": "OzXCfwNAbaKD", "outputId": "d573387c-66df-423f-a000-750334de97a0" }, "outputs": [ { - "output_type": "stream", "name": "stderr", + "output_type": "stream", "text": [ "100%|██████████| 6/6 [11:36<00:00, 116.07s/it]\n" ] @@ -620,66 +573,61 @@ }, { "cell_type": "code", - "source": [ - "splitter = RecursiveCharacterTextSplitter(\n", - " chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP, length_function=len\n", - ")\n", - "\n", - "splits = splitter.split_documents(documents)" - ], + "execution_count": 10, "metadata": { - "id": "O7eBZG7wiWBa", "executionInfo": { + "elapsed": 629, "status": "ok", "timestamp": 1721927196274, - "user_tz": 300, - "elapsed": 629, "user": { "displayName": "", "userId": "" - } - } + }, + "user_tz": 300 + }, + "id": "O7eBZG7wiWBa" }, - "execution_count": 10, - "outputs": [] + "outputs": [], + "source": [ + "splitter = RecursiveCharacterTextSplitter(\n", + " chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP, length_function=len\n", + ")\n", + "\n", + "splits = splitter.split_documents(documents)" + ] }, { "cell_type": "markdown", - "source": [ - "### Add the splits on the vector store" - ], "metadata": { "id": "UwCj9x5Jl5iq" - } + }, + "source": [ + "### Add the splits on the vector store" + ] }, { "cell_type": "code", - "source": [ - "ids = [str(uuid.uuid4()) for i in range(len(splits))]\n", - "vector_store.add_documents(splits, ids)" - ], + "execution_count": 11, "metadata": { - "collapsed": true, - "id": "Wqd3cKgntEYw", "colab": { "base_uri": "https://localhost:8080/" }, + "collapsed": true, 
"executionInfo": { + "elapsed": 2429339, "status": "ok", "timestamp": 1721929629134, - "user_tz": 300, - "elapsed": 2429339, "user": { "displayName": "", "userId": "" - } + }, + "user_tz": 300 }, + "id": "Wqd3cKgntEYw", "outputId": "f9329c53-49fa-488c-caf1-ae74e1db128c" }, - "execution_count": 11, "outputs": [ { - "output_type": "execute_result", "data": { "text/plain": [ "['3db34d89-aca6-4152-a2f5-09e26d932652',\n", @@ -1685,54 +1633,49 @@ " ...]" ] }, + "execution_count": 11, "metadata": {}, - "execution_count": 11 + "output_type": "execute_result" } + ], + "source": [ + "ids = [str(uuid.uuid4()) for i in range(len(splits))]\n", + "vector_store.add_documents(splits, ids)" ] }, { "cell_type": "markdown", - "source": [ - "## Trying the Vector Storage" - ], "metadata": { "id": "7Vnj-Z0pIYDM" - } + }, + "source": [ + "## Trying the Vector Storage" + ] }, { "cell_type": "code", - "source": [ - "query = \"Hello, what's kubernetes\"\n", - "query_vector = embeddings_service.embed_query(query)\n", - "docs = vector_store.similarity_search_by_vector(query_vector, k=4)\n", - "\n", - "for i, document in enumerate(docs):\n", - " print(f\"Result #{i+1}\")\n", - " print(document.page_content)\n", - " print(\"-\" * 100)" - ], + "execution_count": 15, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, - "id": "tdITIoEAIcfr", "executionInfo": { + "elapsed": 314, "status": "ok", "timestamp": 1721932055248, - "user_tz": 300, - "elapsed": 314, "user": { "displayName": "", "userId": "" - } + }, + "user_tz": 300 }, + "id": "tdITIoEAIcfr", "outputId": "102198aa-ca71-4a4c-a5e7-58afd1a2884f" }, - "execution_count": 15, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ "Result #1\n", "Overview\n", @@ -1786,10 +1729,24 @@ "----------------------------------------------------------------------------------------------------\n" ] } + ], + "source": [ + "query = \"Hello, what's kubernetes\"\n", + "query_vector = 
embeddings_service.embed_query(query)\n", + "docs = vector_store.similarity_search_by_vector(query_vector, k=4)\n", + "\n", + "for i, document in enumerate(docs):\n", + " print(f\"Result #{i+1}\")\n", + " print(document.page_content)\n", + " print(\"-\" * 100)" ] } ], "metadata": { + "colab": { + "name": "rag-data-ingest-with-kubernetes-docs.ipynb", + "provenance": [] + }, "kernelspec": { "display_name": "Python 3", "language": "python", @@ -1798,12 +1755,8 @@ "language_info": { "name": "python", "version": "3.9.13" - }, - "colab": { - "provenance": [], - "name": "rag-data-ingest-with-kubernetes-docs.ipynb" } }, "nbformat": 4, "nbformat_minor": 0 -} \ No newline at end of file +} From 37b5024554ebb6e2f247c79ad3528a2f2ede9ff8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Germ=C3=A1n=20Grandas?= Date: Thu, 19 Sep 2024 11:08:30 +0000 Subject: [PATCH 7/8] Adding iptype configuration to db engine --- .../example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb | 1 + 1 file changed, 1 insertion(+) diff --git a/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb b/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb index 9cbd84497..6b6a45acd 100644 --- a/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb +++ b/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb @@ -391,6 +391,7 @@ " database=DB_NAME,\n", " user=DB_USER,\n", " password=DB_PASS,\n", + " ip_type=IPTypes.PRIVATE\n", ")" ] }, From b046e93e7e5196f77759aef02d9576e3d0fa970b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Germ=C3=A1n=20Grandas?= Date: Thu, 19 Sep 2024 14:28:01 +0000 Subject: [PATCH 8/8] Updating Rag application README --- applications/rag/README.md | 32 ++++++-------------------------- 1 file changed, 6 insertions(+), 26 deletions(-) diff --git a/applications/rag/README.md b/applications/rag/README.md index cd8c3d016..81ddea3de 100644 --- a/applications/rag/README.md +++ b/applications/rag/README.md @@ 
-17,7 +17,7 @@ RAG uses a semantically searchable knowledge base (like vector search) to retrie 5. A [Jupyter](https://docs.jupyter.org/en/latest/) notebook running on GKE that reads the dataset using GCS fuse driver integrations and runs a Ray job to populate the vector DB. 3. A front end chat interface running on GKE that prompts the inference server with context from the vector DB. -This tutorial walks you through installing the RAG infrastructure in a GCP project, generating vector embeddings for a sample [Kaggle Netflix shows](https://www.kaggle.com/datasets/shivamb/netflix-shows) dataset and prompting the LLM with context. +This tutorial walks you through installing the RAG infrastructure in a GCP project, generating vector embeddings for a sample [Kubernetes Docs](https://github.com/dohsimpson/kubernetes-doc-pdf) dataset and prompting the LLM with context. # Prerequisites @@ -74,7 +74,7 @@ This section sets up the RAG infrastructure in your GCP project using Terraform. # Generate vector embeddings for the dataset -This section generates the vector embeddings for your input dataset. Currently, the default dataset is [Netflix shows](https://www.kaggle.com/datasets/shivamb/netflix-shows). We will use a Jupyter notebook to run a Ray job that generates the embeddings & populates them into the `pgvector` instance created above. +This section generates the vector embeddings for your input dataset. Currently, the default dataset is [Kubernetes docs](https://github.com/dohsimpson/kubernetes-doc-pdf). We will use a Jupyter notebook to generate the embeddings & populate them into the `pgvector` instance created above. Set your the namespace, cluster name and location from `workloads.tfvars`): @@ -108,30 +108,10 @@ gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${CLUSTER_L 2. Load the notebook: - Once logged in to JupyterHub, choose the `CPU` preset with `Default` storage. 
- - Click [File] -> [Open From URL] and paste: `https://raw.githubusercontent.com/GoogleCloudPlatform/ai-on-gke/main/applications/rag/example_notebooks/rag-kaggle-ray-sql-interactive.ipynb` - -3. Configure Kaggle: - - Create a [Kaggle account](https://www.kaggle.com/account/login?phase=startRegisterTab&returnUrl=%2F). - - [Generate an API token](https://www.kaggle.com/settings/account). See [further instructions](https://www.kaggle.com/docs/api#authentication). This token is used in the notebook to access the [Kaggle Netflix shows](https://www.kaggle.com/datasets/shivamb/netflix-shows) dataset. - - Replace the variables in the 1st cell of the notebook with your Kaggle credentials (can be found in the `kaggle.json` file created while generating the API token): - * `KAGGLE_USERNAME` - * `KAGGLE_KEY` - -4. Generate vector embeddings: Run all the cells in the notebook to generate vector embeddings for the Netflix shows dataset (https://www.kaggle.com/datasets/shivamb/netflix-shows) and store them in the `pgvector` CloudSQL instance via a Ray job. - * When the last cell says the job has succeeded (eg: `Job 'raysubmit_APungAw6TyB55qxk' succeeded`), the vector embeddings have been generated and we can launch the frontend chat interface. Note that running the job can take up to 10 minutes. - * Ray may take several minutes to create the runtime environment. During this time, the job will appear to be missing (e.g. `Status message: PENDING`). - * Connect to the Ray dashboard to check the job status or logs: - - If IAP is disabled (`ray_dashboard_add_auth = false`): - - `kubectl port-forward -n ${NAMESPACE} service/ray-cluster-kuberay-head-svc 8265:8265` - - Go to `localhost:8265` in a browser - - If IAP is enabled (`ray_dashboard_add_auth = true`): - - Fetch the domain: `terraform output ray-dashboard-managed-cert` - - If you used a custom domain, ensure you configured your DNS as described above. 
- - Verify the domain status is `Active`: - - `kubectl get managedcertificates ray-dashboard-managed-cert -n ${NAMESPACE} --output jsonpath='{.status.domainStatus[0].status}'` - - Note: This can take up to 20 minutes to propagate. - - Once the domain status is Active, go to the domain in a browser and login with your Google credentials. - - To add additional users to your frontend application, go to [Google Cloud Platform IAP](https://console.cloud.google.com/security/iap), select the `rag/ray-cluster-kuberay-head-svc` service and add principals with the role `IAP-secured Web App User`. + - Click [File] -> [Open From URL] and paste: `https://raw.githubusercontent.com/GoogleCloudPlatform/ai-on-gke/main/applications/rag/example_notebooks/rag-data-ingest-with-kubernetes-docs.ipynb` + + +3. Generate vector embeddings: Run all the cells in the notebook to generate vector embeddings for the [Kubernetes documentation](https://github.com/dohsimpson/kubernetes-doc-pdf) and store them in the `pgvector` CloudSQL instance. # Launch the frontend chat interface