In this repository, you'll find the practical essence of the article Hackernoon article: "Semantic Search Queries Return More Informed Results" distilled into code (albeit updated). The author points out a common hurdle: the struggle with searching through our own unstructured data. Weaviate, an open-source vector search engine, is introduced as a sturdy bridge over this hurdle. Following the narrative, this codebase sets up Weaviate, harnesses the open transformer model sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
for vectorization through the vectorization module, and dives into a dataset of wine reviews. This repository demonstrates how to set up Weaviate with your data and get straight to firing up search queries.
(TODO: Add demo video)
Before you can run the project, you need to have Docker, Docker Compose, and Python installed on your machine. Follow the instructions below to install the prerequisites:
- For Windows and Mac:
- Download and install Docker Desktop from Docker's official website.
- For Linux:
- Run the following commands in your terminal:
sudo apt-get update sudo apt-get install docker-ce docker-ce-cli containerd.io
- Run the following commands in your terminal:
- For Windows and Mac:
- Docker Compose is included with Docker Desktop.
- For Linux:
- Run the following command in your terminal:
sudo apt install docker-compose
- Run the following command in your terminal:
- Download and install the latest version of Python from Python's official website.
- Verify the installation by running the following command in your terminal:
python --version
- Install virtualenv (if not already installed):
pip install virtualenv
- Create a Virtual Environment:
Navigate to the directory where you want to create your virtual environment, then run:
virtualenv <name_of_virtualenv>
- Activate the Virtual Environment:
On Windows, run:
On macOS and Linux, run:
.\<name_of_virtualenv>\Scripts\activate
source <name_of_virtualenv>/bin/activate
- Install Python requirements:
pip install -r requirements.txt
- Start up Weaviate:
docker-compose up -d
. Once completed, Weaviate is running onhttp://localhost:8080
. - Run
python import.py
to import 2500 wines to Weaviate. - The data is now stored in the Weaviate instance. You can experiment with it using a python notebook or a python file.
This folder contains Wine review data, retrieved from Kaggle (from WineEnthusiast).
This project's origin is here and the Hackernoon article: "Semantic Search Queries Return More Informed Results".