chore: add documentation for Hybrid Search #233

Open
wants to merge 3 commits into
base: main
from

vishwarajanand
Collaborator

Adds documentation for hybrid search:

  • examples/pg_vectorstore_how_to.ipynb -> Adds a how-to for hybrid search: initialize a table, initialize the vector store object, add documents, etc.

Reviewer decision points:

  1. Should we add more content to the README?
    • It might make the README bulkier and harder to read.
  2. Should we write a separate markdown document for hybrid search?
    • The challenge is curating specific examples.

@vishwarajanand vishwarajanand marked this pull request as ready for review July 8, 2025 08:35
@@ -78,6 +78,7 @@ print(docs)

> [!TIP]
> All synchronous functions have corresponding asynchronous functions
> PGVectorStore also supports Hybrid Search which combines multiple search strategies to improve search results.
Collaborator

I agree that a note in the README is helpful. We should make sure the code snippet is clear and concise. So instead of this, can we add a header for Hybrid Search and the smallest code snippet to get started, then link to the how-to?
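
To make "combines multiple search strategies" concrete, here is a minimal, library-independent sketch of one common fusion approach, reciprocal rank fusion. It is an illustration only and not PGVectorStore's API: the function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs are all assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. A higher fused score means a better match.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by doc_id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector (semantic) search and a keyword search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the behavior a short README snippet for hybrid search would want to convey.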

Comment on lines +710 to +718
"- **`id_column=\"product_id\"`**: ID column uniquely identifies each row in the products table.\n",
"\n",
"- **`content_column=\"description\"`**: The `description` column contains text descriptions of each product. This text is used by the `embedding_service` to create vectors that go in embedding_column and represent the semantic meaning of each description.\n",
"\n",
"- **`embedding_column=\"embed\"`**: The `embed` column stores the vectors created from the product descriptions. These vectors are used to find products with similar descriptions.\n",
"\n",
"- **`metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"]`**: These columns are treated as metadata for each product. Metadata provides additional information about a product, such as its name, category, price, quantity available, SKU (Stock Keeping Unit), and an image URL. This information is useful for displaying product details in search results or for filtering and categorization.\n",
"\n",
"- **`metadata_json_column=\"metadata\"`**: The `metadata` column can store any additional information about the products in a flexible JSON format. This allows for storing varied and complex data that doesn't fit into the standard columns.\n"
Collaborator

This info is already provided above. Please outline how to use the HybridSearchConfig.

Comment on lines +745 to +758
"# If a hybrid search config is provided during vector store table creation,\n",
"# the specified TSV column will be automatically created.\n",
"await pg_engine.ainit_vectorstore_table(\n",
" table_name=TABLE_NAME,\n",
" # schema_name=SCHEMA_NAME,\n",
" vector_size=VECTOR_SIZE,\n",
" id_column=\"product_id\",\n",
" content_column=\"description\",\n",
" embedding_column=\"embed\",\n",
" metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"],\n",
" metadata_json_column=\"metadata\",\n",
" hybrid_search_config=hybrid_search_config,\n",
" store_metadata=True,\n",
")\n",
Collaborator

Can we create individual sections for each of these notes? The inline comments are hard to read.

"cell_type": "markdown",
"metadata": {},
"source": [
"# Hybrid Search Vector Store\n",
Collaborator

Please provide the easy way to get started first, then a section on how to customize.
