chore: add documentation for Hybrid Seach #233
base: main
@@ -78,6 +78,7 @@ print(docs)

> [!TIP]
> All synchronous functions have corresponding asynchronous functions
> PGVectorStore also supports Hybrid Search which combines multiple search strategies to improve search results.
I agree that a note in the README is helpful, but the code snippet should be clear and concise. Instead of this, can we add a header for Hybrid Search with the smallest code snippet needed to get started, then link to the how-to?
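For context on what "combines multiple search strategies" can mean in practice, here is a minimal sketch of reciprocal rank fusion, one common way to merge a vector-similarity ranking with a keyword ranking. It is illustrative only and not the PGVectorStore implementation: the function name `rrf_fuse`, the constant `k=60`, and the sample product IDs are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(
    rankings: Sequence[Sequence[str]], k: int = 60, top_k: int = 4
) -> List[str]:
    """Merge ranked ID lists with reciprocal rank fusion (hypothetical helper)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Each list contributes 1 / (k + rank) for every ID it contains.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_k]


# Hypothetical results from a vector search and a keyword search.
dense_hits = ["p1", "p2", "p3"]
keyword_hits = ["p3", "p1", "p4"]
fused = rrf_fuse([dense_hits, keyword_hits])
print(fused)
```

IDs that appear near the top of both lists accumulate the largest scores, which is why a short README snippet plus a link to the how-to is probably enough to convey the idea.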
"- **`id_column=\"product_id\"`**: ID column uniquely identifies each row in the products table.\n",
"\n",
"- **`content_column=\"description\"`**: The `description` column contains text descriptions of each product. This text is used by the `embedding_service` to create vectors that go in embedding_column and represent the semantic meaning of each description.\n",
"\n",
"- **`embedding_column=\"embed\"`**: The `embed` column stores the vectors created from the product descriptions. These vectors are used to find products with similar descriptions.\n",
"\n",
"- **`metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"]`**: These columns are treated as metadata for each product. Metadata provides additional information about a product, such as its name, category, price, quantity available, SKU (Stock Keeping Unit), and an image URL. This information is useful for displaying product details in search results or for filtering and categorization.\n",
"\n",
"- **`metadata_json_column=\"metadata\"`**: The `metadata` column can store any additional information about the products in a flexible JSON format. This allows for storing varied and complex data that doesn't fit into the standard columns.\n" |
This info is already provided above. Please outline how to use the HybridSearchConfig.
"# If a hybrid search config is provided during vector store table creation,\n",
"# the specified TSV column will be automatically created.\n",
"await pg_engine.ainit_vectorstore_table(\n",
"    table_name=TABLE_NAME,\n",
"    # schema_name=SCHEMA_NAME,\n",
"    vector_size=VECTOR_SIZE,\n",
"    id_column=\"product_id\",\n",
"    content_column=\"description\",\n",
"    embedding_column=\"embed\",\n",
"    metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"],\n",
"    metadata_json_column=\"metadata\",\n",
"    hybrid_search_config=hybrid_search_config,\n",
"    store_metadata=True,\n",
")\n",
Can we create an individual section for each of these notes? The inline comments are hard to read.
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hybrid Search Vector Store\n",
Please provide an easy way to get started first, then a section on how to customize.
Adds documentation for hybrid search:
Reviewer decision points: