From 0fea97a2e9f202a92ab37ec45ee89c75c8a72ab0 Mon Sep 17 00:00:00 2001 From: Quentin de Quelen Date: Tue, 3 Sep 2024 17:03:26 +0200 Subject: [PATCH] Add a guides for generating embeddings with Mistral, OpenAI, Voyage, and Cloudflare (#2965) * Add a guide for generating embeddings with Mistral * Update guides/embedders/mistral.mdx Co-authored-by: Louis Dureuil * add links * comments fixes from Laurent * Add guides for Cloudflare, Cohere, OpenAI and Voyage * Update config/sidebar-guides.json Co-authored-by: Laurent Cazanove * Update config/sidebar-guides.json Co-authored-by: Laurent Cazanove * Update config/sidebar-guides.json Co-authored-by: Laurent Cazanove * Update config/sidebar-guides.json Co-authored-by: Laurent Cazanove * Update config/sidebar-guides.json Co-authored-by: Laurent Cazanove * Update guides/embedders/voyage.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/voyage.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/voyage.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/voyage.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/voyage.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/cloudflare.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/cloudflare.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/cloudflare.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/cloudflare.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/cloudflare.mdx Co-authored-by: Laurent Cazanove * Update guides/embedders/cloudflare.mdx Co-authored-by: Laurent Cazanove * Apply suggestions from code review Co-authored-by: Laurent Cazanove --------- Co-authored-by: Louis Dureuil Co-authored-by: Laurent Cazanove --- config/sidebar-guides.json | 25 ++++++++ guides/embedders/cloudflare.mdx | 99 +++++++++++++++++++++++++++++++ guides/embedders/cohere.mdx | 101 ++++++++++++++++++++++++++++++++ guides/embedders/mistral.mdx | 97 ++++++++++++++++++++++++++++++ guides/embedders/openai.mdx | 86 +++++++++++++++++++++++++++ guides/embedders/voyage.mdx | 101 ++++++++++++++++++++++++++++++++ 6 files changed, 509 insertions(+) create mode 100644 guides/embedders/cloudflare.mdx create mode 100644 guides/embedders/cohere.mdx create mode 100644 guides/embedders/mistral.mdx create mode 100644 guides/embedders/openai.mdx create mode 100644 guides/embedders/voyage.mdx diff --git a/config/sidebar-guides.json b/config/sidebar-guides.json index fd6e20655..8ad5c8a3e 100644 --- a/config/sidebar-guides.json +++ b/config/sidebar-guides.json @@ -64,6 +64,31 @@ "source": "guides/computing_hugging_face_embeddings_gpu.mdx", "label": "Computing Hugging Face embeddings with the GPU", "slug": "computing_hugging_face_embeddings_gpu" + }, + { + "source": "guides/embedders/cloudflare.mdx", + "label": "Semantic search with Cloudflare embeddings", + "slug": "cloudflare" + }, + { + "source": "guides/embedders/cohere.mdx", + "label": "Semantic search with Cohere embeddings", + "slug": "cohere" + }, + { + "source": "guides/embedders/mistral.mdx", + "label": "Semantic search with Mistral embeddings", + "slug": "mistral" + }, + { + "source": "guides/embedders/openai.mdx", + "label": "Semantic search with OpenAI embeddings", + "slug": "openai" + }, + { + "source": "guides/embedders/voyage.mdx", + "label": "Semantic search with Voyage embeddings", + "slug": "voyage" } ] }, diff --git a/guides/embedders/cloudflare.mdx b/guides/embedders/cloudflare.mdx new file mode 100644 index 000000000..c27a33daa --- /dev/null +++ 
b/guides/embedders/cloudflare.mdx
@@ -0,0 +1,99 @@
+---
+title: Semantic Search with Cloudflare Worker AI Embeddings - Meilisearch documentation
+description: This guide will walk you through the process of setting up Meilisearch with Cloudflare Worker AI embeddings to enable semantic search capabilities.
+---
+
+# Semantic search with Cloudflare Worker AI embeddings
+
+## Introduction
+
+This guide will walk you through the process of setting up Meilisearch with Cloudflare Worker AI embeddings to enable semantic search capabilities. By leveraging Meilisearch's AI features and Cloudflare Worker AI's embedding API, you can enhance your search experience and retrieve more relevant results.
+
+## Requirements
+
+To follow this guide, you'll need:
+
+- A [Meilisearch Cloud](https://www.meilisearch.com/cloud?utm_campaign=vector-search&utm_source=docs&utm_medium=cloudflare-embeddings-guide) project running version 1.10 or above with the Vector store activated.
+- A Cloudflare account with access to Worker AI and an API key. You can sign up for a Cloudflare account at [Cloudflare](https://www.cloudflare.com/).
+- Your Cloudflare account ID.
+- No backend required.
+
+## Setting up Meilisearch
+
+To set up an embedder in Meilisearch, add it to your index settings. You can refer to the [Meilisearch documentation](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=cloudflare-embeddings-guide#update-embedder-settings) for more details on updating the embedder settings.
+
+Cloudflare Worker AI offers the following embedding models:
+
+- `baai/bge-base-en-v1.5`: 768 dimensions
+- `baai/bge-large-en-v1.5`: 1024 dimensions
+- `baai/bge-small-en-v1.5`: 384 dimensions
+
+Here's an example of embedder settings for Cloudflare Worker AI:
+
+```json
+{
+  "cloudflare": {
+    "source": "rest",
+    "apiKey": "<CLOUDFLARE_API_KEY>",
+    "dimensions": 384,
+    "documentTemplate": "",
+    "url": "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai/run/@cf/<MODEL_NAME>",
+    "request": {
+      "text": ["{{text}}", "{{..}}"]
+    },
+    "response": {
+      "result": {
+        "data": ["{{embedding}}", "{{..}}"]
+      }
+    }
+  }
+}
+```
+
+In this configuration:
+
+- `source`: Specifies the source of the embedder, which is set to "rest" for using a REST API.
+- `apiKey`: Replace `<CLOUDFLARE_API_KEY>` with your actual Cloudflare API key.
+- `dimensions`: Specifies the dimensions of the embeddings. Set to 384 for `baai/bge-small-en-v1.5`, 768 for `baai/bge-base-en-v1.5`, or 1024 for `baai/bge-large-en-v1.5`.
+- `documentTemplate`: Optionally, you can provide a [custom template](/learn/ai-powered-search/getting_started_with_ai_search?utm_campaign=vector-search&utm_source=docs&utm_medium=cloudflare-embeddings-guide#documenttemplate) for generating embeddings from your documents.
+- `url`: Specifies the URL of the Cloudflare Worker AI API endpoint.
+- `request`: Defines the request structure for the Cloudflare Worker AI API, including the input parameters.
+- `response`: Defines the expected response structure from the Cloudflare Worker AI API, including the embedding data.
+
+Be careful when setting up the `url` field in your configuration. The URL contains your Cloudflare account ID (`<ACCOUNT_ID>`) and the specific model you want to use (`<MODEL_NAME>`). Make sure to replace these placeholders with your actual account ID and the desired model name (e.g., `baai/bge-small-en-v1.5`).
+
+Once you've configured the embedder settings, Meilisearch will automatically generate embeddings for your documents and store them in the vector store.
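+
+If you manage your project from code rather than the Cloud UI, the settings above can be applied with a single call to the Meilisearch settings route. The following sketch is illustrative only: the `movies` index name, the host, and the key placeholders are assumptions to replace with your own values, and the embedder object is the same one shown above.
+
+```typescript
+// Illustrative sketch: register the "cloudflare" embedder on an assumed "movies" index.
+const MEILISEARCH_HOST = "<MEILISEARCH_HOST_URL>"; // for example, your Meilisearch Cloud project URL
+const MEILISEARCH_API_KEY = "<MEILISEARCH_ADMIN_API_KEY>";
+
+async function configureCloudflareEmbedder(): Promise<void> {
+  // PATCH /indexes/{index_uid}/settings only updates the fields present in the body.
+  const response = await fetch(`${MEILISEARCH_HOST}/indexes/movies/settings`, {
+    method: "PATCH",
+    headers: {
+      Authorization: `Bearer ${MEILISEARCH_API_KEY}`,
+      "Content-Type": "application/json",
+    },
+    body: JSON.stringify({
+      embedders: {
+        cloudflare: {
+          source: "rest",
+          apiKey: "<CLOUDFLARE_API_KEY>",
+          dimensions: 384,
+          documentTemplate: "A document titled '{{doc.title}}'", // assumed field name; adapt to your documents
+          url: "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai/run/@cf/baai/bge-small-en-v1.5",
+          request: { text: ["{{text}}", "{{..}}"] },
+          response: { result: { data: ["{{embedding}}", "{{..}}"] } },
+        },
+      },
+    }),
+  });
+
+  // The update is asynchronous: Meilisearch answers with a task you can follow in the tasks queue.
+  console.log(await response.json());
+}
+
+configureCloudflareEmbedder();
+```
+
+If you prefer a client library over raw HTTP, the official Meilisearch SDKs can send the same settings payload.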
+
+Please note that Cloudflare may have rate limiting, which is managed by Meilisearch. If you have a free account, the indexation process may take some time, but Meilisearch will handle it with a retry strategy.
+
+It's recommended to monitor the tasks queue to ensure everything is running smoothly. You can access the tasks queue using the Cloud UI or the [Meilisearch API](/reference/api/tasks?utm_campaign=vector-search&utm_source=docs&utm_medium=cloudflare-embeddings-guide#get-tasks).
+
+## Testing semantic search
+
+With the embedder set up, you can now perform semantic searches using Meilisearch. When you send a search query, Meilisearch will generate an embedding for the query using the configured embedder and then use it to find the most semantically similar documents in the vector store.
+
+To perform a semantic search, you simply need to make a normal search request but include the hybrid parameter:
+
+```json
+{
+  "q": "",
+  "hybrid": {
+    "semanticRatio": 1,
+    "embedder": "cloudflare"
+  }
+}
+```
+
+In this request:
+
+- `q`: Represents the user's search query.
+- `hybrid`: Specifies the configuration for the hybrid search.
+  - `semanticRatio`: Allows you to control the balance between semantic search and traditional search. A value of 1 indicates pure semantic search, while a value of 0 represents full-text search. You can adjust this parameter to achieve a hybrid search experience.
+  - `embedder`: The name of the embedder used for generating embeddings. Make sure to use the same name as specified in the embedder configuration, which in this case is "cloudflare".
+
+You can use the Meilisearch API or client libraries to perform searches and retrieve the relevant documents based on semantic similarity.
+
+## Conclusion
+
+By following this guide, you should now have Meilisearch set up with Cloudflare Worker AI embeddings, enabling you to leverage semantic search capabilities in your application. Meilisearch's auto-batching and efficient handling of embeddings make it a powerful choice for integrating semantic search into your project.
+
+To explore further configuration options for embedders, consult the [detailed documentation about the embedder setting possibilities](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=cloudflare-embeddings-guide#embedders-experimental).
diff --git a/guides/embedders/cohere.mdx b/guides/embedders/cohere.mdx
new file mode 100644
index 000000000..fabd97ced
--- /dev/null
+++ b/guides/embedders/cohere.mdx
@@ -0,0 +1,101 @@
+---
+title: Semantic Search with Cohere Embeddings - Meilisearch documentation
+description: This guide will walk you through the process of setting up Meilisearch with Cohere embeddings to enable semantic search capabilities.
+---
+
+# Semantic search with Cohere embeddings
+
+## Introduction
+
+This guide will walk you through the process of setting up Meilisearch with Cohere embeddings to enable semantic search capabilities. By leveraging Meilisearch's AI features and Cohere's embedding API, you can enhance your search experience and retrieve more relevant results.
+
+## Requirements
+
+To follow this guide, you'll need:
+
+- A [Meilisearch Cloud](https://www.meilisearch.com/cloud?utm_campaign=vector-search&utm_source=docs&utm_medium=cohere-embeddings-guide) project running version 1.10 or above with the Vector store activated.
+- A Cohere account with an API key for embedding generation. You can sign up for a Cohere account at [Cohere](https://cohere.com/).
+- No backend required.
+
+## Setting up Meilisearch
+
+To set up an embedder in Meilisearch, add it to your index settings. You can refer to the [Meilisearch documentation](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=cohere-embeddings-guide#update-embedder-settings) for more details on updating the embedder settings.
+
+Cohere offers multiple embedding models:
+
+- `embed-english-v3.0` and `embed-multilingual-v3.0`: 1024 dimensions
+- `embed-english-light-v3.0` and `embed-multilingual-light-v3.0`: 384 dimensions
+
+Here's an example of embedder settings for Cohere:
+
+```json
+{
+  "cohere": {
+    "source": "rest",
+    "apiKey": "<COHERE_API_KEY>",
+    "dimensions": 1024,
+    "documentTemplate": "",
+    "url": "https://api.cohere.com/v1/embed",
+    "request": {
+      "model": "embed-english-v3.0",
+      "texts": [
+        "{{text}}",
+        "{{..}}"
+      ],
+      "input_type": "search_document"
+    },
+    "response": {
+      "embeddings": [
+        "{{embedding}}",
+        "{{..}}"
+      ]
+    }
+  }
+}
+```
+
+In this configuration:
+
+- `source`: Specifies the source of the embedder, which is set to "rest" for using a REST API.
+- `apiKey`: Replace `<COHERE_API_KEY>` with your actual Cohere API key.
+- `dimensions`: Specifies the dimensions of the embeddings, set to 1024 for the `embed-english-v3.0` model.
+- `documentTemplate`: Optionally, you can provide a [custom template](/learn/ai-powered-search/getting_started_with_ai_search?utm_campaign=vector-search&utm_source=docs&utm_medium=cohere-embeddings-guide#documenttemplate) for generating embeddings from your documents.
+- `url`: Specifies the URL of the Cohere API endpoint.
+- `request`: Defines the request structure for the Cohere API, including the model name and input parameters.
+- `response`: Defines the expected response structure from the Cohere API, including the embedding data.
+
+Once you've configured the embedder settings, Meilisearch will automatically generate embeddings for your documents and store them in the vector store.
+
+Please note that most third-party tools have rate limiting, which is managed by Meilisearch. If you have a free account, the indexation process may take some time, but Meilisearch will handle it with a retry strategy.
+
+It's recommended to monitor the tasks queue to ensure everything is running smoothly. You can access the tasks queue using the Cloud UI or the [Meilisearch API](/reference/api/tasks?utm_campaign=vector-search&utm_source=docs&utm_medium=cohere-embeddings-guide#get-tasks).
+
+## Testing semantic search
+
+With the embedder set up, you can now perform semantic searches using Meilisearch. When you send a search query, Meilisearch will generate an embedding for the query using the configured embedder and then use it to find the most semantically similar documents in the vector store.
+
+To perform a semantic search, you simply need to make a normal search request but include the hybrid parameter:
+
+```json
+{
+  "q": "",
+  "hybrid": {
+    "semanticRatio": 1,
+    "embedder": "cohere"
+  }
+}
+```
+
+In this request:
+
+- `q`: Represents the user's search query.
+- `hybrid`: Specifies the configuration for the hybrid search.
+  - `semanticRatio`: Allows you to control the balance between semantic search and traditional search. A value of 1 indicates pure semantic search, while a value of 0 represents full-text search. You can adjust this parameter to achieve a hybrid search experience.
+  - `embedder`: The name of the embedder used for generating embeddings. Make sure to use the same name as specified in the embedder configuration, which in this case is "cohere".
+
+You can use the Meilisearch API or client libraries to perform searches and retrieve the relevant documents based on semantic similarity.
+
+## Conclusion
+
+By following this guide, you should now have Meilisearch set up with Cohere embeddings, enabling you to leverage semantic search capabilities in your application. Meilisearch's auto-batching and efficient handling of embeddings make it a powerful choice for integrating semantic search into your project.
+
+To explore further configuration options for embedders, consult the [detailed documentation about the embedder setting possibilities](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=cohere-embeddings-guide#embedders-experimental).
+
diff --git a/guides/embedders/mistral.mdx b/guides/embedders/mistral.mdx
new file mode 100644
index 000000000..350e2b5d9
--- /dev/null
+++ b/guides/embedders/mistral.mdx
@@ -0,0 +1,97 @@
+---
+title: Semantic Search with Mistral Embeddings - Meilisearch documentation
+description: This guide will walk you through the process of setting up Meilisearch with Mistral embeddings to enable semantic search capabilities.
+---
+
+# Semantic search with Mistral embeddings
+
+## Introduction
+
+This guide will walk you through the process of setting up Meilisearch with Mistral embeddings to enable semantic search capabilities. By leveraging Meilisearch's AI features and Mistral's embedding API, you can enhance your search experience and retrieve more relevant results.
+
+## Requirements
+
+To follow this guide, you'll need:
+
+- A [Meilisearch Cloud](https://www.meilisearch.com/cloud?utm_campaign=vector-search&utm_source=docs&utm_medium=mistral-embeddings-guide) project running version 1.10 or above with the Vector store activated.
+- A Mistral account with an API key for embedding generation. You can sign up for a Mistral account at [Mistral](https://mistral.ai/).
+- No backend required.
+
+## Setting up Meilisearch
+
+To set up an embedder in Meilisearch, add it to your index settings. You can refer to the [Meilisearch documentation](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=mistral-embeddings-guide#update-embedder-settings) for more details on updating the embedder settings.
+
+When using Mistral to generate embeddings, you'll need to use the model `mistral-embed`. Unlike some other services, Mistral currently offers only one embedding model.
+
+Here's an example of embedder settings for Mistral:
+
+```json
+{
+  "mistral": {
+    "source": "rest",
+    "apiKey": "<MISTRAL_API_KEY>",
+    "dimensions": 1024,
+    "documentTemplate": "",
+    "url": "https://api.mistral.ai/v1/embeddings",
+    "request": {
+      "model": "mistral-embed",
+      "input": ["{{text}}", "{{..}}"]
+    },
+    "response": {
+      "data": [
+        {
+          "embedding": "{{embedding}}"
+        },
+        "{{..}}"
+      ]
+    }
+  }
+}
+```
+
+In this configuration:
+
+- `source`: Specifies the source of the embedder, which is set to "rest" for using a REST API.
+- `apiKey`: Replace `<MISTRAL_API_KEY>` with your actual Mistral API key.
+- `dimensions`: Specifies the dimensions of the embeddings, set to 1024 for the `mistral-embed` model.
+- `documentTemplate`: Optionally, you can provide a [custom template](/learn/ai-powered-search/getting_started_with_ai_search?utm_campaign=vector-search&utm_source=docs&utm_medium=mistral-embeddings-guide#documenttemplate) for generating embeddings from your documents.
+- `url`: Specifies the URL of the Mistral API endpoint.
+- `request`: Defines the request structure for the Mistral API, including the model name and input parameters.
+- `response`: Defines the expected response structure from the Mistral API, including the embedding data.
+
+Once you've configured the embedder settings, Meilisearch will automatically generate embeddings for your documents and store them in the vector store.
+
+Please note that most third-party tools have rate limiting, which is managed by Meilisearch. If you have a free account, the indexation process may take some time, but Meilisearch will handle it with a retry strategy.
+
+It's recommended to monitor the tasks queue to ensure everything is running smoothly. You can access the tasks queue using the Cloud UI or the [Meilisearch API](/reference/api/tasks?utm_campaign=vector-search&utm_source=docs&utm_medium=mistral-embeddings-guide#get-tasks).
+
+## Testing semantic search
+
+With the embedder set up, you can now perform semantic searches using Meilisearch. When you send a search query, Meilisearch will generate an embedding for the query using the configured embedder and then use it to find the most semantically similar documents in the vector store.
+
+To perform a semantic search, you simply need to make a normal search request but include the hybrid parameter:
+
+```json
+{
+  "q": "",
+  "hybrid": {
+    "semanticRatio": 1,
+    "embedder": "mistral"
+  }
+}
+```
+
+In this request:
+
+- `q`: Represents the user's search query.
+- `hybrid`: Specifies the configuration for the hybrid search.
+  - `semanticRatio`: Allows you to control the balance between semantic search and traditional search. A value of 1 indicates pure semantic search, while a value of 0 represents full-text search. You can adjust this parameter to achieve a hybrid search experience.
+  - `embedder`: The name of the embedder used for generating embeddings. Make sure to use the same name as specified in the embedder configuration, which in this case is "mistral".
+
+You can use the Meilisearch API or client libraries to perform searches and retrieve the relevant documents based on semantic similarity.
+
+## Conclusion
+
+By following this guide, you should now have Meilisearch set up with Mistral embeddings, enabling you to leverage semantic search capabilities in your application. Meilisearch's auto-batching and efficient handling of embeddings make it a powerful choice for integrating semantic search into your project.
+
+To explore further configuration options for embedders, consult the [detailed documentation about the embedder setting possibilities](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=mistral-embeddings-guide#embedders-experimental).
+
diff --git a/guides/embedders/openai.mdx b/guides/embedders/openai.mdx
new file mode 100644
index 000000000..82077d2b2
--- /dev/null
+++ b/guides/embedders/openai.mdx
@@ -0,0 +1,86 @@
+---
+title: Semantic Search with OpenAI Embeddings - Meilisearch documentation
+description: This guide will walk you through the process of setting up Meilisearch with OpenAI embeddings to enable semantic search capabilities.
+---
+
+# Semantic search with OpenAI embeddings
+
+## Introduction
+
+This guide will walk you through the process of setting up Meilisearch with OpenAI embeddings to enable semantic search capabilities. By leveraging Meilisearch's AI features and OpenAI's embedding API, you can enhance your search experience and retrieve more relevant results.
+
+## Requirements
+
+To follow this guide, you'll need:
+
+- A [Meilisearch Cloud](https://www.meilisearch.com/cloud?utm_campaign=vector-search&utm_source=docs&utm_medium=openai-embeddings-guide) project running version 1.10 or above with the Vector store activated.
+- An OpenAI account with an API key for embedding generation. You can sign up for an OpenAI account at [OpenAI](https://openai.com/).
+- No backend required.
+
+## Setting up Meilisearch
+
+To set up an embedder in Meilisearch, add it to your index settings. You can refer to the [Meilisearch documentation](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=openai-embeddings-guide#update-embedder-settings) for more details on updating the embedder settings.
+
+OpenAI offers three main embedding models:
+
+- `text-embedding-3-large`: 3,072 dimensions
+- `text-embedding-3-small`: 1,536 dimensions
+- `text-embedding-ada-002`: 1,536 dimensions
+
+Here's an example of embedder settings for OpenAI:
+
+```json
+{
+  "openai": {
+    "source": "openAi",
+    "apiKey": "<OPENAI_API_KEY>",
+    "dimensions": 1536,
+    "documentTemplate": "",
+    "model": "text-embedding-3-small"
+  }
+}
+```
+
+In this configuration:
+
+- `source`: Specifies the source of the embedder, which is set to "openAi" for using OpenAI's API.
+- `apiKey`: Replace `<OPENAI_API_KEY>` with your actual OpenAI API key.
+- `dimensions`: Specifies the dimensions of the embeddings. Set to 1536 for `text-embedding-3-small` and `text-embedding-ada-002`, or 3072 for `text-embedding-3-large`.
+- `documentTemplate`: Optionally, you can provide a [custom template](/learn/ai-powered-search/getting_started_with_ai_search?utm_campaign=vector-search&utm_source=docs&utm_medium=openai-embeddings-guide#documenttemplate) for generating embeddings from your documents.
+- `model`: Specifies the OpenAI model to use for generating embeddings. Choose from `text-embedding-3-large`, `text-embedding-3-small`, or `text-embedding-ada-002`.
+
+Once you've configured the embedder settings, Meilisearch will automatically generate embeddings for your documents and store them in the vector store.
+
+Please note that OpenAI has rate limiting, which is managed by Meilisearch. If you have a free account, the indexation process may take some time, but Meilisearch will handle it with a retry strategy.
+
+It's recommended to monitor the tasks queue to ensure everything is running smoothly. You can access the tasks queue using the Cloud UI or the [Meilisearch API](/reference/api/tasks?utm_campaign=vector-search&utm_source=docs&utm_medium=openai-embeddings-guide#get-tasks).
+
+## Testing semantic search
+
+With the embedder set up, you can now perform semantic searches using Meilisearch. When you send a search query, Meilisearch will generate an embedding for the query using the configured embedder and then use it to find the most semantically similar documents in the vector store.
+
+To perform a semantic search, you simply need to make a normal search request but include the hybrid parameter:
+
+```json
+{
+  "q": "",
+  "hybrid": {
+    "semanticRatio": 1,
+    "embedder": "openai"
+  }
+}
+```
+
+In this request:
+
+- `q`: Represents the user's search query.
+- `hybrid`: Specifies the configuration for the hybrid search.
+  - `semanticRatio`: Allows you to control the balance between semantic search and traditional search. A value of 1 indicates pure semantic search, while a value of 0 represents full-text search. You can adjust this parameter to achieve a hybrid search experience.
+  - `embedder`: The name of the embedder used for generating embeddings. Make sure to use the same name as specified in the embedder configuration, which in this case is "openai".
+
+You can use the Meilisearch API or client libraries to perform searches and retrieve the relevant documents based on semantic similarity.
+
+## Conclusion
+
+By following this guide, you should now have Meilisearch set up with OpenAI embeddings, enabling you to leverage semantic search capabilities in your application. Meilisearch's auto-batching and efficient handling of embeddings make it a powerful choice for integrating semantic search into your project.
+
+To explore further configuration options for embedders, consult the [detailed documentation about the embedder setting possibilities](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=openai-embeddings-guide#embedders-experimental).
diff --git a/guides/embedders/voyage.mdx b/guides/embedders/voyage.mdx
new file mode 100644
index 000000000..470a3897b
--- /dev/null
+++ b/guides/embedders/voyage.mdx
@@ -0,0 +1,101 @@
+---
+title: Semantic Search with Voyage AI Embeddings - Meilisearch documentation
+description: This guide will walk you through the process of setting up Meilisearch with Voyage AI embeddings to enable semantic search capabilities.
+---
+
+# Semantic search with Voyage AI embeddings
+
+## Introduction
+
+This guide will walk you through the process of setting up Meilisearch with Voyage AI embeddings to enable semantic search capabilities. By leveraging Meilisearch's AI features and Voyage AI's embedding API, you can enhance your search experience and retrieve more relevant results.
+
+## Requirements
+
+To follow this guide, you'll need:
+
+- A [Meilisearch Cloud](https://www.meilisearch.com/cloud?utm_campaign=vector-search&utm_source=docs&utm_medium=voyage-embeddings-guide) project running version 1.10 or above with the Vector store activated.
+- A Voyage AI account with an API key for embedding generation. You can sign up for a Voyage AI account at [Voyage AI](https://www.voyageai.com/).
+- No backend required.
+
+## Setting up Meilisearch
+
+To set up an embedder in Meilisearch, add it to your index settings. You can refer to the [Meilisearch documentation](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=voyage-embeddings-guide#update-embedder-settings) for more details on updating the embedder settings.
+
+Voyage AI offers the following embedding models:
+
+- `voyage-large-2-instruct`: 1024 dimensions
+- `voyage-multilingual-2`: 1024 dimensions
+- `voyage-large-2`: 1536 dimensions
+- `voyage-2`: 1024 dimensions
+
+Here's an example of embedder settings for Voyage AI:
+
+```json
+{
+  "voyage": {
+    "source": "rest",
+    "apiKey": "<VOYAGE_API_KEY>",
+    "dimensions": 1024,
+    "documentTemplate": "",
+    "url": "https://api.voyageai.com/v1/embeddings",
+    "request": {
+      "model": "voyage-2",
+      "input": ["{{text}}", "{{..}}"]
+    },
+    "response": {
+      "data": [
+        {
+          "embedding": "{{embedding}}"
+        },
+        "{{..}}"
+      ]
+    }
+  }
+}
+```
+
+In this configuration:
+
+- `source`: Specifies the source of the embedder, which is set to "rest" for using a REST API.
+- `apiKey`: Replace `<VOYAGE_API_KEY>` with your actual Voyage AI API key.
+- `dimensions`: Specifies the dimensions of the embeddings. Set to 1024 for `voyage-2`, `voyage-large-2-instruct`, and `voyage-multilingual-2`, or 1536 for `voyage-large-2`.
+- `documentTemplate`: Optionally, you can provide a [custom template](/learn/ai-powered-search/getting_started_with_ai_search?utm_campaign=vector-search&utm_source=docs&utm_medium=voyage-embeddings-guide#documenttemplate) for generating embeddings from your documents.
+- `url`: Specifies the URL of the Voyage AI API endpoint.
+- `request`: Defines the request structure for the Voyage AI API, including the model name and input parameters.
+- `response`: Defines the expected response structure from the Voyage AI API, including the embedding data.
+
+Once you've configured the embedder settings, Meilisearch will automatically generate embeddings for your documents and store them in the vector store.
+
+Please note that most third-party tools have rate limiting, which is managed by Meilisearch. If you have a free account, the indexation process may take some time, but Meilisearch will handle it with a retry strategy.
+
+It's recommended to monitor the tasks queue to ensure everything is running smoothly. You can access the tasks queue using the Cloud UI or the [Meilisearch API](/reference/api/tasks?utm_campaign=vector-search&utm_source=docs&utm_medium=voyage-embeddings-guide#get-tasks).
+
+## Testing semantic search
+
+With the embedder set up, you can now perform semantic searches using Meilisearch. When you send a search query, Meilisearch will generate an embedding for the query using the configured embedder and then use it to find the most semantically similar documents in the vector store.
+
+To perform a semantic search, you simply need to make a normal search request but include the hybrid parameter:
+
+```json
+{
+  "q": "",
+  "hybrid": {
+    "semanticRatio": 1,
+    "embedder": "voyage"
+  }
+}
+```
+
+In this request:
+
+- `q`: Represents the user's search query.
+- `hybrid`: Specifies the configuration for the hybrid search.
+  - `semanticRatio`: Allows you to control the balance between semantic search and traditional search. A value of 1 indicates pure semantic search, while a value of 0 represents full-text search. You can adjust this parameter to achieve a hybrid search experience.
+  - `embedder`: The name of the embedder used for generating embeddings. Make sure to use the same name as specified in the embedder configuration, which in this case is "voyage".
+
+You can use the Meilisearch API or client libraries to perform searches and retrieve the relevant documents based on semantic similarity; a short request sketch is included at the end of this guide.
+
+## Conclusion
+
+By following this guide, you should now have Meilisearch set up with Voyage AI embeddings, enabling you to leverage semantic search capabilities in your application. Meilisearch's auto-batching and efficient handling of embeddings make it a powerful choice for integrating semantic search into your project.
+
+To explore further configuration options for embedders, consult the [detailed documentation about the embedder setting possibilities](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=voyage-embeddings-guide#embedders-experimental).
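+
+For reference, here is a minimal sketch of the hybrid search request above sent over HTTP. It is only an illustration: the `movies` index name, the host, and the key placeholder are assumptions to adapt to your own project, while the `hybrid` parameters match the `voyage` embedder configured in this guide.
+
+```typescript
+// Illustrative sketch: run a hybrid (semantic + full-text) search with the "voyage" embedder.
+const MEILISEARCH_HOST = "<MEILISEARCH_HOST_URL>";
+const MEILISEARCH_SEARCH_KEY = "<MEILISEARCH_SEARCH_API_KEY>";
+
+async function searchWithVoyage(query: string) {
+  const response = await fetch(`${MEILISEARCH_HOST}/indexes/movies/search`, {
+    method: "POST",
+    headers: {
+      Authorization: `Bearer ${MEILISEARCH_SEARCH_KEY}`,
+      "Content-Type": "application/json",
+    },
+    body: JSON.stringify({
+      q: query,
+      hybrid: {
+        semanticRatio: 1, // 1 = pure semantic search, 0 = pure full-text search
+        embedder: "voyage", // must match the embedder name defined in your settings
+      },
+    }),
+  });
+
+  const { hits } = await response.json();
+  return hits;
+}
+
+searchWithVoyage("science fiction about space exploration").then(console.log);
+```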