Merge pull request #2998 from meilisearch/v1.11

v1.11
meilisearch · Oct 28, 2024 · 57e795e · 57e795e
2 parents cfb4735 + 75567ed
commit 57e795e
Showing 13 changed files with 220 additions and 36 deletions.
diff --git a/.code-samples.meilisearch.yaml b/.code-samples.meilisearch.yaml
@@ -1242,13 +1242,16 @@ search_parameter_guide_hybrid_1: |-
       "q": "kitchen utensils",
       "hybrid": {
         "semanticRatio": 0.9,
-        "embedder": "default"
+        "embedder": "EMBEDDER_NAME"
       }
     }'
 search_parameter_guide_vector_1: |-
   curl -X POST 'localhost:7700/indexes/INDEX_NAME/search' \
     -H 'content-type: application/json' \
-    --data-binary '{ "vector": [0, 1, 2] }'
+    --data-binary '{
+      "vector": [0, 1, 2],
+      "embedder": "EMBEDDER_NAME"
+    }'
 get_search_cutoff_1: |-
   curl \
     -X GET 'http://localhost:7700/indexes/movies/settings/search-cutoff-ms'
@@ -1321,7 +1324,7 @@ search_parameter_reference_retrieve_vectors_1: |-
       "q": "kitchen utensils",
       "retrieveVectors": true,
       "hybrid": {
-        "embedder": "default"
+        "embedder": "EMBEDDER_NAME"
       }
     }'
 search_parameter_reference_distinct_1: |-
@@ -1355,11 +1358,12 @@ get_similar_post_1: |-
     -H 'Content-Type: application/json' \
     -H 'Authorization: Bearer DEFAULT_SEARCH_API_KEY' \
     --data-binary '{
-      "id": TARGET_DOCUMENT_ID
+      "id": TARGET_DOCUMENT_ID,
+      "embedder": "EMBEDDER_NAME"
     }'
 get_similar_get_1: |-
   curl \
-    -X GET 'http://localhost:7700/indexes/INDEX_NAME/similar?id=TARGET_DOCUMENT_ID'
+    -X GET 'http://localhost:7700/indexes/INDEX_NAME/similar?id=TARGET_DOCUMENT_ID&embedder=EMBEDDER_NAME'
 search_parameter_reference_ranking_score_threshold_1: |-
   curl \
   -X POST 'http://localhost:7700/indexes/INDEX_NAME/search' \
@@ -1373,7 +1377,7 @@ search_parameter_reference_locales_1: |-
   -X POST 'http://localhost:7700/indexes/INDEX_NAME/search' \
   -H 'Content-Type: application/json' \
   --data-binary '{
-    "q": "進撃の巨人",
+    "q": "QUERY TEXT IN JAPANESE",
     "locales": ["jpn"]
   }'
 get_localized_attribute_settings_1: |-
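
The samples above now name an embedder in every AI-powered request. As a rough illustration, the request bodies could be assembled as in the sketch below. The helper names `hybrid_search_body` and `similar_body` are hypothetical and not part of any Meilisearch SDK; the field names mirror the curl samples in this diff.

```python
import json


def hybrid_search_body(q, embedder, semantic_ratio=0.5):
    """Build a /search body that always names its embedder (hypothetical helper)."""
    if not embedder:
        raise ValueError("an embedder name is required for AI-powered search")
    if not 0.0 <= semantic_ratio <= 1.0:
        raise ValueError("semanticRatio must be between 0.0 and 1.0")
    return {"q": q, "hybrid": {"embedder": embedder, "semanticRatio": semantic_ratio}}


def similar_body(document_id, embedder):
    """Build a /similar body with an explicit embedder (hypothetical helper)."""
    if not embedder:
        raise ValueError("an embedder name is required for /similar requests")
    return {"id": document_id, "embedder": embedder}


if __name__ == "__main__":
    # EMBEDDER_NAME is the same placeholder used in the code samples.
    print(json.dumps(hybrid_search_body("kitchen utensils", "EMBEDDER_NAME", 0.9), indent=2))
    print(json.dumps(similar_body(1, "EMBEDDER_NAME"), indent=2))
```

Rejecting an empty embedder name on the client side is a design choice of this sketch; it only surfaces the missing name before the request is sent.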

diff --git a/assets/misc/meilisearch-collection-postman.json b/assets/misc/meilisearch-collection-postman.json
@@ -1,9 +1,9 @@
 {
 	"info": {
-		"_postman_id": "719caa45-6643-4393-9b84-e8bc6a70d074",
-		"name": "Meilisearch v1.10",
+		"_postman_id": "cc6bb097-033d-4f65-8704-f10e4e4b10d0",
+		"name": "Meilisearch v1.11",
 		"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
-		"_exporter_id": "8898306"
+		"_exporter_id": "25294324"
 	},
 	"item": [
 		{

diff --git a/guides/docker.mdx b/guides/docker.mdx
@@ -14,7 +14,7 @@ Docker is a tool that bundles applications into containers. Docker containers en
 Docker containers are distributed in images. To use Meilisearch, use the `docker pull` command to download a Meilisearch image:
 
 ```sh
-docker pull getmeili/meilisearch:v1.10
+docker pull getmeili/meilisearch:v1.11
 ```
 
 Meilisearch deploys a new Docker image with every release of the engine. Each image is tagged with the corresponding Meilisearch version, indicated in the above example by the text following the `:` symbol. You can see [the full list of available Meilisearch Docker images](https://hub.docker.com/r/getmeili/meilisearch/tags#!) on Docker Hub.
@@ -31,7 +31,7 @@ After completing the previous step, use `docker run` to launch the Meilisearch i
 docker run -it --rm \
   -p 7700:7700 \
   -v $(pwd)/meili_data:/meili_data \
-  getmeili/meilisearch:v1.10
+  getmeili/meilisearch:v1.11
 ```
 
 ### Configure Meilisearch
@@ -47,7 +47,7 @@ docker run -it --rm \
   -p 7700:7700 \
   -e MEILI_MASTER_KEY='MASTER_KEY' \
   -v $(pwd)/meili_data:/meili_data \
-  getmeili/meilisearch:v1.10
+  getmeili/meilisearch:v1.11
 ```
 
 #### Passing instance options with CLI arguments
@@ -58,7 +58,7 @@ If you want to pass command-line arguments to Meilisearch with Docker, you must
 docker run -it --rm \
   -p 7700:7700 \
   -v $(pwd)/meili_data:/meili_data \
-  getmeili/meilisearch:v1.10 \
+  getmeili/meilisearch:v1.11 \
   meilisearch --master-key="MASTER_KEY"
 ```
 
@@ -76,7 +76,7 @@ To keep your data intact between reboots, specify a dedicated volume by running
 docker run -it --rm \
   -p 7700:7700 \
   -v $(pwd)/meili_data:/meili_data \
-  getmeili/meilisearch:v1.10
+  getmeili/meilisearch:v1.11
 ```
 
 The example above uses `$(pwd)/meili_data`, which is a directory in the host machine. Depending on your OS, mounting volumes from the host to the container might result in performance loss and is only recommended when developing your application.
@@ -91,7 +91,7 @@ To import a dump, use Meilisearch's `--import-dump` command-line option and spec
 docker run -it --rm \
   -p 7700:7700 \
   -v $(pwd)/meili_data:/meili_data \
-  getmeili/meilisearch:v1.10 \
+  getmeili/meilisearch:v1.11 \
   meilisearch --import-dump /meili_data/dumps/20200813-042312213.dump
 ```
 
@@ -111,7 +111,7 @@ To generate a Meilisearch snapshot with Docker, launch Meilisearch with `--sched
 docker run -it --rm \
   -p 7700:7700 \
   -v $(pwd)/meili_data:/meili_data \
-  getmeili/meilisearch:v1.10 \
+  getmeili/meilisearch:v1.11 \
   meilisearch --schedule-snapshot --snapshot-dir /meili_data/snapshots
 ```
 
@@ -123,7 +123,7 @@ To import a snapshot, launch Meilisearch with the `--import-snapshot` option:
 docker run -it --rm \
   -p 7700:7700 \
   -v $(pwd)/meili_data:/meili_data \
-  getmeili/meilisearch:v1.10 \
+  getmeili/meilisearch:v1.11 \
   meilisearch --import-snapshot /meili_data/snapshots/data.ms.snapshot
 ```
 

diff --git a/learn/ai_powered_search/getting_started_with_ai_search.mdx b/learn/ai_powered_search/getting_started_with_ai_search.mdx
@@ -50,15 +50,15 @@ curl \
 
 Next, you must generate vector embeddings for all documents in your dataset. Embeddings are mathematical representations of the meanings of words and sentences in your documents. Meilisearch relies on external providers to generate these embeddings. Use OpenAI for this tutorial.
 
-Use the `embedders` index setting of the [update `/settings` endpoint](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=vector-search-guide) to configure a default [OpenAI](https://platform.openai.com/) embedder:
+Use the `embedders` index setting of the [update `/settings` endpoint](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=vector-search-guide) to configure an [OpenAI](https://platform.openai.com/) embedder:
 
 ```sh
 curl \
   -X PATCH 'http://localhost:7700/indexes/kitchenware/settings' \
   -H 'Content-Type: application/json' \
   --data-binary '{
     "embedders": {
-      "default": {
+      "openai": {
         "source":  "openAi",
         "apiKey": "OPEN_AI_API_KEY",
         "model": "text-embedding-3-small",
@@ -91,7 +91,7 @@ curl \
   --data-binary '{
     "q": "kitchen utensils made of wood",
     "hybrid": {
-      "embedder": "default",
+      "embedder": "openai",
       "semanticRatio": 0.7
     }
   }'

diff --git a/learn/filtering_and_sorting/filter_expression_reference.mdx b/learn/filtering_and_sorting/filter_expression_reference.mdx
@@ -161,7 +161,7 @@ NOT genres IN [horror, comedy]
 
 `CONTAINS` filters results containing partial matches to the specified string pattern, similar to a [SQL `LIKE`](https://dev.mysql.com/doc/refman/8.4/en/string-comparison-functions.html#operator_like).
 
-The following expression returns all dairy products whose name start with `"kef"`, such as kefir:
+The following expression returns all dairy products whose names contain `"kef"`:
 
 ```
 dairy_products.name CONTAINS kef
@@ -185,6 +185,40 @@ curl \
     "containsFilter": true
   }'
 ```
+
+This will also enable the [`STARTS WITH`](#starts-with) operator.
+</Capsule>
+
+### `STARTS WITH` <NoticeTag type="experimental" label="experimental" />
+
+`STARTS WITH` filters results whose values start with the specified string pattern.
+
+The following expression returns all dairy products whose names start with `"kef"`:
+
+```
+dairy_products.name STARTS WITH kef
+```
+
+The negated form of the above expression can be written as:
+
+```
+dairy_products.name NOT STARTS WITH kef
+NOT dairy_products.name STARTS WITH kef
+```
+
+<Capsule intent="note" title="Activating `STARTS WITH`">
+This is an experimental feature. Use the experimental features endpoint to activate it:
+
+```sh
+curl \
+  -X PATCH 'http://localhost:7700/experimental-features/' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "containsFilter": true
+  }'
+```
+
+This will also enable the [`CONTAINS`](#contains) operator.
 </Capsule>
 
 ### `NOT`
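
The difference between the two operators documented in `filter_expression_reference.mdx` can be pictured with plain Python string checks. This is only a sketch of the semantics described there: it models `CONTAINS` as a substring test and `STARTS WITH` as a prefix test, ignores any normalization Meilisearch may apply, and uses made-up product names.

```python
def contains(value: str, pattern: str) -> bool:
    """Model of CONTAINS: the pattern may appear anywhere in the value."""
    return pattern in value


def starts_with(value: str, pattern: str) -> bool:
    """Model of STARTS WITH: the pattern must appear at the start of the value."""
    return value.startswith(pattern)


# Hypothetical values for dairy_products.name
names = ["kefir", "buttermilk kefir", "yogurt"]

contains_kef = [n for n in names if contains(n, "kef")]
starts_with_kef = [n for n in names if starts_with(n, "kef")]
not_starts_with_kef = [n for n in names if not starts_with(n, "kef")]

print(contains_kef)
print(starts_with_kef)
print(not_starts_with_kef)
```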

diff --git a/learn/indexing/indexing_best_practices.mdx b/learn/indexing/indexing_best_practices.mdx
@@ -60,3 +60,13 @@ If you have followed the previous tips in this guide and are still experiencing
 Indexing is a memory-intensive and multi-threaded operation. The more memory and processor cores available, the faster Meilisearch will index new documents. When trying to improve indexing speed, using a machine with more processor cores is more effective than increasing RAM.
 
 Due to how Meilisearch works, it is best to avoid HDDs (Hard Disk Drives) as they can easily become performance bottlenecks.
+
+## Enable binary quantization when using AI-powered search
+
+If you are experiencing performance issues when indexing documents for AI-powered search, consider enabling [binary quantization](/reference/api/settings#binaryquantized) for your embedders. Binary quantization compresses vectors by representing each dimension with 1-bit values. This reduces the relevancy of semantic search results, but greatly improves performance.
+
+Binary quantization works best with large datasets containing more than 1M documents and using models with more than 1400 dimensions.
+
+<Capsule intent="danger" title="Binary quantization is an irreversible process">
+**Activating binary quantization is irreversible.** Once enabled, Meilisearch converts all vectors and discards all vector data that does not fit within 1-bit. The only way to recover the vectors' original values is to re-vectorize the whole index in a new embedder.
+</Capsule>
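
To see why binary quantization is both compact and irreversible, consider the sketch below. It assumes a simple sign-threshold scheme (one bit per dimension, set when the component is positive) and 4-byte floats for the original vectors; both are assumptions made for illustration and do not describe Meilisearch's internal implementation.

```python
def binary_quantize(vector):
    """Keep one bit per dimension: 1 for positive components, 0 otherwise (assumed scheme)."""
    return [1 if x > 0 else 0 for x in vector]


def float_size_bytes(dimensions, bytes_per_float=4):
    """Storage for an unquantized vector, assuming 4-byte floats."""
    return dimensions * bytes_per_float


def packed_size_bytes(dimensions):
    """Storage for a quantized vector when bits are packed eight per byte."""
    return (dimensions + 7) // 8


# Two different vectors collapse to the same bits, so the originals cannot be recovered.
a = [0.12, -0.80, 0.33, -0.05]
b = [0.90, -0.01, 0.02, -0.70]
print(binary_quantize(a), binary_quantize(b))

# Storage for one 1536-dimension vector before and after quantization.
print(float_size_bytes(1536), packed_size_bytes(1536))
```

Because distinct inputs map to the same bit pattern, the original values have to be regenerated by the embedder rather than decoded from the stored bits.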
diff --git a/learn/resources/telemetry.mdx b/learn/resources/telemetry.mdx
@@ -240,13 +240,14 @@ This list is liable to change with every new version of Meilisearch. It's not be
 | `vector_store`                                     | `true` if the vector store feature is enabled, otherwise `false`                            | true
 | `attributes_to_search_on.total_number_of_uses`     | `true` if the vector store feature is enabled, otherwise `false`                            | true
 | `vector.max_vector_size`                           | Highest number of dimensions given for the `vector` parameter in this batch                 | 1536
-| `vector.retrieve_vectors` | true if the retrieve_vectors parameter has been used in this batch. | false |
+| `vector.retrieve_vectors`                          | true if the retrieve_vectors parameter has been used in this batch.                         | false
 | `hybrid.enabled`                                   | `true` if hybrid search has been used in the aggregated event, otherwise `false`            | true
 | `hybrid.semantic_ratio`                            | `true` if semanticRatio was used in this batch, otherwise false                             | false
-| `hybrid.embedder`                                  | `true` if a specific embedder was used in this batch, otherwise false                       | true 
 | `embedders.total`                                  | Number of defined embedders                                                                 | 2
 | `embedders.sources`                                | An array representing the different provided sources                                        | ["huggingFace", "userProvided"]
 | `embedders.document_template_used`                 | A boolean indicating if one of the provided embedders has a custom template defined         | true
+| `embedders.document_template_max_bytes`            | Largest `documentTemplateMaxBytes` value across all embedders                               | 400
+| `embedders.binary_quantization_used`               | `true` if the user updated the `binaryQuantized` field of the embedder settings             | `false`
 | `infos.task_queue_webhook`                         | `true` if the instance is launched with a task queue webhook, otherwise `false`             | `false`
 | `infos.experimental_search_queue_size`             | Size of the search queue                                                                    | 750
 | `locales`                                          | List of locales used with `/search` and `/settings` routes                                  | ["fra", "eng"]

diff --git a/learn/self_hosted/install_meilisearch_locally.mdx b/learn/self_hosted/install_meilisearch_locally.mdx
@@ -54,14 +54,14 @@ These commands launch the **latest stable release** of Meilisearch.
 
 ```bash
 # Fetch the latest version of Meilisearch image from DockerHub
-docker pull getmeili/meilisearch:v1.10
+docker pull getmeili/meilisearch:v1.11
 
 # Launch Meilisearch in development mode with a master key
 docker run -it --rm \
     -p 7700:7700 \
     -e MEILI_ENV='development' \
     -v $(pwd)/meili_data:/meili_data \
-    getmeili/meilisearch:v1.10
+    getmeili/meilisearch:v1.11
 # Use ${pwd} instead of $(pwd) in PowerShell
 ```
 

diff --git a/reference/api/multi_search.mdx b/reference/api/multi_search.mdx
@@ -34,9 +34,82 @@ Use `federation` to receive a single list with all search results from all speci
 | :--------------------------------------------------------------------------- | :--------------- | :------------ | :-------------------------------------------------- |
 | **[`offset`](/reference/api/search#offset)**                                 | Integer          | `0`           | Number of documents to skip                         |
 | **[`limit`](/reference/api/search#limit)**                                   | Integer          | `20`          | Maximum number of documents returned                |
+| **[`facetsByIndex`](#facetsbyindex)**                                   | Object of arrays          | `null`          | Display facet information for the specified indexes                |
+| **[`mergeFacets`](#mergefacets)**                                   | Object          | `null`          | Merge facet information across all queried indexes                |
 
 If `federation` is missing or `null`, Meilisearch returns a list of multiple search result objects, with each item from the list corresponding to a search query in the request.
 
+##### `facetsByIndex`
+
+`facetsByIndex` must be an object. Its keys must correspond to indexes in your Meilisearch project. Each key must be associated with an array of attributes in the filterable attributes list of that index:
+
+```json
+"facetsByIndex": {
+  "INDEX_A": ["ATTRIBUTE_X", "ATTRIBUTE_Y"],
+  "INDEX_B": ["ATTRIBUTE_Z"]
+}
+```
+
+When you specify `facetsByIndex`, multi-search responses include an extra `facetsByIndex` field. The response's `facetsByIndex` is an object with one field for each queried index:
+
+```json
+{
+  "hits": [ … ],
+  …
+  "facetsByIndex": {
+    "INDEX_A": {
+      "distribution": {
+        "ATTRIBUTE_X": {
+          "KEY": <Integer>,
+          "KEY": <Integer>,
+          …
+        },
+        "ATTRIBUTE_Y": {
+          "KEY": <Integer>,
+          …
+        }
+      },
+      "stats": {
+        "KEY": {
+          "min": <Integer>,
+          "max": <Integer>
+        }
+      }
+    },
+    "INDEX_B": {
+      …
+    }
+  }
+}
+```
+
+##### `mergeFacets`
+
+`mergeFacets` must be an object and may contain the following fields:
+
+- `maxValuesPerFacet`: must be an integer. When specified, indicates the maximum number of returned values for a single facet. Defaults to the value assigned to [the `maxValuesPerFacet` index setting](/reference/api/settings#faceting).
+
+When both `facetsByIndex` and `mergeFacets` are present and not null, facet information included in multi-search responses is merged across all queried indexes. Instead of `facetsByIndex`, the response includes two extra fields: `facetDistribution` and `facetStats`:
+
+```json
+{
+  "hits": [ … ],
+  …
+  "facetDistribution": {
+    "ATTRIBUTE": {
+      "VALUE": <Integer>,
+      "VALUE": <Integer>
+    }
+  },
+  "facetStats": {
+    "ATTRIBUTE": {
+      "min": <Integer>,
+      "max": <Integer>
+    }
+  }
+}
+```
+
 ##### Merge algorithm for federated searches
 
 Federated search's merged results are returned in decreasing ranking score. To obtain the final list of results, Meilisearch compares with the following procedure:
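
The `facetsByIndex` and `mergeFacets` options documented in `multi_search.mdx` can be combined into a federated request as in the sketch below. The helper names are hypothetical. The `queries` array with `indexUid` and `q` is the usual multi-search request shape but is not shown in this diff, and the value `10` is an arbitrary example.

```python
import json


def build_federation(facets_by_index, merge_facets=None):
    """Return a `federation` object requesting facet data (hypothetical helper)."""
    if not isinstance(facets_by_index, dict):
        raise TypeError("facetsByIndex must be an object")
    for index_uid, attributes in facets_by_index.items():
        if not isinstance(attributes, list):
            raise TypeError(f"facetsByIndex[{index_uid!r}] must be an array of attributes")
    federation = {"facetsByIndex": facets_by_index}
    if merge_facets is not None:
        if not isinstance(merge_facets, dict):
            raise TypeError("mergeFacets must be an object")
        federation["mergeFacets"] = merge_facets
    return federation


def facet_fields_in_response(federation):
    """Facet fields the reference text says to expect for a given `federation` object."""
    has_facets = federation.get("facetsByIndex") is not None
    merged = federation.get("mergeFacets") is not None
    if has_facets and merged:
        return ["facetDistribution", "facetStats"]
    if has_facets:
        return ["facetsByIndex"]
    return []


request_body = {
    "federation": build_federation(
        {"INDEX_A": ["ATTRIBUTE_X", "ATTRIBUTE_Y"], "INDEX_B": ["ATTRIBUTE_Z"]},
        merge_facets={"maxValuesPerFacet": 10},
    ),
    "queries": [
        {"indexUid": "INDEX_A", "q": "QUERY"},
        {"indexUid": "INDEX_B", "q": "QUERY"},
    ],
}
print(json.dumps(request_body, indent=2))
```

Passing `merge_facets=None` leaves the per-index layout in place, while any non-null object, including an empty one, selects the merged layout.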

diff --git a/reference/api/search.mdx b/reference/api/search.mdx
@@ -1181,7 +1181,7 @@ Configures Meilisearch to return search results based on a query's meaning and c
 
 `hybrid` must be an object. It accepts two fields: `embedder` and `semanticRatio`.
 
-`embedder` must be a string indicating an embedder configured with the `/settings` endpoint. If you don't specify an embedder and your index contains a single embedder, Meilisearch uses it by default. If an index contains multiple embedders, Meilisearch will use the embedder named `default`.
+`embedder` must be a string indicating an embedder configured with the `/settings` endpoint. It is mandatory to specify a valid embedder when performing AI-powered searches.
 
 `semanticRatio` must be a number between `0.0` and `1.0` indicating the proportion between keyword and semantic search results. `0.0` causes Meilisearch to only return keyword results. `1.0` causes Meilisearch to only return meaning-based results. Defaults to `0.5`.
 
@@ -1205,6 +1205,12 @@ Use a custom vector to perform a search query. Must be an array of numbers corre
 
 `vector` dimensions must match the dimensions of the embedder.
 
+<Capsule intent="note">
+If a query does not specify `q`, but contains both `vector` and a `hybrid.semanticRatio` greater than `0`, Meilisearch performs a pure semantic search.
+
+If `q` is missing and `semanticRatio` is explicitly set to `0`, Meilisearch performs a placeholder search without any vector search results.
+</Capsule>
+
 #### Example
 
 <CodeSamples id="search_parameter_guide_vector_1" />
@@ -1248,7 +1254,7 @@ Return document embedding data with search results. If `true`, Meilisearch will
 ### Query locales
 
 **Parameter**: `locales`<br />
-**Expected value**: array of [supported ISO-639-2B locales](/reference/api/settings#localized-attributes-object)<br />
+**Expected value**: array of [supported ISO-639 locales](/reference/api/settings#localized-attributes-object)<br />
 **Default value**: `[]`
 
 By default, Meilisearch auto-detects the language of a query. Use this parameter to explicitly state the language of a query.
@@ -1275,7 +1281,6 @@ For full control over the way Meilisearch detects languages during indexing and
     {
       "id": 0,
       "title": "DOCUMENT NAME",
-      "overview_cn": "OVERVIEW TEXT IN CHINESE",
       "overview_jp": "OVERVIEW TEXT IN JAPANESE"
     }
     …
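
The note added to `reference/api/search.mdx` about queries without `q` can be summarized as a small decision function. This is a sketch of the two documented cases only: the function name and the returned labels are made up for illustration, the default of `0.5` comes from the `semanticRatio` description, and every combination the note does not cover is reported as `"other"`.

```python
def search_mode(q=None, vector=None, semantic_ratio=0.5):
    """Classify a query according to the two cases described in the note."""
    if q is None and vector is not None and semantic_ratio > 0:
        # No query text, a custom vector, and a positive ratio: pure semantic search.
        return "semantic"
    if q is None and semantic_ratio == 0:
        # No query text and the ratio explicitly set to 0: placeholder search.
        return "placeholder"
    # Anything else is outside the scope of the note.
    return "other"


print(search_mode(vector=[0, 1, 2]))
print(search_mode(vector=[0, 1, 2], semantic_ratio=0))
print(search_mode(q="kitchen utensils", semantic_ratio=0.9))
```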