Merge pull request #2998 from meilisearch/v1.11
v1.11
Kerollmops authored Oct 28, 2024
2 parents cfb4735 + 75567ed commit 57e795e
Showing 13 changed files with 220 additions and 36 deletions.
16 changes: 10 additions & 6 deletions .code-samples.meilisearch.yaml
@@ -1242,13 +1242,16 @@ search_parameter_guide_hybrid_1: |-
"q": "kitchen utensils",
"hybrid": {
"semanticRatio": 0.9,
"embedder": "default"
"embedder": "EMBEDDER_NAME"
}
}'
search_parameter_guide_vector_1: |-
curl -X POST 'localhost:7700/indexes/INDEX_NAME/search' \
-H 'content-type: application/json' \
--data-binary '{ "vector": [0, 1, 2] }'
--data-binary '{
"vector": [0, 1, 2],
"embedder": "EMBEDDER_NAME"
}'
get_search_cutoff_1: |-
curl \
-X GET 'http://localhost:7700/indexes/movies/settings/search-cutoff-ms'
@@ -1321,7 +1324,7 @@ search_parameter_reference_retrieve_vectors_1: |-
"q": "kitchen utensils",
"retrieveVectors": true,
"hybrid": {
"embedder": "default"
"embedder": "EMBEDDER_NAME"
}
}'
search_parameter_reference_distinct_1: |-
@@ -1355,11 +1358,12 @@ get_similar_post_1: |-
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer DEFAULT_SEARCH_API_KEY' \
--data-binary '{
"id": TARGET_DOCUMENT_ID
"id": TARGET_DOCUMENT_ID,
"embedder": "EMBEDDER_NAME"
}'
get_similar_get_1: |-
curl \
-X GET 'http://localhost:7700/indexes/INDEX_NAME/similar?id=TARGET_DOCUMENT_ID'
-X GET 'http://localhost:7700/indexes/INDEX_NAME/similar?id=TARGET_DOCUMENT_ID&embedder=EMBEDDER_NAME'
search_parameter_reference_ranking_score_threshold_1: |-
curl \
-X POST 'http://localhost:7700/indexes/INDEX_NAME/search' \
@@ -1373,7 +1377,7 @@ search_parameter_reference_locales_1: |-
-X POST 'http://localhost:7700/indexes/INDEX_NAME/search' \
-H 'Content-Type: application/json' \
--data-binary '{
"q": "進撃の巨人",
"q": "QUERY TEXT IN JAPANESE",
"locales": ["jpn"]
}'
get_localized_attribute_settings_1: |-
6 changes: 3 additions & 3 deletions assets/misc/meilisearch-collection-postman.json
@@ -1,9 +1,9 @@
{
"info": {
"_postman_id": "719caa45-6643-4393-9b84-e8bc6a70d074",
"name": "Meilisearch v1.10",
"_postman_id": "cc6bb097-033d-4f65-8704-f10e4e4b10d0",
"name": "Meilisearch v1.11",
"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
"_exporter_id": "8898306"
"_exporter_id": "25294324"
},
"item": [
{
16 changes: 8 additions & 8 deletions guides/docker.mdx
@@ -14,7 +14,7 @@ Docker is a tool that bundles applications into containers. Docker containers en
Docker containers are distributed in images. To use Meilisearch, use the `docker pull` command to download a Meilisearch image:

```sh
docker pull getmeili/meilisearch:v1.10
docker pull getmeili/meilisearch:v1.11
```

Meilisearch deploys a new Docker image with every release of the engine. Each image is tagged with the corresponding Meilisearch version, indicated in the above example by the text following the `:` symbol. You can see [the full list of available Meilisearch Docker images](https://hub.docker.com/r/getmeili/meilisearch/tags#!) on Docker Hub.
@@ -31,7 +31,7 @@ After completing the previous step, use `docker run` to launch the Meilisearch i
docker run -it --rm \
-p 7700:7700 \
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v1.10
getmeili/meilisearch:v1.11
```

### Configure Meilisearch
@@ -47,7 +47,7 @@ docker run -it --rm \
-p 7700:7700 \
-e MEILI_MASTER_KEY='MASTER_KEY'\
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v1.10
getmeili/meilisearch:v1.11
```

#### Passing instance options with CLI arguments
@@ -58,7 +58,7 @@ If you want to pass command-line arguments to Meilisearch with Docker, you must
docker run -it --rm \
-p 7700:7700 \
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v1.10 \
getmeili/meilisearch:v1.11 \
meilisearch --master-key="MASTER_KEY"
```

@@ -76,7 +76,7 @@ To keep your data intact between reboots, specify a dedicated volume by running
docker run -it --rm \
-p 7700:7700 \
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v1.10
getmeili/meilisearch:v1.11
```

The example above uses `$(pwd)/meili_data`, which is a directory in the host machine. Depending on your OS, mounting volumes from the host to the container might result in performance loss and is only recommended when developing your application.
@@ -91,7 +91,7 @@ To import a dump, use Meilisearch's `--import-dump` command-line option and spec
docker run -it --rm \
-p 7700:7700 \
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v1.10 \
getmeili/meilisearch:v1.11 \
meilisearch --import-dump /meili_data/dumps/20200813-042312213.dump
```

@@ -111,7 +111,7 @@ To generate a Meilisearch snapshot with Docker, launch Meilisearch with `--sched
docker run -it --rm \
-p 7700:7700 \
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v1.10 \
getmeili/meilisearch:v1.11 \
meilisearch --schedule-snapshot --snapshot-dir /meili_data/snapshots
```

@@ -123,7 +123,7 @@ To import a snapshot, launch Meilisearch with the `--import-snapshot` option:
docker run -it --rm \
-p 7700:7700 \
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v1.10 \
getmeili/meilisearch:v1.11 \
meilisearch --import-snapshot /meili_data/snapshots/data.ms.snapshot
```

6 changes: 3 additions & 3 deletions learn/ai_powered_search/getting_started_with_ai_search.mdx
@@ -50,15 +50,15 @@ curl \

Next, you must generate vector embeddings for all documents in your dataset. Embeddings are mathematical representations of the meanings of words and sentences in your documents. Meilisearch relies on external providers to generate these embeddings. Use OpenAI for this tutorial.

Use the `embedders` index setting of the [update `/settings` endpoint](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=vector-search-guide) to configure a default [OpenAI](https://platform.openai.com/) embedder:
Use the `embedders` index setting of the [update `/settings` endpoint](/reference/api/settings?utm_campaign=vector-search&utm_source=docs&utm_medium=vector-search-guide) to configure an [OpenAI](https://platform.openai.com/) embedder:

```sh
curl \
-X PATCH 'http://localhost:7700/indexes/kitchenware/settings' \
-H 'Content-Type: application/json' \
--data-binary '{
"embedders": {
"default": {
"openai": {
"source": "openAi",
"apiKey": "OPEN_AI_API_KEY",
"model": "text-embedding-3-small",
@@ -91,7 +91,7 @@ curl \
--data-binary '{
"q": "kitchen utensils made of wood",
"hybrid": {
"embedder": "default",
"embedder": "openai",
"semanticRatio": 0.7
}
}'
36 changes: 35 additions & 1 deletion learn/filtering_and_sorting/filter_expression_reference.mdx
@@ -161,7 +161,7 @@ NOT genres IN [horror, comedy]

`CONTAINS` filters results containing partial matches to the specified string pattern, similar to a [SQL `LIKE`](https://dev.mysql.com/doc/refman/8.4/en/string-comparison-functions.html#operator_like).

The following expression returns all dairy products whose name start with `"kef"`, such as kefir:
The following expression returns all dairy products whose names contain `"kef"`:

```
dairy_products.name CONTAINS kef
@@ -185,6 +185,40 @@ curl \
"containsFilter": true
}'
```

This will also enable the [`STARTS WITH`](#starts-with) operator.
</Capsule>

### `STARTS WITH` <NoticeTag type="experimental" label="experimental" />

`STARTS WITH` filters results whose values start with the specified string pattern.

The following expression returns all dairy products whose names start with `"kef"`:

```
dairy_products.name STARTS WITH kef
```

The negated form of the above expression can be written as:

```
dairy_products.name NOT STARTS WITH kef
NOT dairy_products.name STARTS WITH kef
```

<Capsule intent="note" title="Activating `STARTS WITH`">
This is an experimental feature. Use the experimental features endpoint to activate it:

```sh
curl \
-X PATCH 'http://localhost:7700/experimental-features/' \
-H 'Content-Type: application/json' \
--data-binary '{
"containsFilter": true
}'
```

This will also enable the [`CONTAINS`](#contains) operator.
</Capsule>
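
As a hedged illustration, the sketch below places the `STARTS WITH` expression in the `filter` parameter of a search request. It assumes the experimental feature is already active and that `dairy_products.name` is listed in the index's filterable attributes. The host, the index name, and the `search_starts_with` helper are placeholders introduced only for this example.

```sh
# Placeholder host and index name: replace both with your own values.
MEILI_URL='http://localhost:7700'
INDEX_NAME='INDEX_NAME'

# Filter expression from the STARTS WITH example.
FILTER='dairy_products.name STARTS WITH kef'

# JSON body of the search request.
PAYLOAD="{ \"filter\": \"${FILTER}\" }"

# Hypothetical helper that sends the search request.
# It is only defined here, so nothing is sent until you call it.
search_starts_with() {
  curl \
    -X POST "${MEILI_URL}/indexes/${INDEX_NAME}/search" \
    -H 'Content-Type: application/json' \
    --data-binary "${PAYLOAD}"
}

# Print the request body for review before sending it.
printf '%s\n' "${PAYLOAD}"
```

Call `search_starts_with` once the payload looks right. Keeping the request in a function separates building the filter from sending it.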

### `NOT`
10 changes: 10 additions & 0 deletions learn/indexing/indexing_best_practices.mdx
@@ -60,3 +60,13 @@ If you have followed the previous tips in this guide and are still experiencing
Indexing is a memory-intensive and multi-threaded operation. The more memory and processor cores available, the faster Meilisearch will index new documents. When trying to improve indexing speed, using a machine with more processor cores is more effective than increasing RAM.

Due to how Meilisearch works, it is best to avoid HDDs (Hard Disk Drives) as they can easily become performance bottlenecks.

## Enable binary quantization when using AI-powered search

If you are experiencing performance issues when indexing documents for AI-powered search, consider enabling [binary quantization](/reference/api/settings#binaryquantized) for your embedders. Binary quantization compresses vectors by representing each dimension with 1-bit values. This reduces the relevancy of semantic search results, but greatly improves performance.

Binary quantization works best with large datasets containing more than 1M documents and using models with more than 1400 dimensions.

<Capsule intent="danger" title="Binary quantization is an irreversible process">
**Activating binary quantization is irreversible.** Once enabled, Meilisearch converts all vectors and discards all vector data that does not fit within 1 bit. The only way to recover the vectors' original values is to re-vectorize the whole index with a new embedder.
</Capsule>
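
A minimal sketch of how binary quantization might be enabled through the settings endpoint follows. It assumes an embedder named `EMBEDDER_NAME` is already configured on the index and that a partial settings update leaves the embedder's other fields untouched. The host, index name, and the `enable_binary_quantization` helper are placeholders for illustration.

```sh
# Placeholder host, index, and embedder names: replace with your own values.
MEILI_URL='http://localhost:7700'
INDEX_NAME='INDEX_NAME'
EMBEDDER_NAME='EMBEDDER_NAME'

# Settings body that turns on binary quantization for one embedder.
PAYLOAD="{ \"embedders\": { \"${EMBEDDER_NAME}\": { \"binaryQuantized\": true } } }"

# Hypothetical helper that sends the settings update.
# It is only defined here, because the change cannot be undone once applied.
enable_binary_quantization() {
  curl \
    -X PATCH "${MEILI_URL}/indexes/${INDEX_NAME}/settings" \
    -H 'Content-Type: application/json' \
    --data-binary "${PAYLOAD}"
}

# Print the request body for review before sending it.
printf '%s\n' "${PAYLOAD}"
```

Because the process is irreversible, the sketch only prints the payload. Run `enable_binary_quantization` deliberately, ideally after testing on a copy of the index.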
5 changes: 3 additions & 2 deletions learn/resources/telemetry.mdx
@@ -240,13 +240,14 @@ This list is liable to change with every new version of Meilisearch. It's not be
| `vector_store` | `true` if the vector store feature is enabled, otherwise `false` | true
| `attributes_to_search_on.total_number_of_uses` | `true` if the vector store feature is enabled, otherwise `false` | true
| `vector.max_vector_size` | Highest number of dimensions given for the `vector` parameter in this batch | 1536
| `vector.retrieve_vectors` | true if the retrieve_vectors parameter has been used in this batch. | false |
| `vector.retrieve_vectors` | true if the retrieve_vectors parameter has been used in this batch. | false
| `hybrid.enabled` | `true` if hybrid search has been used in the aggregated event, otherwise `false` | true
| `hybrid.semantic_ratio` | `true` if semanticRatio was used in this batch, otherwise false | false
| `hybrid.embedder` | `true` if a specific embedder was used in this batch, otherwise false | true
| `embedders.total` | Number of defined embedders | 2
| `embedders.sources` | An array representing the different provided sources | ["huggingFace", "userProvided"]
| `embedders.document_template_used` | A boolean indicating if one of the provided embedders has a custom template defined | true
| `embedders.document_template_max_bytes` | A value indicating the largest `documentTemplateMaxBytes` value across all embedders | 400
| `embedders.binary_quantization_used` | `true` if the user updated the `binaryQuantized` field of the embedder settings | `false`
| `infos.task_queue_webhook` | `true` if the instance is launched with a task queue webhook, otherwise `false` | `false`
| `infos.experimental_search_queue_size` | Size of the search queue | 750
| `locales` | List of locales used with `/search` and `/settings` routes | ["fra", "eng"]
4 changes: 2 additions & 2 deletions learn/self_hosted/install_meilisearch_locally.mdx
@@ -54,14 +54,14 @@ These commands launch the **latest stable release** of Meilisearch.

```bash
# Fetch the latest version of Meilisearch image from DockerHub
docker pull getmeili/meilisearch:v1.10
docker pull getmeili/meilisearch:v1.11

# Launch Meilisearch in development mode with a master key
docker run -it --rm \
-p 7700:7700 \
-e MEILI_ENV='development' \
-v $(pwd)/meili_data:/meili_data \
getmeili/meilisearch:v1.10
getmeili/meilisearch:v1.11
# Use ${pwd} instead of $(pwd) in PowerShell
```

73 changes: 73 additions & 0 deletions reference/api/multi_search.mdx
@@ -34,9 +34,82 @@ Use `federation` to receive a single list with all search results from all speci
| :--------------------------------------------------------------------------- | :--------------- | :------------ | :-------------------------------------------------- |
| **[`offset`](/reference/api/search#offset)** | Integer | `0` | Number of documents to skip |
| **[`limit`](/reference/api/search#limit)** | Integer | `20` | Maximum number of documents returned |
| **[`facetsByIndex`](#facetsbyindex)** | Object of arrays | `null` | Display facet information for the specified indexes |
| **[`mergeFacets`](#mergefacets)** | Object | `null` | Merge facet information across the specified indexes |

If `federation` is missing or `null`, Meilisearch returns a list of multiple search result objects, with each item from the list corresponding to a search query in the request.

##### `facetsByIndex`

`facetsByIndex` must be an object. Its keys must correspond to indexes in your Meilisearch project. Each key must be associated with an array of attributes in the filterable attributes list of that index:

```json
"facetsByIndex": {
"INDEX_A": ["ATTRIBUTE_X", "ATTRIBUTE_Y"],
"INDEX_B": ["ATTRIBUTE_Z"]
}
```

When you specify `facetsByIndex`, multi-search responses include an extra `facetsByIndex` field. The response's `facetsByIndex` is an object with one field for each queried index:

```json
{
"hits": [ ],
"facetsByIndex": {
"INDEX_A": {
"distribution": {
"ATTRIBUTE_X": {
"KEY": <Integer>,
"KEY": <Integer>,
},
"ATTRIBUTE_Y": {
"KEY": <Integer>,
}
},
"stats": {
"KEY": {
"min": <Integer>,
"max": <Integer>
}
}
},
"INDEX_B": {
}
}
}
```

##### `mergeFacets`

`mergeFacets` must be an object and may contain the following fields:

- `maxValuesPerFacet`: must be an integer. When specified, indicates the maximum number of returned values for a single facet. Defaults to the value assigned to [the `maxValuesPerFacet` index setting](/reference/api/settings#faceting).

When both `facetsByIndex` and `mergeFacets` are present and not null, facet information included in multi-search responses is merged across all queried indexes. Instead of `facetsByIndex`, the response includes two extra fields: `facetDistribution` and `facetStats`:

```json
{
"hits": [ ],
"facetDistribution": {
"ATTRIBUTE": {
"VALUE": <Integer>,
"VALUE": <Integer>
}
},
"facetStats": {
"ATTRIBUTE": {
"min": <Integer>,
"max": <Integer>
}
}
}
```
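
To show how these options fit into a request, here is a hedged sketch of a federated multi-search body that sets both `facetsByIndex` and `mergeFacets`. The index and attribute names reuse the placeholders above. The `maxValuesPerFacet` value of `10`, the query text, the host, and the `federated_facet_search` helper are assumptions made for illustration.

```sh
# Placeholder host: replace with your own instance URL.
MEILI_URL='http://localhost:7700'

# Federated multi-search body requesting facets merged across two indexes.
# The maxValuesPerFacet value is an arbitrary example.
PAYLOAD=$(cat <<'EOF'
{
  "federation": {
    "facetsByIndex": {
      "INDEX_A": ["ATTRIBUTE_X", "ATTRIBUTE_Y"],
      "INDEX_B": ["ATTRIBUTE_Z"]
    },
    "mergeFacets": {
      "maxValuesPerFacet": 10
    }
  },
  "queries": [
    { "indexUid": "INDEX_A", "q": "QUERY TEXT" },
    { "indexUid": "INDEX_B", "q": "QUERY TEXT" }
  ]
}
EOF
)

# Hypothetical helper that sends the multi-search request.
# It is only defined here, so nothing is sent until you call it.
federated_facet_search() {
  curl \
    -X POST "${MEILI_URL}/multi-search" \
    -H 'Content-Type: application/json' \
    --data-binary "${PAYLOAD}"
}

# Print the request body for review before sending it.
printf '%s\n' "${PAYLOAD}"
```

Removing the `mergeFacets` object from this body should return per-index facet data under `facetsByIndex` instead of the merged fields.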

##### Merge algorithm for federated searches

Federated search returns merged results in decreasing order of ranking score. To obtain the final list of results, Meilisearch applies the following procedure:
11 changes: 8 additions & 3 deletions reference/api/search.mdx
@@ -1181,7 +1181,7 @@ Configures Meilisearch to return search results based on a query's meaning and c

`hybrid` must be an object. It accepts two fields: `embedder` and `semanticRatio`.

`embedder` must be a string indicating an embedder configured with the `/settings` endpoint. If you don't specify an embedder and your index contains a single embedder, Meilisearch uses it by default. If an index contains multiple embedders, Meilisearch will use the embedder named `default`.
`embedder` must be a string indicating an embedder configured with the `/settings` endpoint. It is mandatory to specify a valid embedder when performing AI-powered searches.

`semanticRatio` must be a number between `0.0` and `1.0` indicating the proportion between keyword and semantic search results. `0.0` causes Meilisearch to only return keyword results. `1.0` causes Meilisearch to only return meaning-based results. Defaults to `0.5`.

@@ -1205,6 +1205,12 @@ Use a custom vector to perform a search query. Must be an array of numbers corre

`vector` dimensions must match the dimensions of the embedder.

<Capsule intent="note">
If a query does not specify `q`, but contains both `vector` and a `hybrid.semanticRatio` greater than `0`, Meilisearch performs a pure semantic search.

If `q` is missing and `semanticRatio` is explicitly set to `0`, Meilisearch performs a placeholder search without any vector search results.
</Capsule>
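
The first case in the note can be sketched as a request body that omits `q` and supplies a vector together with a `semanticRatio` above `0`. This is a hedged example: placing `embedder` inside `hybrid` follows the `hybrid` parameter description, the three-element vector is a stand-in whose length must match your embedder's dimensions, and the host, index name, and `pure_semantic_search` helper are placeholders.

```sh
# Placeholder host, index, and embedder names: replace with your own values.
MEILI_URL='http://localhost:7700'
INDEX_NAME='INDEX_NAME'
EMBEDDER_NAME='EMBEDDER_NAME'

# No "q" field: only a vector and a semanticRatio above 0 are supplied.
# The vector is a stand-in; its length must match the embedder's dimensions.
PAYLOAD="{ \"vector\": [0, 1, 2], \"hybrid\": { \"embedder\": \"${EMBEDDER_NAME}\", \"semanticRatio\": 0.9 } }"

# Hypothetical helper that sends the search request.
# It is only defined here, so nothing is sent until you call it.
pure_semantic_search() {
  curl \
    -X POST "${MEILI_URL}/indexes/${INDEX_NAME}/search" \
    -H 'Content-Type: application/json' \
    --data-binary "${PAYLOAD}"
}

# Print the request body for review before sending it.
printf '%s\n' "${PAYLOAD}"
```

Setting `semanticRatio` to `0` in the same body corresponds to the second case in the note, a placeholder search without vector results.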

#### Example

<CodeSamples id="search_parameter_guide_vector_1" />
@@ -1248,7 +1254,7 @@ Return document embedding data with search results. If `true`, Meilisearch will
### Query locales

**Parameter**: `locales`<br />
**Expected value**: array of [supported ISO-639-2B locales](/reference/api/settings#localized-attributes-object)<br />
**Expected value**: array of [supported ISO-639 locales](/reference/api/settings#localized-attributes-object)<br />
**Default value**: `[]`

By default, Meilisearch auto-detects the language of a query. Use this parameter to explicitly state the language of a query.
@@ -1275,7 +1281,6 @@ For full control over the way Meilisearch detects languages during indexing and
{
"id": 0,
"title": "DOCUMENT NAME",
"overview_cn": "OVERVIEW TEXT IN CHINESE",
"overview_jp": "OVERVIEW TEXT IN JAPANESE"
}