Documentation for native MMR support

bzhangam · bzhangam · commit 292deaeab47a · 2025-09-24T10:06:33.000-07:00
Signed-off-by: Bo Zhang <bzhangam@amazon.com>
diff --git a/_search-plugins/search-pipelines/index.md b/_search-plugins/search-pipelines/index.md
@@ -20,6 +20,7 @@ The following is a list of search pipeline terminology:
 * [_Search response processor_]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/search-processors#search-response-processors): A component that intercepts a search response and search request (the query, results, and metadata passed in the request), performs an operation with or on the search response, and returns the search response.
 * [_Search phase results processor_]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/search-processors#search-phase-results-processors): A component that runs between search phases at the coordinating node level. A search phase results processor intercepts the results retrieved from one search phase and transforms them before passing them to the next search phase.
 * [_Processor_]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/search-processors/): Either a search request processor or a search response processor.
+* [_System-generated processor_]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/system-generated-search-processors/): A search processor that OpenSearch generates automatically based on the search request.
 * _Search pipeline_: An ordered list of processors that is integrated into OpenSearch. The pipeline intercepts a query, performs processing on the query, sends it to OpenSearch, intercepts the results, performs processing on the results, and returns them to the calling application, as shown in the following diagram. 
 
 ![Search processor diagram]({{site.url}}{{site.baseurl}}/images/search-pipelines.png)
diff --git a/_search-plugins/search-pipelines/search-pipeline-metrics.md b/_search-plugins/search-pipelines/search-pipeline-metrics.md
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Search pipeline metrics
-nav_order: 50
+nav_order: 60
 has_children: false
 parent: Search pipelines
 ---
diff --git a/_search-plugins/search-pipelines/system-generated-search-processors.md b/_search-plugins/search-pipelines/system-generated-search-processors.md
@@ -0,0 +1,60 @@
+---
+layout: default
+title: System generated search processors
+nav_order: 50
+has_children: false
+parent: Search pipelines
+---
+
+# System generated search processors
+
+System-generated search processors are search processors that OpenSearch generates automatically based on the contents of a search request.
+
+To enable these processors, set the `cluster.search.enabled_system_generated_factories` setting either to `*` or to a list that explicitly includes the required factories.
+
+The following example request enables the two MMR factories:
+```json
+PUT _cluster/settings
+{
+  "persistent": {
+    "cluster.search.enabled_system_generated_factories": [
+      "mmr_over_sample_factory",
+      "mmr_rerank_factory"
+    ]
+  }
+}
+```
+{% include copy-curl.html %}
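+
+As a sketch of the wildcard option mentioned above, the following request enables every system-generated factory instead of naming each one. Passing `*` as a single list element is an assumption about how the list setting accepts the wildcard; the text above only states that the setting can be set to `*`:
+
+```json
+PUT _cluster/settings
+{
+  "persistent": {
+    "cluster.search.enabled_system_generated_factories": ["*"]
+  }
+}
+```
+{% include copy-curl.html %}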
+
+System-generated search processors can be of the following types:
+
+- [System generated search request processors](#system-generated-search-request-processors)
+- [System generated search response processors](#system-generated-search-response-processors)
+- [System generated search phase results processors](#system-generated-search-phase-results-processors)
+
+# System generated search request processors
+
+| Processor name    | Processor factory name    | Execution stage     | Trigger condition                                           | Description                                                                                                                                                                |
+|-------------------|---------------------------| ------------------- |-------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `mmr_over_sample` | `mmr_over_sample_factory` | `POST_USER_DEFINED` | Triggered when a search request includes the `mmr` extension | Modifies the query size and `k` value of the k-NN query to oversample candidates for MMR re-ranking. This processor runs after any user-defined search request processors. |
+
+The execution stage determines whether a system-generated processor runs before or after user-defined processors of the same type.
+
+# System generated search response processors
+
+| Processor name | Processor factory name | Execution stage     | Trigger condition                                          | Description                                                                                                                                                     |
+|----------------|------------------------|---------------------|------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `mmr_rerank`   | `mmr_rerank_factory`   | `PRE_USER_DEFINED`  | Triggered when a search request includes the `mmr` extension | Re-ranks the oversampled results using MMR and reduces them to the original query size. This processor runs before any user-defined search response processors. |
+
+The execution stage determines whether a system-generated processor runs before or after user-defined processors of the same type.
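+
+To illustrate the execution stages, consider a hypothetical search pipeline named `my_pipeline` that defines one user-defined request processor and one user-defined response processor. The pipeline name, field names, and filter value below are illustrative assumptions and are not part of the MMR feature:
+
+```json
+PUT /_search/pipeline/my_pipeline
+{
+  "request_processors": [
+    {
+      "filter_query": {
+        "query": {
+          "term": {
+            "visibility": "public"
+          }
+        }
+      }
+    }
+  ],
+  "response_processors": [
+    {
+      "rename_field": {
+        "field": "message",
+        "target_field": "notification"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+Based on the execution stages listed in the preceding tables, a search request that uses this pipeline and includes the `mmr` extension would be expected to run the processors in the following order: `filter_query` (user-defined request processor), `mmr_over_sample` (`POST_USER_DEFINED`), the search itself, `mmr_rerank` (`PRE_USER_DEFINED`), and then `rename_field` (user-defined response processor).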
+
+# System generated search phase results processors
+
+Currently, there are no system-generated search phase results processors.
+
+# Limitation
+
+## One processor per type and execution stage
+For each processor type and execution stage, OpenSearch currently supports only one system-generated processor for a search request. For example, only one search request processor can run at the `POST_USER_DEFINED` stage, and only one search response processor can run at the `PRE_USER_DEFINED` stage.
diff --git a/_vector-search/specialized-operations/index.md b/_vector-search/specialized-operations/index.md
@@ -13,6 +13,9 @@ cards:
   - heading: "Radial search"
     description: "Search all points in a vector space that reside within a specified maximum distance or minimum score threshold from a query point"
     link: "/vector-search/specialized-operations/radial-search-knn/"
+  - heading: "Vector search with MMR"
+    description: "Use maximal marginal relevance (MMR) to re-rank vector search results for greater diversity."
+    link: "/vector-search/specialized-operations/vector-search-mmr/"
 ---
 
 # Specialized vector search
diff --git a/_vector-search/specialized-operations/vector-search-mmr.md b/_vector-search/specialized-operations/vector-search-mmr.md
@@ -0,0 +1,126 @@
+---
+layout: default
+title: Vector search with MMR
+nav_order: 60
+parent: Specialized vector search
+has_children: false
+has_math: true
+---
+
+# Vector search with MMR
+
+Maximal marginal relevance (MMR) search helps balance relevance and diversity in search results. Instead of returning only the most similar documents, MMR selects results that are both relevant to the query and different from each other. This improves the coverage of the result set and reduces redundancy, which is especially useful in vector search scenarios.
+
+MMR re-ranking balances two competing objectives:
+
+ - Relevance: How well a document matches the query.
+
+ - Diversity: How different a document is from the documents already selected.
+
+The algorithm computes a score for each candidate document using the following principle:
+
+```
+MMR = (1 − λ) * relevance_score − λ * max(similarity_with_selected_docs)
+```
+
+Where:
+
+ - `λ` is the diversity parameter. Values closer to 1 give more weight to diversity.
+
+ - `relevance_score` measures the similarity between the query vector and the candidate document vector.
+
+ - `similarity_with_selected_docs` measures the similarity between the candidate and the documents that have already been selected.
+
+By adjusting the diversity parameter, you can control the tradeoff between highly relevant results and more diverse coverage in the result set.
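+
+As a worked illustration of the formula above, consider two hypothetical candidates, A and B, evaluated after one document has already been selected. The scores are invented for illustration only:
+
+```
+Candidate A: relevance_score = 0.9, max(similarity_with_selected_docs) = 0.8
+Candidate B: relevance_score = 0.7, max(similarity_with_selected_docs) = 0.2
+
+λ = 0.7 (diversity weighted more heavily):
+  MMR(A) = (1 − 0.7) * 0.9 − 0.7 * 0.8 = 0.27 − 0.56 = −0.29
+  MMR(B) = (1 − 0.7) * 0.7 − 0.7 * 0.2 = 0.21 − 0.14 = 0.07    (B is selected next)
+
+λ = 0.1 (relevance weighted more heavily):
+  MMR(A) = (1 − 0.1) * 0.9 − 0.1 * 0.8 = 0.81 − 0.08 = 0.73    (A is selected next)
+  MMR(B) = (1 − 0.1) * 0.7 − 0.1 * 0.2 = 0.63 − 0.02 = 0.61
+```
+
+With the higher diversity value, the less relevant but more distinct candidate B wins. With the lower value, the more relevant candidate A wins even though it closely resembles a document that was already selected.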
+
+# Prerequisites
+
+To use MMR, you must enable [system-generated search processor factories]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/system-generated-search-processors/). Set the `cluster.search.enabled_system_generated_factories` setting, which defaults to an empty list, either to `*` or to a list that explicitly includes the required factories:
+
+```json
+PUT _cluster/settings
+{
+  "persistent": {
+    "cluster.search.enabled_system_generated_factories": [
+      "mmr_over_sample_factory",
+      "mmr_rerank_factory"
+    ]
+  }
+}
+```
+{% include copy-curl.html %}
+
+# Parameters
+
+The `mmr` extension in the Search API supports the following parameters:
+
+| Parameter                 | Data type | Required                                  | Description                                                                                                                                                                                 |
+| ------------------------- | --------- | ----------------------------------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `diversity`               | float     | No                                        | Controls the weight of diversity in the re-ranking process. Valid values range from `0` to `1`. A value of `1` prioritizes maximum diversity, and `0` disables diversity. Default is `0.5`. |
+| `candidates`              | integer   | No                                        | Specifies how many candidate documents to oversample before re-ranking. Default is `3 * query size`.                                                                                        |
+| `vector_field_path`       | string    | Optional, but required for remote indices | Path to the vector field used for MMR re-ranking. If not provided, OpenSearch resolves it automatically from the search request.                                                            |
+| `vector_field_data_type`  | string    | Optional, but required for remote indices | Data type of the vector field. Used to parse the field and calculate similarity. If not provided, OpenSearch resolves it from the index mapping.                                            |
+| `vector_field_space_type` | string    | Optional, but required for remote indices | Used to decide the similarity function for the vector field, such as cosine similarity or Euclidean distance. If not provided, OpenSearch resolves it from the index mapping.               |   
+
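+As a sketch of how `candidates` relates to the query size, the following request sets `size` to `10` and asks for `30` candidates. Given the default described in the preceding table (`3 * query size`), omitting `candidates` here would be expected to produce the same amount of oversampling. The index name, field name, and vector values are placeholders:
+
+```json
+POST /my-index/_search
+{
+  "size": 10,
+  "query": {
+    "knn": {
+      "my_vector_field": {
+        "vector": [0.12, 0.54, 0.91],
+        "k": 10
+      }
+    }
+  },
+  "ext": {
+    "mmr": {
+      "diversity": 0.5,
+      "candidates": 30
+    }
+  }
+}
+```
+{% include copy-curl.html %}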
+
+# Example request
+
+The following example shows how to use the `mmr` extension with a k-NN query:
+
+```json
+POST /my-index/_search
+{
+  "query": {
+    "knn": {
+      "my_vector_field": {
+        "vector": [0.12, 0.54, 0.91],
+        "k": 10
+      }
+    }
+  },
+  "ext": {
+    "mmr": {
+      "diversity": 0.7
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The following example shows how to use the `mmr` extension with a neural query:
+
+```json
+POST /my-index/_search
+{
+  "query": {
+    "neural": {
+      "my_vector_field": {
+        "query_text": "query text",
+        "model_id": "<your model id>"
+      }
+    }
+  },
+  "ext": {
+    "mmr": {
+      "diversity": 0.6,
+      "candidates": 50,
+      "vector_field_path": "my_vector_field",
+      "vector_field_data_type": "float",
+      "vector_field_space_type": "l2"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+When querying across multiple indices, ensure that the vector field's data type and space type are the same in every index. These values determine the similarity function used to calculate the similarity between documents.
+{: .note}
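+
+For example, a request that targets two hypothetical local indices might look like the following sketch. The index names are placeholders, and the request assumes that `my_vector_field` is mapped with the same data type and space type in both indices:
+
+```json
+POST /my-index-1,my-index-2/_search
+{
+  "query": {
+    "knn": {
+      "my_vector_field": {
+        "vector": [0.12, 0.54, 0.91],
+        "k": 10
+      }
+    }
+  },
+  "ext": {
+    "mmr": {
+      "diversity": 0.7
+    }
+  }
+}
+```
+{% include copy-curl.html %}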
+
+# Limitations
+
+## Query type restriction
+MMR currently supports only `knn` or `neural` queries as the top-level query in a search request. If a `knn` or `neural` query is nested inside another query type (such as a `bool` or `hybrid` query), MMR is not supported.
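+
+For illustration, the following request shape nests a `knn` query inside a `bool` query and therefore falls outside what MMR supports. The index name, field names, and filter are placeholders. How OpenSearch responds to such a request (for example, by returning an error) is not specified here:
+
+```json
+POST /my-index/_search
+{
+  "query": {
+    "bool": {
+      "must": [
+        {
+          "knn": {
+            "my_vector_field": {
+              "vector": [0.12, 0.54, 0.91],
+              "k": 10
+            }
+          }
+        }
+      ],
+      "filter": [
+        {
+          "term": {
+            "category": "news"
+          }
+        }
+      ]
+    }
+  },
+  "ext": {
+    "mmr": {
+      "diversity": 0.7
+    }
+  }
+}
+```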
+
+## Explicit vector field details required for remote indices
+When querying remote indices, you must explicitly provide the vector field details: `vector_field_path`, `vector_field_data_type`, and `vector_field_space_type`.
+
+For a local index, OpenSearch can resolve this metadata automatically from the index mapping. For a remote index, OpenSearch cannot reliably fetch this information from the remote cluster. Providing these details ensures correct parsing of the vector data and accurate similarity calculations.
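+
+A minimal sketch of such a request follows, assuming a cross-cluster search connection to a remote cluster registered under the hypothetical alias `my-remote-cluster`. The field path, data type, and space type values are placeholders and must match the mapping of the remote index:
+
+```json
+POST /my-remote-cluster:my-index/_search
+{
+  "query": {
+    "knn": {
+      "my_vector_field": {
+        "vector": [0.12, 0.54, 0.91],
+        "k": 10
+      }
+    }
+  },
+  "ext": {
+    "mmr": {
+      "diversity": 0.7,
+      "vector_field_path": "my_vector_field",
+      "vector_field_data_type": "float",
+      "vector_field_space_type": "l2"
+    }
+  }
+}
+```
+{% include copy-curl.html %}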