
Conversation

zirui-song-18

Description

This PR adds documentation for the SEISMIC feature in the Neural-Search plugin. It is co-authored by Liyun Xiu, Yuye Zhu, and Zirui Song.

Issues Resolved

Closes #10876

Version

3.3

Frontend features

If you're submitting documentation for an OpenSearch Dashboards feature, add a video that shows how a user will interact with the UI step by step. A voiceover is optional.
N/A

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.


Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

## Further reading

- [Original SEISMIC paper](https://arxiv.org/abs/2404.18812): "Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations"
- [OpenSearch neural sparse search blog](https://opensearch.org/blog/improving-document-retrieval-with-sparse-semantic-encoders/): Learn about sparse encoding fundamentals
Member

Add two phase blog

Author

I don't think it's appropriate to add the two-phase blog here. The only place we mention "two phase" on this page is to note that we achieved a significant improvement over it. Then, on the following configuration page, we state that users should not combine it with "two phase". So "two phase" is just a last-generation algorithm and has nothing to do with our sparse ANN. I prefer the current further reading list.

@kolchfa-aws kolchfa-aws added Tech review PR: Tech review in progress release-notes PR: Include this PR in the automated release notes v3.3.0 labels Sep 29, 2025
@kolchfa-aws kolchfa-aws added Doc review PR: Doc review in progress and removed Tech review PR: Tech review in progress labels Oct 2, 2025
Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws kolchfa-aws left a comment

Thank you, @zirui-song-18! Doc review complete; moving on to editorial review.

@kolchfa-aws kolchfa-aws added Editorial review PR: Editorial review in progress and removed Doc review PR: Doc review in progress labels Oct 6, 2025
@natebower natebower left a comment

Editorial review


- `heap_factor`: Controls the trade-off between recall and performance.

During neural sparse ANN search, the algorithm decides whether to examine a cluster by comparing the cluster's score with the top score in the result queue divided by `heap_factor`. A larger `heap_factor` lowers the threshold that clusters must meed in order to be exained, causing the algorithm to examine more clusters and improving accuracy at the cost of slowing query speed. Conversely, a smaller `heap_factor` raises the threshold, making the algorithm more selective about which clusters to examine. This parameter provides finer control than `top_n`, allowing you to slightly adjust the trade-off between accuracy and latency.

Suggested change
During neural sparse ANN search, the algorithm decides whether to examine a cluster by comparing the cluster's score with the top score in the result queue divided by `heap_factor`. A larger `heap_factor` lowers the threshold that clusters must meed in order to be exained, causing the algorithm to examine more clusters and improving accuracy at the cost of slowing query speed. Conversely, a smaller `heap_factor` raises the threshold, making the algorithm more selective about which clusters to examine. This parameter provides finer control than `top_n`, allowing you to slightly adjust the trade-off between accuracy and latency.
During neural sparse ANN search, the algorithm decides whether to examine a cluster by comparing the cluster's score with the top score in the result queue divided by `heap_factor`. A larger `heap_factor` lowers the threshold that clusters must meet in order to be examined, causing the algorithm to examine more clusters and improving accuracy at the cost of slower query speed. Conversely, a smaller `heap_factor` raises the threshold, making the algorithm more selective about which clusters to examine. This parameter provides finer control than `top_n`, allowing you to slightly adjust the trade-off between accuracy and latency.
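For orientation, here is a hypothetical query sketch showing where `heap_factor` and `top_n` might be supplied at search time. The index name, field name, model ID, and the `method_parameters` object are illustrative assumptions, not confirmed syntax from this PR:

```json
GET /my-sparse-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "What is neural sparse ANN search?",
        "model_id": "<sparse encoding model ID>",
        "method_parameters": {
          "top_n": 10,
          "heap_factor": 1.0
        }
      }
    }
  }
}
```

Raising `heap_factor` above `1.0` would admit more clusters for examination, trading latency for recall.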


Index building can benefit from using multiple threads. You can adjust the number of threads used for cluster building by specifying the `knn.algo_param.index_thread_qty` setting (by default, `1`). For information about updating this setting, see [Vector search settings]({{site.url}}{{site.baseurl}}/vector-search/settings/). Using a higher `knn.algo_param.index_thread_qty` can reduce force merge time when neural sparse ANN search is enabled, though it also consumes more system resources.
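For reference, this dynamic setting can be updated through the standard cluster settings API; a minimal sketch (the value `4` is an arbitrary illustration):

```json
PUT /_cluster/settings
{
  "persistent": {
    "knn.algo_param.index_thread_qty": 4
  }
}
```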

### Querying after cold start

Suggested change
### Querying after cold start
### Querying after a cold start


After rebooting OpenSearch, the cache is empty, so the first several hundred queries may experience high latency. To address this "cold start" issue, you can use the [Warmup API]({{site.url}}{{site.baseurl}}/vector-search/api/knn/#warmup-operation). This API loads data from disk into cache, ensuring optimal performance for subsequent queries. You can also use the [Clear Cache API]({{site.url}}{{site.baseurl}}/vector-search/api/knn/#k-nn-clear-cache) to free up memory when needed.
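For example, warming up an index before serving traffic and later clearing its cache might look like the following (the index name `my-sparse-index` is a placeholder; the warmup endpoint matches the one shown later in this review, and the clear cache endpoint is the linked k-NN API):

```json
POST /_plugins/_neural/warmup/my-sparse-index

POST /_plugins/_knn/clear_cache/my-sparse-index
```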

### Force-merging segments into one

Suggested change
### Force-merging segments into one
### Force merging segments

The following Neural Search plugin settings apply at the cluster level:

- `plugins.neural_search.stats_enabled` (Dynamic, Boolean): Enables the [Neural Search Stats API]({{site.url}}{{site.baseurl}}/vector-search/api/neural/#stats). Default is `false`.
- `plugins.neural_search.circuit_breaker.limit` (Dynamic, percentage): Specifies the JVM memory limit for [neural sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/) circuit breaker. Default is `10%` of the JVM heap. For more information, see [Memory and caching settings]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/#memory-and-caching-settings).

Suggested change
- `plugins.neural_search.circuit_breaker.limit` (Dynamic, percentage): Specifies the JVM memory limit for [neural sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/) circuit breaker. Default is `10%` of the JVM heap. For more information, see [Memory and caching settings]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/#memory-and-caching-settings).
- `plugins.neural_search.circuit_breaker.limit` (Dynamic, percentage): Specifies the JVM memory limit for the [neural sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/) circuit breaker. Default is `10%` of the JVM heap. For more information, see [Memory and caching settings]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/#memory-and-caching-settings).

- `plugins.neural_search.stats_enabled` (Dynamic, Boolean): Enables the [Neural Search Stats API]({{site.url}}{{site.baseurl}}/vector-search/api/neural/#stats). Default is `false`.
- `plugins.neural_search.circuit_breaker.limit` (Dynamic, percentage): Specifies the JVM memory limit for [neural sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/) circuit breaker. Default is `10%` of the JVM heap. For more information, see [Memory and caching settings]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/#memory-and-caching-settings).
- `plugins.neural_search.circuit_breaker.overhead` (Dynamic, float): A multiplier used to adjust memory usage estimates for [neural sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/). Higher values provide more conservative memory estimates. Default is `1.0`.
- `plugins.neural_search.sparse.algo_param.index_thread_qty` (Dynamic, integer): The number of threads used for building indexes for [neural sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/). Increasing this value allocates more CPUs to the index build job and boosts the indexing performance. Default is `1`. For more information, see [Thread pool configuration]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/#thread-pool-configuration).

Suggested change
- `plugins.neural_search.sparse.algo_param.index_thread_qty` (Dynamic, integer): The number of threads used for building indexes for [neural sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/). Increasing this value allocates more CPUs to the index build job and boosts the indexing performance. Default is `1`. For more information, see [Thread pool configuration]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/#thread-pool-configuration).
- `plugins.neural_search.sparse.algo_param.index_thread_qty` (Dynamic, integer): The number of threads used for building indexes for [neural sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/). Increasing this value allocates more CPUs to the index build job and boosts indexing performance. Default is `1`. For more information, see [Thread pool configuration]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann/#thread-pool-configuration).
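For reference, all of these dynamic settings can be updated in a single cluster settings call; a minimal sketch (the specific values are arbitrary illustrations):

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.neural_search.stats_enabled": true,
    "plugins.neural_search.circuit_breaker.limit": "15%",
    "plugins.neural_search.sparse.algo_param.index_thread_qty": 2
  }
}
```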

@natebower natebower left a comment

Editorial review

The following request performs a warmup operation on three indexes:

```json
POST /_plugins/_neural/warmup/index1,index2,index3?pretty
```

Collaborator

@zirui-song-18 Is an index name/list of indices required for this operation? Or is there a default for the path parameter?

Member

It is a list of index names

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
@natebower natebower left a comment

LGTM

@natebower natebower removed the Editorial review PR: Editorial review in progress label Oct 7, 2025
@natebower natebower removed their assignment Oct 7, 2025
Signed-off-by: Fanit Kolchina <[email protected]>
yuye-aws commented Oct 8, 2025

Thanks for the review, @kolchfa-aws and @natebower! @zirui-song-18 is going to be out of the office this week. Feel free to ping me or @chishui if you have any questions.

| Parameter | Data type | Required | Description | Default | Valid values |
| :--- | :--- | :--- | :--- | :--- | :--- |
| `name` | String | Yes | The neural sparse ANN search algorithm. Valid value is `seismic`. | - | - |
| `n_postings` | Integer | No | The maximum number of documents to retain in each posting list. | `0.0005 * doc_count`¹ | (0, ∞) |
| `cluster_ratio` | Float | No | The fraction of documents in each posting list to determine cluster count. | `0.1` | (0, 1) |
| `summary_prune_ratio` | Float | No | The fraction of tokens to keep in cluster summary vectors for approximate matching. | `0.4` | (0, 1] |
Member

This is not a 100% token-count fraction. It actually describes the "mass" of the tokens to preserve in the cluster summary. I can provide an example:

Suppose the cluster summary, before pruning, is {"100": 1, "200": 2, "300": 3, "400": 6}. Then, given summary_prune_ratio = 0.5, there will be a single token left, resulting in the pruned summary {"400": 6}.
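For orientation, here is a hypothetical mapping sketch that wires the parameters from the table above into an index. The index name, field name, and the exact shape of the `method` object are assumptions, not confirmed syntax from this PR:

```json
PUT /my-sparse-index
{
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "sparse_vector",
        "method": {
          "name": "seismic",
          "parameters": {
            "n_postings": 4000,
            "cluster_ratio": 0.1,
            "summary_prune_ratio": 0.4
          }
        }
      }
    }
  }
}
```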


Suggested change
To increase search efficiency and reduce memory consumption, the `sparse_vector` field automatically performs quantization on the token weight. You can adjust the parameter `quantization_ceiling_search` and `quantization_ceiling_ingest` according to different token weight distribution. For doc-only queries, we recommend the default value (`16`). If you're querying with bi-encoder mode alone, we recommend setting `quantization_ceiling_search` to `3`. For doc-only and bi-encoder mode, you can refer to [`generating sparse vector embeddings automatically`]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-with-pipelines/) for more details.
{: .note }
To increase search efficiency and reduce memory consumption, the `sparse_vector` field automatically performs quantization of the token weight. You can adjust the `quantization_ceiling_search` and `quantization_ceiling_ingest` parameters according to different token weight distributions. For doc-only queries, we recommend the default value (`16`). For bi-encoder queries, we recommend setting `quantization_ceiling_search` to `3`. For more information about doc-only and bi-encoder query modes, see [Generating sparse vector embeddings automatically]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-with-pipelines/).
Member

we recommend the default value (`16`) -> we recommend setting `quantization_ceiling_search` to the default value (`16`)
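Illustrating the recommendation above, a bi-encoder setup might lower the search-time ceiling as follows. This is a hypothetical sketch; the index and field names, and the exact placement of the quantization parameters in the mapping, are assumptions:

```json
PUT /my-bi-encoder-index
{
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "sparse_vector",
        "quantization_ceiling_search": 3,
        "quantization_ceiling_ingest": 16
      }
    }
  }
}
```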

You can use the Warmup API operation with index patterns to warm up one or more indexes that match a specified pattern:

```json
POST /_plugins/_neural/warm_up/index*
```

Member

POST /_plugins/_neural/warmup/index*
