From b66465bb7a7aad0d73c3f3550e31809680207bf0 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Tue, 24 Dec 2024 16:00:00 +0800 Subject: [PATCH 01/13] add prune doc Signed-off-by: zhichao-aws --- .../processors/sparse-encoding.md | 27 +++++++++++++------ .../neural-sparse-with-pipelines.md | 2 ++ ...neural-sparse-query-two-phase-processor.md | 3 ++- 3 files changed, 23 insertions(+), 9 deletions(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 3af6f4e987..ea65b20ecb 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -36,6 +36,8 @@ The following table lists the required and optional parameters for the `sparse_e | Parameter | Data type | Required/Optional | Description | |:---|:---|:---|:---| `model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). +`prune_type` | String | Optional | The prune strategy for sparse vectors. Choose one value from `max_ratio`, `alpha_mass`, `top_k`, `abs_value` and `none`. Default value is `none`. +`prune_ratio` | Float | Optional | The ratio for prune strategy. Once the `prune_type` is provided, `prune_ratio` field is required. `field_map` | Object | Required | Contains key-value pairs that specify the mapping of a text field to a `rank_features` field. `field_map.` | String | Required | The name of the field from which to obtain text for generating vector embeddings. `field_map.` | String | Required | The name of the vector field in which to store the generated vector embeddings. 
@@ -43,6 +45,21 @@ The following table lists the required and optional parameters for the `sparse_e `tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. | `batch_size` | Integer | Optional | Specifies the number of documents to be batched and processed each time. Default is `1`. | +### Sparse vectors prune +The token weights in sparse vectors exhibit a significant long-tail distribution, where tokens with lower semantic importance occupy a large portion of the storage space. Prune is to remove less-important tokens based on their weights. It trades some search relevance for much smaller index size. + +The `sparse_encoding` processor can be used to prune sparse vectors by configuring `prune_type` and `prune_ratio` parameters. The following table lists the supported prune options for `sparse_encoding` processor. + +| Prune type | Valid prune ratio | Description | +|:---|:---|:---| +max_ratio | Float in [0, 1) | Prunes a sparse vector by keeping only elements whose values are within the prune_ratio of the max value in the vector. +abs_value | Float in (0, +∞) | Prunes a sparse vector by removing elements with values below the prune_ratio. +alpha_mass | Float in [0, 1) | Prunes a sparse vector by keeping only elements whose cumulative sum of values is within the prune_ratio of the total sum. +top_k | Integer in (0, +∞) | Prunes a sparse vector by keeping only the top prune_ratio elements with the highest values. +none | - | Does nothing on sparse vectors. + +Among all prune options, the combination of (`max_ratio`, 0.1) demonstrates great generalization on test datasets. Which saves around 40% storage at a cost of <1% search relevance loss. + ## Using the processor Follow these steps to use the processor in a pipeline. You must provide a model ID when creating the processor. 
For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/). @@ -59,6 +76,8 @@ PUT /_ingest/pipeline/nlp-ingest-pipeline { "sparse_encoding": { "model_id": "aP2Q8ooBpBj3wT4HVS8a", + "prune_type": "max_ratio", + "prune_ratio": 0.1, "field_map": { "passage_text": "passage_embedding" } @@ -111,23 +130,15 @@ The response confirms that in addition to the `passage_text` field, the processo "worlds" : 2.7839446, "yes" : 0.75845814, "##world" : 2.5432441, - "born" : 0.2682308, "nothing" : 0.8625516, - "goodbye" : 0.17146169, "greeting" : 0.96817183, "birth" : 1.2788506, - "come" : 0.1623208, - "global" : 0.4371151, - "it" : 0.42951578, "life" : 1.5750692, - "thanks" : 0.26481047, "world" : 4.7300377, - "tiny" : 0.5462298, "earth" : 2.6555297, "universe" : 2.0308156, "worldwide" : 1.3903781, "hello" : 6.696973, - "so" : 0.20279501, "?" : 0.67785245 }, "passage_text" : "hello world" diff --git a/_search-plugins/neural-sparse-with-pipelines.md b/_search-plugins/neural-sparse-with-pipelines.md index ef7044494a..2e8f01a446 100644 --- a/_search-plugins/neural-sparse-with-pipelines.md +++ b/_search-plugins/neural-sparse-with-pipelines.md @@ -229,6 +229,8 @@ PUT /_ingest/pipeline/nlp-ingest-pipeline-sparse { "sparse_encoding": { "model_id": "", + "prune_type": "max_ratio", + "prune_ratio": 0.1, "field_map": { "passage_text": "passage_embedding" } diff --git a/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md b/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md index 41119e643a..c3523a7c96 100644 --- a/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md +++ b/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md @@ -23,7 +23,8 @@ Field | Data type | Description :--- | :--- | :--- `enabled` | Boolean | Controls whether the two-phase processor is enabled. Default is `true`. 
`two_phase_parameter` | Object | A map of key-value pairs representing the two-phase parameters and their associated values. You can specify the value of `prune_ratio`, `expansion_rate`, `max_window_size`, or any combination of these three parameters. Optional. -`two_phase_parameter.prune_ratio` | Float | A ratio that represents how to split the high-weight tokens and low-weight tokens. The threshold is the token's maximum score multiplied by its `prune_ratio`. Valid range is [0,1]. Default is `0.4` +`two_phase_parameter.prune_type` | String | The prune strategy of how to split the high-weight tokens and low-weight tokens. Default is `max_ratio`. See prune options [here]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/#sparse-vectors-prune). +`two_phase_parameter.prune_ratio` | Float | A ratio that represents how to split the high-weight tokens and low-weight tokens. The threshold is the token's maximum score multiplied by its `prune_ratio`. Valid range is [0,1] for `max_ratio` prune_type. Default is `0.4` `two_phase_parameter.expansion_rate` | Float | The rate at which documents will be fine-tuned during the second phase. The second-phase document number equals the query size (default is 10) multiplied by its expansion rate. Valid range is greater than 1.0. Default is `5.0` `two_phase_parameter.max_window_size` | Int | The maximum number of documents that can be processed using the two-phase processor. Valid range is greater than 50. Default is `10000`. `tag` | String | The processor's identifier. Optional. 
From 29f79ae85ca05890cda6d0afcd6bec781719b4d8 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:14:38 +0800 Subject: [PATCH 02/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index ea65b20ecb..8872c1000a 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -36,7 +36,7 @@ The following table lists the required and optional parameters for the `sparse_e | Parameter | Data type | Required/Optional | Description | |:---|:---|:---|:---| `model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). -`prune_type` | String | Optional | The prune strategy for sparse vectors. Choose one value from `max_ratio`, `alpha_mass`, `top_k`, `abs_value` and `none`. Default value is `none`. +`prune_type` | String | Optional | The prune strategy for sparse vectors. Valid values are `max_ratio`, `alpha_mass`, `top_k`, `abs_value` and `none`. Default is `none`. `prune_ratio` | Float | Optional | The ratio for prune strategy. Once the `prune_type` is provided, `prune_ratio` field is required. `field_map` | Object | Required | Contains key-value pairs that specify the mapping of a text field to a `rank_features` field. `field_map.` | String | Required | The name of the field from which to obtain text for generating vector embeddings. 
From 18d922856afb31bb2400f6fee175062c1e2cc487 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:14:50 +0800 Subject: [PATCH 03/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 8872c1000a..920f9a5442 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -37,7 +37,7 @@ The following table lists the required and optional parameters for the `sparse_e |:---|:---|:---|:---| `model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). `prune_type` | String | Optional | The prune strategy for sparse vectors. Valid values are `max_ratio`, `alpha_mass`, `top_k`, `abs_value` and `none`. Default is `none`. -`prune_ratio` | Float | Optional | The ratio for prune strategy. Once the `prune_type` is provided, `prune_ratio` field is required. +`prune_ratio` | Float | Optional | The ratio for prune strategy. Required when `prune_type` is specified. `field_map` | Object | Required | Contains key-value pairs that specify the mapping of a text field to a `rank_features` field. `field_map.` | String | Required | The name of the field from which to obtain text for generating vector embeddings. `field_map.` | String | Required | The name of the vector field in which to store the generated vector embeddings. 
From 2649cfb42fac4933b6f801139e62c7dc13274299 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:15:00 +0800 Subject: [PATCH 04/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 920f9a5442..e89d9b4c31 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -45,7 +45,8 @@ The following table lists the required and optional parameters for the `sparse_e `tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. | `batch_size` | Integer | Optional | Specifies the number of documents to be batched and processed each time. Default is `1`. | -### Sparse vectors prune +### Pruning sparse vectors + The token weights in sparse vectors exhibit a significant long-tail distribution, where tokens with lower semantic importance occupy a large portion of the storage space. Prune is to remove less-important tokens based on their weights. It trades some search relevance for much smaller index size. The `sparse_encoding` processor can be used to prune sparse vectors by configuring `prune_type` and `prune_ratio` parameters. The following table lists the supported prune options for `sparse_encoding` processor. 
From 9966557ac5045c9f9281e15bf45074985c1c242a Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:15:21 +0800 Subject: [PATCH 05/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index e89d9b4c31..19138794d2 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -47,7 +47,7 @@ The following table lists the required and optional parameters for the `sparse_e ### Pruning sparse vectors -The token weights in sparse vectors exhibit a significant long-tail distribution, where tokens with lower semantic importance occupy a large portion of the storage space. Prune is to remove less-important tokens based on their weights. It trades some search relevance for much smaller index size. +Sparse vectors often have a long-tail distribution of token weights, with less important tokens occupying a significant amount of storage space. Pruning reduces index size by removing tokens with lower semantic importance, balancing a slight decrease in search relevance for a more compact index. The `sparse_encoding` processor can be used to prune sparse vectors by configuring `prune_type` and `prune_ratio` parameters. The following table lists the supported prune options for `sparse_encoding` processor. 
From 3cd079ce253aeb42574187483d09f705c5a6e59f Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:15:35 +0800 Subject: [PATCH 06/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 19138794d2..3ed075f95e 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -49,7 +49,7 @@ The following table lists the required and optional parameters for the `sparse_e Sparse vectors often have a long-tail distribution of token weights, with less important tokens occupying a significant amount of storage space. Pruning reduces index size by removing tokens with lower semantic importance, balancing a slight decrease in search relevance for a more compact index. -The `sparse_encoding` processor can be used to prune sparse vectors by configuring `prune_type` and `prune_ratio` parameters. The following table lists the supported prune options for `sparse_encoding` processor. +The `sparse_encoding` processor can be used to prune sparse vectors by configuring the `prune_type` and `prune_ratio` parameters. The following table lists the supported prune options for the `sparse_encoding` processor. 
| Prune type | Valid prune ratio | Description | |:---|:---|:---| From 6313973db3baf52cd1a7abe35f993e32c1c9ee19 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:15:51 +0800 Subject: [PATCH 07/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 3ed075f95e..aed63adcce 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -53,7 +53,7 @@ The `sparse_encoding` processor can be used to prune sparse vectors by configuri | Prune type | Valid prune ratio | Description | |:---|:---|:---| -max_ratio | Float in [0, 1) | Prunes a sparse vector by keeping only elements whose values are within the prune_ratio of the max value in the vector. +`max_ratio` | Float [0, 1) | Prunes a sparse vector by keeping only elements whose values are within the `prune_ratio` of the largest value in the vector. abs_value | Float in (0, +∞) | Prunes a sparse vector by removing elements with values below the prune_ratio. alpha_mass | Float in [0, 1) | Prunes a sparse vector by keeping only elements whose cumulative sum of values is within the prune_ratio of the total sum. top_k | Integer in (0, +∞) | Prunes a sparse vector by keeping only the top prune_ratio elements with the highest values. 
From 63d78245c57fc242a6cbc203184ffe662847d9b6 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:16:02 +0800 Subject: [PATCH 08/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index aed63adcce..8474c272ac 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -56,7 +56,7 @@ The `sparse_encoding` processor can be used to prune sparse vectors by configuri `max_ratio` | Float [0, 1) | Prunes a sparse vector by keeping only elements whose values are within the `prune_ratio` of the largest value in the vector. abs_value | Float in (0, +∞) | Prunes a sparse vector by removing elements with values below the prune_ratio. alpha_mass | Float in [0, 1) | Prunes a sparse vector by keeping only elements whose cumulative sum of values is within the prune_ratio of the total sum. -top_k | Integer in (0, +∞) | Prunes a sparse vector by keeping only the top prune_ratio elements with the highest values. +`top_k` | Integer (0, +∞) | Prunes a sparse vector by keeping only the top `prune_ratio` elements with the highest values. none | - | Does nothing on sparse vectors. Among all prune options, the combination of (`max_ratio`, 0.1) demonstrates great generalization on test datasets. Which saves around 40% storage at a cost of <1% search relevance loss. 
From 33ec439dff399b7b9aeff1c71131f190b64dc6af Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:16:10 +0800 Subject: [PATCH 09/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 8474c272ac..22adb8e980 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -57,7 +57,7 @@ The `sparse_encoding` processor can be used to prune sparse vectors by configuri abs_value | Float in (0, +∞) | Prunes a sparse vector by removing elements with values below the prune_ratio. alpha_mass | Float in [0, 1) | Prunes a sparse vector by keeping only elements whose cumulative sum of values is within the prune_ratio of the total sum. `top_k` | Integer (0, +∞) | Prunes a sparse vector by keeping only the top `prune_ratio` elements with the highest values. -none | - | Does nothing on sparse vectors. +none | N/A | Leaves sparse vectors unchanged. Among all prune options, the combination of (`max_ratio`, 0.1) demonstrates great generalization on test datasets. Which saves around 40% storage at a cost of <1% search relevance loss. 
From e0493f0c94afdb0eb40f33e7351ed243399f0ff3 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:16:27 +0800 Subject: [PATCH 10/13] Update _ingest-pipelines/processors/sparse-encoding.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- _ingest-pipelines/processors/sparse-encoding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 22adb8e980..06c8c6f703 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -59,7 +59,7 @@ alpha_mass | Float in [0, 1) | Prunes a sparse vector by keeping only elements w `top_k` | Integer (0, +∞) | Prunes a sparse vector by keeping only the top `prune_ratio` elements with the highest values. none | N/A | Leaves sparse vectors unchanged. -Among all prune options, the combination of (`max_ratio`, 0.1) demonstrates great generalization on test datasets. Which saves around 40% storage at a cost of <1% search relevance loss. +Among all pruning options, specifying `max_ratio` equal to `0.1` shows strong generalization on test datasets. This approach reduces storage requirements by approximately 40% while incurring less than a 1% loss in search relevance. 
## Using the processor From e826cfeb61b002f4effcade1314386d15109ec4b Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:16:49 +0800 Subject: [PATCH 11/13] Update _search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- .../search-pipelines/neural-sparse-query-two-phase-processor.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md b/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md index c3523a7c96..03cc29c221 100644 --- a/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md +++ b/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md @@ -23,7 +23,7 @@ Field | Data type | Description :--- | :--- | :--- `enabled` | Boolean | Controls whether the two-phase processor is enabled. Default is `true`. `two_phase_parameter` | Object | A map of key-value pairs representing the two-phase parameters and their associated values. You can specify the value of `prune_ratio`, `expansion_rate`, `max_window_size`, or any combination of these three parameters. Optional. -`two_phase_parameter.prune_type` | String | The prune strategy of how to split the high-weight tokens and low-weight tokens. Default is `max_ratio`. See prune options [here]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/#sparse-vectors-prune). +`two_phase_parameter.prune_type` | String | The prune strategy for separating high-weight and low-weight tokens. Default is `max_ratio`. For valid values, see [Pruning sparse vectors]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/#pruning-sparse-vectors). `two_phase_parameter.prune_ratio` | Float | A ratio that represents how to split the high-weight tokens and low-weight tokens. 
The threshold is the token's maximum score multiplied by its `prune_ratio`. Valid range is [0,1] for `max_ratio` prune_type. Default is `0.4` `two_phase_parameter.expansion_rate` | Float | The rate at which documents will be fine-tuned during the second phase. The second-phase document number equals the query size (default is 10) multiplied by its expansion rate. Valid range is greater than 1.0. Default is `5.0` `two_phase_parameter.max_window_size` | Int | The maximum number of documents that can be processed using the two-phase processor. Valid range is greater than 50. Default is `10000`. From 9ecea3d2cd059603cd27cbe856d934557e8c3c9a Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 25 Dec 2024 10:17:02 +0800 Subject: [PATCH 12/13] Update _search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: zhichao-aws --- .../search-pipelines/neural-sparse-query-two-phase-processor.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md b/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md index 03cc29c221..4e18860c26 100644 --- a/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md +++ b/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md @@ -24,7 +24,7 @@ Field | Data type | Description `enabled` | Boolean | Controls whether the two-phase processor is enabled. Default is `true`. `two_phase_parameter` | Object | A map of key-value pairs representing the two-phase parameters and their associated values. You can specify the value of `prune_ratio`, `expansion_rate`, `max_window_size`, or any combination of these three parameters. Optional. `two_phase_parameter.prune_type` | String | The prune strategy for separating high-weight and low-weight tokens. Default is `max_ratio`. 
For valid values, see [Pruning sparse vectors]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/#pruning-sparse-vectors). -`two_phase_parameter.prune_ratio` | Float | A ratio that represents how to split the high-weight tokens and low-weight tokens. The threshold is the token's maximum score multiplied by its `prune_ratio`. Valid range is [0,1] for `max_ratio` prune_type. Default is `0.4` +`two_phase_parameter.prune_ratio` | Float | This ratio defines how high-weight and low-weight tokens are separated. The threshold is calculated by multiplying the token's maximum score by its `prune_ratio`. Valid values are in the [0,1] range for `prune_type` set to `max_ratio` . Default is `0.4`. `two_phase_parameter.expansion_rate` | Float | The rate at which documents will be fine-tuned during the second phase. The second-phase document number equals the query size (default is 10) multiplied by its expansion rate. Valid range is greater than 1.0. Default is `5.0` `two_phase_parameter.max_window_size` | Int | The maximum number of documents that can be processed using the two-phase processor. Valid range is greater than 50. Default is `10000`. `tag` | String | The processor's identifier. Optional. 
From 8a2813c66a161629b5ecd857cf3c437cc3819c45 Mon Sep 17 00:00:00 2001 From: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Date: Thu, 2 Jan 2025 08:15:28 -0500 Subject: [PATCH 13/13] Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> --- .../processors/sparse-encoding.md | 18 +++++++++--------- .../neural-sparse-query-two-phase-processor.md | 4 ++-- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 06c8c6f703..95f2a252d2 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -36,8 +36,8 @@ The following table lists the required and optional parameters for the `sparse_e | Parameter | Data type | Required/Optional | Description | |:---|:---|:---|:---| `model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). -`prune_type` | String | Optional | The prune strategy for sparse vectors. Valid values are `max_ratio`, `alpha_mass`, `top_k`, `abs_value` and `none`. Default is `none`. -`prune_ratio` | Float | Optional | The ratio for prune strategy. Required when `prune_type` is specified. +`prune_type` | String | Optional | The prune strategy for sparse vectors. Valid values are `max_ratio`, `alpha_mass`, `top_k`, `abs_value`, and `none`. Default is `none`. +`prune_ratio` | Float | Optional | The ratio for the pruning strategy. Required when `prune_type` is specified. 
`field_map` | Object | Required | Contains key-value pairs that specify the mapping of a text field to a `rank_features` field. `field_map.` | String | Required | The name of the field from which to obtain text for generating vector embeddings. `field_map.` | String | Required | The name of the vector field in which to store the generated vector embeddings. @@ -47,19 +47,19 @@ The following table lists the required and optional parameters for the `sparse_e ### Pruning sparse vectors -Sparse vectors often have a long-tail distribution of token weights, with less important tokens occupying a significant amount of storage space. Pruning reduces index size by removing tokens with lower semantic importance, balancing a slight decrease in search relevance for a more compact index. +A sparse vector often has a long-tail distribution of token weights, with less important tokens occupying a significant amount of storage space. Pruning reduces the size of an index by removing tokens with lower semantic importance, yielding a slight decrease in search relevance in exchange for a more compact index. -The `sparse_encoding` processor can be used to prune sparse vectors by configuring the `prune_type` and `prune_ratio` parameters. The following table lists the supported prune options for the `sparse_encoding` processor. +The `sparse_encoding` processor can be used to prune sparse vectors by configuring the `prune_type` and `prune_ratio` parameters. The following table lists the supported pruning options for the `sparse_encoding` processor. -| Prune type | Valid prune ratio | Description | +| Pruning type | Valid pruning ratio | Description | |:---|:---|:---| `max_ratio` | Float [0, 1) | Prunes a sparse vector by keeping only elements whose values are within the `prune_ratio` of the largest value in the vector. -abs_value | Float in (0, +∞) | Prunes a sparse vector by removing elements with values below the prune_ratio. 
-alpha_mass | Float in [0, 1) | Prunes a sparse vector by keeping only elements whose cumulative sum of values is within the prune_ratio of the total sum. -`top_k` | Integer (0, +∞) | Prunes a sparse vector by keeping only the top `prune_ratio` elements with the highest values. +`abs_value` | Float (0, +∞) | Prunes a sparse vector by removing elements with values lower than the `prune_ratio`. +`alpha_mass` | Float [0, 1) | Prunes a sparse vector by keeping only elements whose cumulative sum of values is within the `prune_ratio` of the total sum. +`top_k` | Integer (0, +∞) | Prunes a sparse vector by keeping only the top `prune_ratio` elements. none | N/A | Leaves sparse vectors unchanged. -Among all pruning options, specifying `max_ratio` equal to `0.1` shows strong generalization on test datasets. This approach reduces storage requirements by approximately 40% while incurring less than a 1% loss in search relevance. +Among all pruning options, setting `prune_type` to `max_ratio` with a `prune_ratio` of `0.1` demonstrates strong generalization on test datasets. This approach reduces storage requirements by approximately 40% while incurring less than a 1% loss in search relevance. ## Using the processor diff --git a/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md b/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md index 4e18860c26..536d167083 100644 --- a/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md +++ b/_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md @@ -23,8 +23,8 @@ Field | Data type | Description :--- | :--- | :--- `enabled` | Boolean | Controls whether the two-phase processor is enabled. Default is `true`. `two_phase_parameter` | Object | A map of key-value pairs representing the two-phase parameters and their associated values. You can specify the value of `prune_ratio`, `expansion_rate`, `max_window_size`, or any combination of these three parameters. Optional.
-`two_phase_parameter.prune_type` | String | The prune strategy for separating high-weight and low-weight tokens. Default is `max_ratio`. For valid values, see [Pruning sparse vectors]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/#pruning-sparse-vectors). -`two_phase_parameter.prune_ratio` | Float | This ratio defines how high-weight and low-weight tokens are separated. The threshold is calculated by multiplying the token's maximum score by its `prune_ratio`. Valid values are in the [0,1] range for `prune_type` set to `max_ratio` . Default is `0.4`. +`two_phase_parameter.prune_type` | String | The pruning strategy for separating high-weight and low-weight tokens. Default is `max_ratio`. For valid values, see [Pruning sparse vectors]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/#pruning-sparse-vectors). +`two_phase_parameter.prune_ratio` | Float | This ratio defines how high-weight and low-weight tokens are separated. The threshold is calculated by multiplying the token's maximum score by its `prune_ratio`. Valid values are in the [0,1] range for `prune_type` set to `max_ratio`. Default is `0.4`. `two_phase_parameter.expansion_rate` | Float | The rate at which documents will be fine-tuned during the second phase. The second-phase document number equals the query size (default is 10) multiplied by its expansion rate. Valid range is greater than 1.0. Default is `5.0` `two_phase_parameter.max_window_size` | Int | The maximum number of documents that can be processed using the two-phase processor. Valid range is greater than 50. Default is `10000`. `tag` | String | The processor's identifier. Optional.
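
For reviewers of this series, the four pruning strategies described in the `sparse-encoding.md` table can be sketched roughly as follows. This is a minimal illustration written from the table descriptions only, not the plugin's implementation: the function name `prune_sparse_vector` is hypothetical, and the boundary handling (keeping elements that are exactly equal to the threshold, and stopping `alpha_mass` as soon as the cumulative share reaches `prune_ratio`) is an assumption.

```python
from typing import Dict


def prune_sparse_vector(
    weights: Dict[str, float], prune_type: str, prune_ratio: float
) -> Dict[str, float]:
    """Hypothetical sketch of the documented prune_type options.

    `weights` maps tokens to their sparse-vector weights.
    Boundary behavior (>= versus >) is assumed, not taken from the plugin.
    """
    if prune_type == "none" or not weights:
        # Leaves the sparse vector unchanged.
        return dict(weights)

    if prune_type == "max_ratio":
        # Keep elements whose weight is at least prune_ratio * largest weight.
        threshold = max(weights.values()) * prune_ratio
        return {t: w for t, w in weights.items() if w >= threshold}

    if prune_type == "abs_value":
        # Remove elements whose weight is lower than prune_ratio.
        return {t: w for t, w in weights.items() if w >= prune_ratio}

    # The remaining strategies work on tokens ranked by descending weight.
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)

    if prune_type == "top_k":
        # Keep only the prune_ratio highest-weighted elements.
        return dict(ranked[: int(prune_ratio)])

    if prune_type == "alpha_mass":
        # Keep the highest-weighted elements until their cumulative sum
        # reaches prune_ratio of the total sum (assumed stopping rule).
        total = sum(weights.values())
        if total <= 0:
            return dict(weights)
        kept: Dict[str, float] = {}
        running = 0.0
        for token, weight in ranked:
            kept[token] = weight
            running += weight
            if running / total >= prune_ratio:
                break
        return kept

    raise ValueError(f"Unsupported prune_type: {prune_type}")


if __name__ == "__main__":
    # A few weights taken from the example response in this patch.
    sample = {"hello": 6.696973, "world": 4.7300377, "?": 0.67785245, "tiny": 0.5462298}
    print(prune_sparse_vector(sample, "max_ratio", 0.1))
```

With `max_ratio` and `0.1`, the threshold for the sample is one tenth of the weight of `hello`, so `tiny` falls below it while `?` stays. This matches the tokens removed from the example response in patch 01.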