Update 2023-12-05-improving-document-retrieval-with-spade-semantic-en… #2483
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
@@ -15,7 +15,7 @@
has_science_table: true
---

OpenSearch 2.11 introduced neural sparse search---a new efficient method of semantic retrieval. In this blog post, you'll learn about using sparse encoders for semantic search. You'll find that neural sparse search reduces costs, performs faster, and improves search relevance. We're excited to share benchmarking results that show why neural sparse search is now the top-performing search method. You can even try it out by building your own search engine in just five steps. For a TLDR on benchmarking learnings, see [Key takeaways](#here-are-the-key-takeaways).

## What are dense and sparse vector embeddings?

@@ -68,25 +68,25 @@

### Here are the key takeaways:

> **Review comment:** I would just make this "Key takeaways".

* Both modes provide the highest relevance on the BEIR and Amazon ESCI datasets.
* Without online inference, the search latency of document-only mode is comparable to BM25.
* Sparse encoding results in a much smaller index size than dense encoding. A document-only sparse encoder generates an index that is **10.4%** of the size of a dense encoding index. For a bi-encoder, the index size is **7.2%** of the size of a dense encoding index.
* Dense encoding uses k-NN retrieval and incurs a 7.9% increase in RAM cost at search time. Neural sparse search uses a native Lucene index, so the RAM cost does not increase at search time.
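
For context on that last point, sparse document vectors live in a regular Lucene-backed field rather than a k-NN structure. Here's a minimal sketch of such an index mapping using the `rank_features` field type; the index and field names are illustrative, not from the original post:

```json
PUT /my-neural-sparse-index
{
  "mappings": {
    "properties": {
      "passage_text": { "type": "text" },
      "passage_embedding": { "type": "rank_features" }
    }
  }
}
```

Because `rank_features` is an ordinary Lucene field, queries against it run on the inverted index and need no extra in-memory graph at search time.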

## Benchmarking results

The benchmarking results are presented in the following tables.

### Table I. Relevance comparison on the BEIR<sup>*</sup> benchmark and Amazon ESCI in terms of NDCG@10 and rank

<table>
<tr style="text-align:center;">
<td></td>
<td colspan="2">BM25</td>
<td colspan="2">Dense (with TAS-B model)</td>
<td colspan="2">Hybrid (Dense + BM25)</td>
<td colspan="2">Neural sparse search bi-encoder</td>
<td colspan="2">Neural sparse search document-only</td>
</tr>
<tr>
<td><b>Dataset</b></td>
@@ -102,7 +102,7 @@
<td><b>Rank</b></td>
</tr>
<tr>
<td>TREC-COVID</td>
<td>0.688</td>
<td>4</td>
<td>0.481</td>
@@ -206,7 +206,7 @@
<td>2</td>
</tr>
<tr>
<td>SCIDOCS</td>
<td>0.165</td>
<td>2</td>
<td>0.149</td>
@@ -298,11 +298,11 @@
</tr>
</table>

<sup>*</sup> BEIR stands for Benchmarking Information Retrieval. For more information, see [the BEIR GitHub page](https://github.com/beir-cellar/beir).

### Table II. Speed comparison in terms of latency and throughput

| | BM25 | Dense (with TAS-B model) | Neural sparse search bi-encoder | Neural sparse search document-only |
|---|---|---|---|---|
| P50 latency (ms) | 8 ms | 56.6 ms | 176.3 ms | 10.2 ms |
| P90 latency (ms) | 12.4 ms | 71.12 ms | 267.3 ms | 15.2 ms |
@@ -311,17 +311,17 @@
| Mean throughput (op/s) | 2214.6 op/s | 298.2 op/s | 106.3 op/s | 1790.2 op/s |

<sup>*</sup> We tested latency on a subset of MS MARCO v2 containing 1M documents in total. To obtain latency data, we used 20 clients to loop search requests.

### Table III. Capacity consumption comparison

| | BM25 | Dense (with TAS-B model) | Neural sparse search bi-encoder | Neural sparse search document-only |
|-|-|-|-|-|
| Index size | 1 GB | 65.4 GB | 4.7 GB | 6.8 GB |
| RAM usage | 480.74 GB | 675.36 GB | 480.64 GB | 494.25 GB |
| Runtime RAM delta | +0.01 GB | +53.34 GB | +0.06 GB | +0.03 GB |

<sup>*</sup> We performed this experiment using the full MS MARCO v2 dataset, containing 8.8M passages. For all methods, we excluded the `_source` fields and force merged the index before measuring index size. We set the heap size of the OpenSearch JVM to half of the node RAM, so an empty OpenSearch cluster still consumed close to 480 GB of memory.
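
If you want to reproduce this setup, excluding `_source` is a mapping-level setting, and force merging is a single API call. Here's a sketch assuming a hypothetical `benchmark-index`; the index and field names are ours, not from the post:

```json
PUT /benchmark-index
{
  "mappings": {
    "_source": { "enabled": false },
    "properties": {
      "passage_embedding": { "type": "rank_features" }
    }
  }
}

POST /benchmark-index/_forcemerge?max_num_segments=1
```

Merging down to one segment removes deleted documents and redundant segment structures, so the measured index size reflects only live data.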

## Build your search engine in five steps

@@ -371,7 +371,7 @@
```json
GET /_plugins/_ml/tasks/<task_id>
```

Once the task is complete, the task state changes to `COMPLETED` and OpenSearch returns the `model_id` for the deployed model:

```json
{
@@ -456,7 +456,7 @@
}
```

### Neural sparse query parameters

The `neural_sparse` query supports two parameters:

@@ -465,9 +465,9 @@
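
To see the query in context, here's a minimal sketch of a `neural_sparse` search request. We're assuming the two parameters are `model_id` and `max_token_score`, per the OpenSearch 2.11 documentation; the index name, field name, and ID values are placeholders:

```json
GET /my-neural-sparse-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "what is neural sparse search",
        "model_id": "<model_id>",
        "max_token_score": 2
      }
    }
  }
}
```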

## Selecting a model

OpenSearch provides several pretrained encoder models that you can use out of the box without fine-tuning. For a list of sparse encoding models provided by OpenSearch, see [Sparse encoding models](https://opensearch.org/docs/latest/ml-commons-plugin/pretrained-models/#sparse-encoding-models).

Use the following recommendations to select a sparse encoder model:

- For **bi-encoder** mode, we recommend using the `opensearch-neural-sparse-encoding-v1` pretrained model. For this model, both online search and offline ingestion share the same model file. A registration sketch follows this list.
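
As a sketch of what that looks like in practice, you can register and deploy the pretrained model through the ML Commons API. The `version` and `model_format` values below are assumptions based on the pretrained model catalog, so check the model page before copying them:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```

The response contains a `task_id` that you can poll with the task API shown earlier to retrieve the deployed `model_id`.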

@@ -477,6 +477,6 @@

## Next steps

- For more information about neural sparse search, see [Neural sparse search](https://opensearch.org/docs/latest/search-plugins/neural-sparse-search/).
- For an end-to-end neural search tutorial, see [Neural search tutorial](https://opensearch.org/docs/latest/search-plugins/neural-search-tutorial/).
- For a list of all search methods OpenSearch supports, see [Search methods](https://opensearch.org/docs/latest/search-plugins/index/#search-methods).
- Provide your feedback on the [OpenSearch Forum](https://forum.opensearch.org/).