
Commit 5562fa7: Rephrase sentences

Signed-off-by: Naveen Tatikonda <[email protected]>
naveentatikonda committed Jun 27, 2024
1 parent a937cc4 commit 5562fa7
Showing 1 changed file with 2 additions and 3 deletions: _posts/2024-06-19-optimizing-opensearch-with-fp16-quantization.md

@@ -20,7 +20,7 @@ leading to higher memory requirements and increased operational costs. Faiss sca

## Why use Faiss scalar quantization?

-When you index vectors in [OpenSearch 2.13](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.13.0.md) or later versions, you can configure your k-NN index to apply _scalar quantization_. Scalar quantization converts each dimension of a vector from a 32-bit floating-point (`fp32`) to a 16-bit floating-point (`fp16`) representation. Using the Faiss scalar quantizer (SQfp16), integrated in the k-NN plugin, you can get up to a 50% memory savings with a very minimal loss of recall (see [Benchmarking results](#benchmarking-results)). When used with [SIMD optimization](https://opensearch.org/docs/latest/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine),
+When you index vectors in [OpenSearch 2.13](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.13.0.md) or later versions, you can configure your k-NN index to apply _scalar quantization_. Scalar quantization converts each dimension of a vector from a 32-bit floating-point (`fp32`) to a 16-bit floating-point (`fp16`) representation. Using the Faiss scalar quantizer (SQfp16), integrated in the k-NN plugin, saves about 50% of the memory with minimal reduction in recall (see [Benchmarking results](#benchmarking-results)). When used with [SIMD optimization](https://opensearch.org/docs/latest/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine),

Check failure on line 23 in _posts/2024-06-19-optimizing-opensearch-with-fp16-quantization.md (GitHub Actions / style-job):
[vale] reported by reviewdog 🐶 [OpenSearch.TableHeadings] 'm' is a table heading and should be in sentence case.
SQfp16 quantization can also significantly reduce search latencies and improve indexing throughput.

## How to use Faiss scalar quantization
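
The post's full walkthrough is collapsed in this diff. As a rough sketch of what such a configuration looks like, based on the OpenSearch 2.13 k-NN documentation rather than the collapsed section itself (the index name, field name, dimension, and HNSW parameters below are placeholder values), the `sq` encoder with `type` set to `fp16` is added to the k-NN method:

```json
PUT /my-vector-index
{
  "settings": {
    "index": { "knn": true }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2",
          "parameters": {
            "ef_construction": 256,
            "m": 16,
            "encoder": {
              "name": "sq",
              "parameters": { "type": "fp16" }
            }
          }
        }
      }
    }
  }
}
```

Vectors are still ingested as regular `fp32` values; Faiss quantizes them to `fp16` internally at index time, so no external preprocessing step is required.
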
@@ -234,8 +234,7 @@ To achieve even greater memory efficiency, we plan to introduce `int8` quantizat
This technique will enable a remarkable 75% reduction in memory requirements, or 4x compression, compared to full-precision vectors, and we expect minimal reduction in recall.
The quantizers will accept `fp32` vectors as input, perform online training, and quantize the data into byte-sized vectors, eliminating the need for external quantization or extra training steps.
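
(Editor's aside, for intuition: a d-dimensional `fp32` vector occupies 4d bytes and its byte-quantized counterpart d bytes, which is where the 4x figure comes from. One common min-max formulation of such a quantizer, not necessarily the exact scheme the plugin will adopt, is:

$$
q_i = \mathrm{round}\!\left(255 \cdot \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}\right), \qquad \frac{d \cdot 1\ \text{byte}}{d \cdot 4\ \text{bytes}} = \frac{1}{4}.
$$
)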

-Furthermore, we aim to release binary vector support, enabling an unprecedented 32x compression rate. This approach will further reduce memory consumption. In
-addition to this we will soon add support for avx512 optimization which helps to further reduce search latency.
+Furthermore, we aim to release binary vector support, enabling an unprecedented 32x compression rate. This approach will further reduce memory consumption. Moreover, we plan to incorporate AVX-512 optimization, which will contribute to further reducing search latency.

Our ongoing analysis and tuning of OpenSearch lets you address large-scale similarity search while minimizing resource requirements and maximizing cost-effectiveness.

