add 0.4.0 docs

usamoi · usamoi · commit 4d88f31b97d5 · 2025-05-22T13:52:06.000+08:00
Signed-off-by: usamoi <usamoi@outlook.com>
diff --git a/src/vectorchord/usage/advanced-features.md b/src/vectorchord/usage/advanced-features.md
@@ -12,16 +12,6 @@ To improve performance for the first query, you can try the following SQL that p
 SELECT vchordrq_prewarm('gist_train_embedding_idx')
 ```
 
-The indexing of each specific dimension in VectorChord requires preprocessing, which can increase the latency of the first query. This can be mitigated by modifying GUC `vchordrq.prewarm_dim` to perform the preprocessing ahead of time during PostgreSQL cluster startup.
-
-```SQL
--- Add your vector dimensions to the `prewarm_dim` list to reduce latency.
--- If this is not configured, the first query will have higher latency as the matrix is generated on demand.
--- Default value: '64,128,256,384,512,768,1024,1536'
--- Note: This setting requires a database restart to take effect.
-ALTER SYSTEM SET vchordrq.prewarm_dim = '64,128,256,384,512,768,1024,1536';
-```
-
 ## Indexing Progress
 
 You can check the indexing progress by querying the `pg_stat_progress_create_index` view.
diff --git a/src/vectorchord/usage/indexing.md b/src/vectorchord/usage/indexing.md
@@ -67,7 +67,6 @@ When dealing with large datasets (> $10^6$ vectors), please follow these guideli
 - Example:
     - `residual_quantization = false` means that residual quantization is not used.
     - `residual_quantization = true` means that residual quantization is used.
-- Note: set `residual_quantization` to `true` if your model generates embeddings where the metric is Euclidean distance. This option only works for L2 distance. Using it with other distance metrics will result in an error in building.
 
 ### Internal Build Parameters
 
diff --git a/src/vectorchord/usage/performance-tuning.md b/src/vectorchord/usage/performance-tuning.md
@@ -43,13 +43,6 @@ SET vchordrq.probes = 100;
 -- If you need a less precise query, setting it to 1.0 may be appropriate.
 -- Recommended range: 1.0–1.9. Default value is 1.9.
 SET vchordrq.epsilon = 1.9;
-
--- vchordrq relies on a projection matrix to optimize performance.
--- Add your vector dimensions to the `prewarm_dim` list to reduce latency.
--- If this is not configured, the first query will have higher latency as the matrix is generated on demand.
--- Default value: '64,128,256,384,512,768,1024,1536'
--- Note: This setting requires a database restart to take effect.
-ALTER SYSTEM SET vchordrq.prewarm_dim = '64,128,256,384,512,768,1024,1536';
 ```
 
 And for postgres's setting
diff --git a/src/vectorchord/usage/search.md b/src/vectorchord/usage/search.md
@@ -37,7 +37,6 @@ SELECT 1 FROM items WHERE category_id = 1 ORDER BY embedding <#> '[0.5,0.5,0.5]'
     - If `lists = []`, then probes must be an empty list.
     - If `lists = [11, 22]`, then probes can be 2,4 or 4,8, but it must not be an empty list, `3`, `7,8,9`, or `5,5,5,5`.
 
-
 #### `vchordrq.epsilon`
     
 - Description: Even after pruning, the number of retrieved vectors remains substantial. The index employs the RaBitQ algorithm to quantize vectors into bit vectors, which require just $\frac{1}{32}$ the memory of single-precision floating-point vectors. With minimal floating-point operations, most computations are integer-based, leading to faster processing. Unlike typical quantization algorithms, RaBitQ not only estimates distances but also their lower bounds. The index computes the lower bound for each vector and dynamically adjusts the number of vectors needing recalculated distances, based on the query count, thus balancing performance and accuracy. The GUC parameter `vchordrq.epsilon` controls the conservativeness of the lower bounds of distances. The higher the value, the higher the accuracy, but the worse the performance. The default value provides unnecessarily high accuracy for most indexes, so you can try lowering this parameter to achieve better performance.
@@ -51,11 +50,38 @@ SELECT 1 FROM items WHERE category_id = 1 ORDER BY embedding <#> '[0.5,0.5,0.5]'
 
 You can refer to [performance tuning](../usage/performance-tuning#query-performance) for more information about tuning the query performance.
 
-#### `vchordrq.prewarm_dim`
-    
-- Description: The `vchordrq.prewarm_dim` GUC parameter is used to precompute the RaBitQ projection matrix for the specified dimensions. This can help to reduce the latency of the first query after the PostgreSQL cluster is started.
-- Type: list of integers
-- Default: `64,128,256,384,512,768,1024,1536`
-- Example:
-    - `ALTER SYSTEM SET vchordrq.prewarm_dim = '64,128'` means that the projection matrix will be precomputed for dimensions 64 and 128.
-- Note: This setting requires a database restart to take effect.
+#### `vchordrq.prefilter`
+
+- Description: The `vchordrq.prefilter` GUC parameter enables condition evaluation before distance computation. For example, in the query `SELECT * FROM items WHERE id % 2 = 0 ORDER BY embedding <-> '[3,1,2]' LIMIT 5`, the index normally computes all useful `embedding <-> '[3,1,2]'` distances first and then passes the rows to PostgreSQL, which filters out rows where `id % 2 != 0`. This parameter allows the index to pre-evaluate the condition and discard non-matching rows before computing their distances, improving query efficiency.
+- Type: boolean
+- Default: `false`
+
+#### `vchordrq.search_rerank`
+
+- Description: This GUC parameter controls the I/O prefetching strategy for reading bit vectors in vector search, which can significantly impact search performance on disk-based vectors.
+- Type: string
+- Domain: Depends on PostgreSQL version
+    - PostgreSQL 13, 14, 15, 16: `{"read_buffer", "prefetch_buffer"}`
+    - PostgreSQL 17: `{"read_buffer", "prefetch_buffer", "read_stream"}`
+- Default: Depends on PostgreSQL version
+    - PostgreSQL 13, 14, 15, 16: `prefetch_buffer`
+    - PostgreSQL 17: `read_stream`
+- Note:
+    - `read_buffer` indicates a preference for `ReadBuffer`.
+    - `prefetch_buffer` indicates a preference for both `PrefetchBuffer` and `ReadBuffer`.
+    - `read_stream` indicates a preference for `read_stream`.
+
+#### `vchordrq.io_rerank`
+
+- Description: This GUC parameter controls the I/O prefetching strategy for reading full precision vectors in vector search, which can significantly impact search performance on disk-based vectors.
+- Type: string
+- Domain: Depends on PostgreSQL version
+    - PostgreSQL 13, 14, 15, 16: `{"read_buffer", "prefetch_buffer"}`
+    - PostgreSQL 17: `{"read_buffer", "prefetch_buffer", "read_stream"}`
+- Default: Depends on PostgreSQL version
+    - PostgreSQL 13, 14, 15, 16: `prefetch_buffer`
+    - PostgreSQL 17: `read_stream`
+- Note:
+    - `read_buffer` indicates a preference for `ReadBuffer`.
+    - `prefetch_buffer` indicates a preference for both `PrefetchBuffer` and `ReadBuffer`.
+    - `read_stream` indicates a preference for `read_stream`.