Commit 4d88f31

add 0.4.0 docs
Signed-off-by: usamoi <[email protected]>
1 parent 750df4f commit 4d88f31

4 files changed: 35 additions and 27 deletions

src/vectorchord/usage/advanced-features.md

Lines changed: 0 additions & 10 deletions
@@ -12,16 +12,6 @@ To improve performance for the first query, you can try the following SQL that p
 SELECT vchordrq_prewarm('gist_train_embedding_idx')
 ```
 
-The indexing of each specific dimension in VectorChord requires preprocessing, which can increase the latency of the first query. This can be mitigated by modifying GUC `vchordrq.prewarm_dim` to perform the preprocessing ahead of time during PostgreSQL cluster startup.
-
-```SQL
--- Add your vector dimensions to the `prewarm_dim` list to reduce latency.
--- If this is not configured, the first query will have higher latency as the matrix is generated on demand.
--- Default value: '64,128,256,384,512,768,1024,1536'
--- Note: This setting requires a database restart to take effect.
-ALTER SYSTEM SET vchordrq.prewarm_dim = '64,128,256,384,512,768,1024,1536';
-```
-
 ## Indexing Progress
 
 You can check the indexing progress by querying the `pg_stat_progress_create_index` view.

src/vectorchord/usage/indexing.md

Lines changed: 0 additions & 1 deletion
@@ -67,7 +67,6 @@ When dealing with large datasets (> $10^6$ vectors), please follow these guideli
 - Example:
 - `residual_quantization = false` means that residual quantization is not used.
 - `residual_quantization = true` means that residual quantization is used.
-- Note: set `residual_quantization` to `true` if your model generates embeddings where the metric is Euclidean distance. This option only works for L2 distance. Using it with other distance metrics will result in an error in building.
 
 ### Internal Build Parameters
 

src/vectorchord/usage/performance-tuning.md

Lines changed: 0 additions & 7 deletions
@@ -43,13 +43,6 @@ SET vchordrq.probes = 100;
 -- If you need a less precise query, setting it to 1.0 may be appropriate.
 -- Recommended range: 1.0–1.9. Default value is 1.9.
 SET vchordrq.epsilon = 1.9;
-
--- vchordrq relies on a projection matrix to optimize performance.
--- Add your vector dimensions to the `prewarm_dim` list to reduce latency.
--- If this is not configured, the first query will have higher latency as the matrix is generated on demand.
--- Default value: '64,128,256,384,512,768,1024,1536'
--- Note: This setting requires a database restart to take effect.
-ALTER SYSTEM SET vchordrq.prewarm_dim = '64,128,256,384,512,768,1024,1536';
 ```
 
 And for postgres's setting

src/vectorchord/usage/search.md

Lines changed: 35 additions & 9 deletions
@@ -37,7 +37,6 @@ SELECT 1 FROM items WHERE category_id = 1 ORDER BY embedding <#> '[0.5,0.5,0.5]'
 - If `lists = []`, then probes must be an empty list.
 - If `lists = [11, 22]`, then probes can be 2,4 or 4,8, but it must not be an empty list, `3`, `7,8,9`, or `5,5,5,5`.
 
-
 #### `vchordrq.epsilon`
 
 - Description: Even after pruning, the number of retrieved vectors remains substantial. The index employs the RaBitQ algorithm to quantize vectors into bit vectors, which require just $\frac{1}{32}$ the memory of single-precision floating-point vectors. With minimal floating-point operations, most computations are integer-based, leading to faster processing. Unlike typical quantization algorithms, RaBitQ not only estimates distances but also their lower bounds. The index computes the lower bound for each vector and dynamically adjusts the number of vectors needing recalculated distances, based on the query count, thus balancing performance and accuracy. The GUC parameter `vchordrq.epsilon` controls the conservativeness of the lower bounds of distances. The higher the value, the higher the accuracy, but the worse the performance. The default value provides unnecessarily high accuracy for most indexes, so you can try lowering this parameter to achieve better performance.
@@ -51,11 +50,38 @@ SELECT 1 FROM items WHERE category_id = 1 ORDER BY embedding <#> '[0.5,0.5,0.5]'
 
 You can refer to [performance tuning](../usage/performance-tuning#query-performance) for more information about tuning the query performance.
 
-#### `vchordrq.prewarm_dim`
-
-- Description: The `vchordrq.prewarm_dim` GUC parameter is used to precompute the RaBitQ projection matrix for the specified dimensions. This can help to reduce the latency of the first query after the PostgreSQL cluster is started.
-- Type: list of integers
-- Default: `64,128,256,384,512,768,1024,1536`
-- Example:
-- `ALTER SYSTEM SET vchordrq.prewarm_dim = '64,128'` means that the projection matrix will be precomputed for dimensions 64 and 128.
-- Note: This setting requires a database restart to take effect.
+#### `vchordrq.prefilter`
+
+- Description: The `vchordrq.prefilter` GUC parameter enables condition evaluation before distance computation. For example, in the query `SELECT * FROM items WHERE id % 2 = 0 ORDER BY embedding <-> '[3,1,2]' LIMIT 5`, the index normally computes all useful `embedding <-> '[3,1,2]'` distances first and then passes the rows to PostgreSQL, which filters out rows where `id % 2 != 0`. This parameter allows the index to pre-evaluate the condition and discard non-matching rows before computing their distances, improving query efficiency.
+- Type: boolean
+- Default: `false`
+
+#### `vchordrq.search_rerank`
+
+- Description: This GUC parameter controls the I/O prefetching strategy for reading bit vectors in vector search, which can significantly impact search performance on disk-based vectors.
+- Type: string
+- Domain: Depends on PostgreSQL version
+- PostgreSQL 13, 14, 15, 16: `{"read_buffer", "prefetch_buffer"}`
+- PostgreSQL 17: `{"read_buffer", "prefetch_buffer", "read_stream"}`
+- Default: Depends on PostgreSQL version
+- PostgreSQL 13, 14, 15, 16: `prefetch_buffer`
+- PostgreSQL 17: `read_stream`
+- Note:
+- `read_buffer` indicates a preference for `ReadBuffer`.
+- `prefetch_buffer` indicates a preference for both `PrefetchBuffer` and `ReadBuffer`.
+- `read_stream` indicates a preference for `read_stream`.
+
+#### `vchordrq.io_rerank`
+
+- Description: This GUC parameter controls the I/O prefetching strategy for reading full precision vectors in vector search, which can significantly impact search performance on disk-based vectors.
+- Type: string
+- Domain: Depends on PostgreSQL version
+- PostgreSQL 13, 14, 15, 16: `{"read_buffer", "prefetch_buffer"}`
+- PostgreSQL 17: `{"read_buffer", "prefetch_buffer", "read_stream"}`
+- Default: Depends on PostgreSQL version
+- PostgreSQL 13, 14, 15, 16: `prefetch_buffer`
+- PostgreSQL 17: `read_stream`
+- Note:
+- `read_buffer` indicates a preference for `ReadBuffer`.
+- `prefetch_buffer` indicates a preference for both `PrefetchBuffer` and `ReadBuffer`.
+- `read_stream` indicates a preference for `read_stream`.
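
Usage sketch (not part of the commit): the parameters documented in `search.md` could be tried out roughly as follows. This assumes they can be set per session with `SET`, the way `vchordrq.probes` and `vchordrq.epsilon` are set in `performance-tuning.md`; the commit itself does not confirm this. The `items` table and the query are taken from the `vchordrq.prefilter` description, and `read_buffer` is used because it is listed as valid on every PostgreSQL version.

```SQL
-- Assumption: both GUCs are session-settable, like vchordrq.probes.

-- Ask the index to evaluate the WHERE condition before computing distances.
SET vchordrq.prefilter = on;

-- Prefer plain ReadBuffer when reading full-precision vectors
-- (listed as valid on PostgreSQL 13 through 17).
SET vchordrq.io_rerank = 'read_buffer';

-- Example query from the vchordrq.prefilter description.
SELECT * FROM items WHERE id % 2 = 0 ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

-- Restore the defaults for this session.
RESET vchordrq.prefilter;
RESET vchordrq.io_rerank;
```

Whether prefiltering helps likely depends on how selective and how cheap the condition is, so comparing timings with the setting on and off (for example with `EXPLAIN ANALYZE`) is a reasonable first check.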
