From 1ab170f97a02ef71751dd2cfb1defcdcaa655c1f Mon Sep 17 00:00:00 2001
From: Dorsa Hasanlee <49491638+iamdorsa@users.noreply.github.com>
Date: Tue, 27 Aug 2024 22:37:33 +0330
Subject: [PATCH 1/3] docs: Correct segment merging documentation

---
 qdrant-landing/content/documentation/concepts/optimizer.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/qdrant-landing/content/documentation/concepts/optimizer.md b/qdrant-landing/content/documentation/concepts/optimizer.md
index f3b401180..c2fca98c6 100644
--- a/qdrant-landing/content/documentation/concepts/optimizer.md
+++ b/qdrant-landing/content/documentation/concepts/optimizer.md
@@ -50,7 +50,7 @@ Such segments, for example, are created as copy-on-write segments during optimiz
 It is also essential to have at least one small segment that Qdrant will use to store frequently updated data.
 On the other hand, too many small segments lead to suboptimal search performance.
 
-There is the Merge Optimizer, which combines the smallest segments into one large segment. It is used if too many segments are created.
+Qdrant uses a parameter called max_segment_size to control the size of segments. Increasing this value allows the creation of larger segments, reducing the number of segments and potentially improving search performance.
 
 The criteria for starting the optimizer are defined in the configuration file.
 
@@ -59,8 +59,8 @@ Here is an example of parameter values:
 ```yaml
 storage:
   optimizers:
-    # If the number of segments exceeds this value, the optimizer will merge the smallest segments.
-    max_segment_number: 5
+    # This parameter defines the maximum size of a segment. Increasing this value can help in reducing the number of segments.
+    max_segment_size:
 ```
 
 ## Indexing Optimizer

From b3f4b1ac247e05657890d3fec875631bc450ff65 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim=20Vis=C3=A9e?=
Date: Wed, 27 Nov 2024 16:42:48 +0100
Subject: [PATCH 2/3] Update configuration to match qdrant/qdrant

---
 .../documentation/concepts/optimizer.md | 21 +++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/qdrant-landing/content/documentation/concepts/optimizer.md b/qdrant-landing/content/documentation/concepts/optimizer.md
index c2fca98c6..0d276125d 100644
--- a/qdrant-landing/content/documentation/concepts/optimizer.md
+++ b/qdrant-landing/content/documentation/concepts/optimizer.md
@@ -59,8 +59,25 @@ Here is an example of parameter values:
 ```yaml
 storage:
   optimizers:
-    # This parameter defines the maximum size of a segment. Increasing this value can help in reducing the number of segments.
-    max_segment_size:
+    # Target amount of segments optimizer will try to keep.
+    # Real amount of segments may vary depending on multiple parameters:
+    # - Amount of stored points
+    # - Current write RPS
+    #
+    # It is recommended to select default number of segments as a factor of the number of search threads,
+    # so that each segment would be handled evenly by one of the threads.
+    # If `default_segment_number = 0`, will be automatically selected by the number of available CPUs
+    default_segment_number: 0
+
+    # Do not create segments larger this size (in KiloBytes).
+    # Large segments might require disproportionately long indexation times,
+    # therefore it makes sense to limit the size of segments.
+    #
+    # If indexation speed have more priority for your - make this parameter lower.
+    # If search speed is more important - make this parameter higher.
+    # Note: 1Kb = 1 vector of size 256
+    # If not set, will be automatically selected considering the number of available CPUs.
+    max_segment_size_kb: null
 ```
 
 ## Indexing Optimizer

From 418f7d80f6144579b3ea9786aa5c5871e1c38140 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim=20Vis=C3=A9e?=
Date: Wed, 27 Nov 2024 16:59:34 +0100
Subject: [PATCH 3/3] Improve merge optimizer description

---
 .../content/documentation/concepts/optimizer.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/qdrant-landing/content/documentation/concepts/optimizer.md b/qdrant-landing/content/documentation/concepts/optimizer.md
index 0d276125d..de6d09fef 100644
--- a/qdrant-landing/content/documentation/concepts/optimizer.md
+++ b/qdrant-landing/content/documentation/concepts/optimizer.md
@@ -50,7 +50,15 @@ Such segments, for example, are created as copy-on-write segments during optimiz
 It is also essential to have at least one small segment that Qdrant will use to store frequently updated data.
 On the other hand, too many small segments lead to suboptimal search performance.
 
-Qdrant uses a parameter called max_segment_size to control the size of segments. Increasing this value allows the creation of larger segments, reducing the number of segments and potentially improving search performance.
+The merge optimizer constantly tries to reduce the number of segments if there
+currently are too many. The desired number of segments is specified
+with `default_segment_number` and defaults to the number of CPUs. The optimizer
+may takes at least the three smallest segments and merges them into one.
+
+Segments will not be merged if they'll exceed the maximum configured segment
+size with `max_segment_size_kb`. It prevents creating segments that are too
+large to efficiently index. Increasing this number may help to reduce the number
+of segments if you have a lot of data, and can potentially improve search performance.
 
 The criteria for starting the optimizer are defined in the configuration file.
 