Skip to content

Commit

Permalink
Merge pull request #1195 from qdrant/capacity-planning
Browse files Browse the repository at this point in the history
[Docs] Capacity Planning
  • Loading branch information
davidmyriel authored Sep 27, 2024
2 parents 48a55e4 + 425a69d commit 8f2f4c2
Show file tree
Hide file tree
Showing 3 changed files with 111 additions and 73 deletions.
71 changes: 0 additions & 71 deletions qdrant-landing/content/documentation/cloud/capacity-sizing.md

This file was deleted.

4 changes: 2 additions & 2 deletions qdrant-landing/content/documentation/cloud/create-cluster.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ A free tier cluster only includes 1 single node with the following resources:
| Disk space | 4 GB |
| Nodes | 1 |

This configuration supports serving about 1 M vectors of 768 dimensions. To calculate your needs, refer to our documentation on [Capacity and sizing](/documentation/cloud/capacity-sizing/).
This configuration supports serving about 1 M vectors of 768 dimensions. To calculate your needs, refer to our documentation on [Capacity Planning](/documentation/guides/capacity-planning/).

The choice of cloud providers and regions is limited.

Expand Down Expand Up @@ -73,7 +73,7 @@ This page shows you how to use the Qdrant Cloud Console to create a custom Qdran

1. Choose your data center region or Hybrid Cloud environment.
1. Configure RAM for each node.
> For more information, see our [**Capacity and Sizing**](/documentation/cloud/capacity-sizing/) guidance.
> For more information, see our [Capacity Planning](/documentation/guides/capacity-planning/) guidance.
1. Choose the number of vCPUs per node. If you add more
RAM, the menu provides different options for vCPUs.
1. Select the number of nodes you want the cluster to be deployed on.
Expand Down
109 changes: 109 additions & 0 deletions qdrant-landing/content/documentation/guides/capacity-planning.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,109 @@
---
title: Capacity Planning
weight: 11
aliases:
- capacity
- /documentation/cloud/capacity-sizing
---
# Capacity Planning

When setting up your cluster, you'll need to figure out the right balance of **RAM** and **disk storage**. The best setup depends on a few things:

- How many vectors you have and their dimensions.
- The amount of payload data you're using and their indexes.
- What data you want to store in memory versus on disk.
- Your cluster's replication settings.
- Whether you're using quantization and how you’ve set it up.

## Calculating RAM size

You should store frequently accessed data in RAM for faster retrieval. If you want to keep all vectors in memory for optimal performance, you can use this rough formula for estimation:

```text
memory_size = number_of_vectors * vector_dimension * 4 bytes * 1.5
```

At the end, we multiply everything by 1.5. This extra 50% accounts for metadata (such as indexes and point versions) and temporary segments created during optimization.

Let's say you want to store 1 million vectors with 1024 dimensions:

```text
memory_size = 1,000,000 * 1024 * 4 bytes * 1.5
```
The memory_size is approximately 6,144,000,000 bytes, or about 5.72 GB.

Depending on the use case, large datasets can benefit from reduced memory requirements via [quantization](/documentation/guides/quantization/).

## Calculating payload size

This is always different. The size of the payload depends on the [structure and content of your data](/documentation/concepts/payload/#payload-types). For instance:

- **Text fields** consume space based on length and encoding (e.g. a large chunk of text vs a few words).
- **Floats** have fixed sizes of 8 bytes for `int64` or `float64`.
- **Boolean fields** typically consume 1 byte.

<aside role="alert">
The easiest way to calculate your payload size is to use a JSON size calculator.
</aside>

Calculating total payload size is similar to vectors. We have to multiply it by 1.5 for back-end indexing processes.

```text
total_payload_size = number_of_points * payload_size * 1.5
```

Let's say you want to store 1 million points with JSON payloads of 5KB:

```text
total_payload_size = 1,000,000 * 5KB * 1.5
```
The total_payload_size is approximately 5,000,000 bytes, or about 4.77 GB.

## Choosing disk over RAM

For optimal performance, you should store only frequently accessed data in RAM. The rest should be offloaded to the disk. For example, extra payload fields that you don't use for filtering can be stored on disk.

Only [indexed fields](../../concepts/indexing/#payload-index) should be stored in RAM. You can read more about payload storage in the [Storage](../../concepts/storage/#payload-storage) section.

### Storage-focused configuration

If your priority is to handle large volumes of vectors with average search latency, it's recommended to configure [memory-mapped (mmap) storage](/documentation/concepts/storage/#configuring-memmap-storage). In this setup, vectors are stored on disk in memory-mapped files, while only the most frequently accessed vectors are cached in RAM.

The amount of available RAM greatly impacts search performance. As a general rule, if you store half as many vectors in RAM, search latency will roughly double.

Disk speed is also crucial. [Contact us](/documentation/support/) if you have specific requirements for high-volume searches in our Cloud.

### Subgroup-oriented configuration

If your use case involves splitting vectors into multiple collections or subgroups based on payload values (e.g., serving searches for multiple users, each with their own subset of vectors), memory-mapped storage is recommended.

In this scenario, only the active subset of vectors will be cached in RAM, allowing for fast searches for the most recent and active users. You can estimate the required memory size as:

```text
memory_size = number_of_active_vectors * vector_dimension * 4 bytes * 1.5
```

Please refer to our [multitenancy](/documentation/guides/multiple-partitions/) documentation for more details on partitioning data in a Qdrant.

## Scaling disk space in Qdrant Cloud

Clusters supporting vector search require substantial disk space compared to other search systems. If you're running low on disk space, you can use the UI at [cloud.qdrant.io](https://cloud.qdrant.io/) to **Scale Up** your cluster.

<aside role="status">Note: If you increase disk space via the Qdrant UI, you cannot reduce it later.</aside>

When running low on disk space, consider the following benefits of scaling up:

- **Larger Datasets**: Supports larger datasets, which can improve the relevance and quality of search results.
- **Improved Indexing**: Enables the use of advanced indexing strategies like HNSW.
- **Caching**: Enhances speed by having more RAM, allowing more frequently accessed data to be cached.
- **Backups and Redundancy**: Facilitates more frequent backups, which is a key advantage for data safety.

Always remember to add 50% of the vector size. This would account for things like indexes and auxiliary data used during operations such as vector insertion, deletion, and search. Thus, the estimated memory size including metadata is:

```text
total_vector_size = number_of_dimensions * 4 bytes * 1.5
```

**Disclaimer**

The above calculations are estimates at best. If you're looking for more accurate numbers, you should always test your data set in practice.

0 comments on commit 8f2f4c2

Please sign in to comment.