diff --git a/qdrant-landing/content/articles/what-is-a-vector-database.md b/qdrant-landing/content/articles/what-is-a-vector-database.md
index 5b5add8c5..d27d5f89f 100644
--- a/qdrant-landing/content/articles/what-is-a-vector-database.md
+++ b/qdrant-landing/content/articles/what-is-a-vector-database.md
@@ -1,216 +1,499 @@
---
-title: "What is a Vector Database?"
+title: "An Introduction to Vector Databases"
draft: false
-slug: what-is-a-vector-database?
short_description: What is a Vector Database? Use Cases & Examples | Qdrant
-description: Discover what a vector database is, its core functionalities, and real-world applications. Unlock advanced data management with our comprehensive guide.
+description: Discover what a vector database is, its core functionalities, and real-world applications.
preview_dir: /articles_data/what-is-a-vector-database/preview
-weight: -100
-social_preview_image: /articles_data/what-is-a-vector-database/preview/social-preview.jpg
+weight: -211
+social_preview_image: /articles_data/what-is-a-vector-database/preview/social_preview.png
small_preview_image: /articles_data/what-is-a-vector-database/icon.svg
-date: 2024-01-25T09:29:33-03:00
+date: 2024-10-09T09:29:33-03:00
+aliases: [ /blog/what-is-a-vector-database/ ]
author: Sabrina Aquino
featured: true
tags:
- vector-search
- vector-database
- embeddings
-
-aliases: [ /blog/what-is-a-vector-database/ ]
---
-> A [Vector Database](https://qdrant.tech/qdrant-vector-database/) is a specialized database system designed for efficiently indexing, querying, and retrieving high-dimensional vector data. Those systems enable advanced data analysis and similarity-search operations that extend well beyond the traditional, structured query approach of conventional databases.
+## What Is a Vector Database?
+
+![vector-database-architecture](/articles_data/what-is-a-vector-database/vector-database-1.jpeg)
+
+A [Vector Database](https://qdrant.tech/qdrant-vector-database/) is a specialized system designed to efficiently handle high-dimensional vector data. It excels at indexing, querying, and retrieving this data, enabling advanced analysis and similarity searches that traditional databases cannot easily perform.
+
+Most of the millions of terabytes of data we generate each day is **unstructured**. Think of the meal photos you snap, the PDFs shared at work, or the podcasts you save but may never listen to. None of it fits neatly into rows and columns.
+
+Unstructured data lacks a strict format or schema, making it challenging for conventional databases to manage. Yet, this unstructured data holds immense potential for **AI**, **machine learning**, and **modern search engines**.
+
+> Vector databases are built to unravel this complexity, allowing us to extract meaning and find connections within vast, unstructured datasets.
+
+### The Challenge with Traditional Databases
+
+Traditional [OLTP](https://www.ibm.com/topics/oltp) and [OLAP](https://www.ibm.com/topics/olap) databases have been the backbone of data storage for decades. They are great at managing structured data with well-defined schemas, like `name`, `address`, `phone number`, and `purchase history`.
+
+
+
+But when data can't be easily categorized, like the content inside a PDF file, things start to get complicated.
+
+You can always store the PDF file as raw data, perhaps with some metadata attached. However, the database still wouldn’t be able to understand what's inside the document, categorize it, or even search for the information that it contains.
+
+Also, this applies to more than just PDF documents. Think about the vast amounts of text, audio, and image data you generate every day. If a database can’t grasp the **meaning** of this data, how can you search for or find relationships within the data?
+
+
+
+This is where vector databases come in. They can help you understand the **context** or **conceptual similarity** of unstructured data by representing the data as **vectors**.
+
+## When to Use a Vector Database
+
+Not sure if you should use a vector database or a traditional database? This chart may help.
+
+| **Feature** | **OLTP Database** | **OLAP Database** | **Vector Database** |
+|---------------------|--------------------------------------|--------------------------------------------|--------------------------------------------|
+| **Data Structure** | Rows and columns | Rows and columns | Vectors |
+| **Type of Data** | Structured | Structured/Partially Unstructured | Unstructured |
+| **Query Method** | SQL-based (Transactional Queries) | SQL-based (Aggregations, Analytical Queries) | Vector Search (Similarity-Based) |
+| **Storage Focus** | Schema-based, optimized for updates | Schema-based, optimized for reads | Context and Semantics |
+| **Performance** | Optimized for high-volume transactions | Optimized for complex analytical queries | Optimized for unstructured data retrieval |
+| **Use Cases** | Inventory, order processing, CRM | Business intelligence, data warehousing | Similarity search, recommendations, RAG, anomaly detection, etc. |
+
-The data flood is real.
+## What Is a Vector?
-In 2024, we're drowning in unstructured data like images, text, and audio, that don’t fit into neatly organized tables. Still, we need a way to easily tap into the value within this chaos of almost 330 million terabytes of data being created each day.
+![vector-database-vector](/articles_data/what-is-a-vector-database/vector-database-7.jpeg)
-Traditional databases, even with extensions that provide some vector handling capabilities, struggle with the complexities and demands of high-dimensional vector data.
+When a machine needs to process unstructured data, such as an image, a piece of text, or an audio file, it first has to translate that data into a format it can work with: **vectors**.
-Handling of vector data is extremely resource-intensive. A traditional vector is around 6Kb. You can see how scaling to millions of vectors can demand substantial system memory and computational resources. Which is at least very challenging for traditional [OLTP](https://www.ibm.com/topics/oltp) and [OLAP](https://www.ibm.com/topics/olap) databases to manage.
+> A **vector** is a numerical representation of data that can capture the **context** and **semantics** of data.
-![](/articles_data/what-is-a-vector-database/Why-Use-Vector-Database.jpg)
+When you deal with unstructured data, traditional databases struggle to understand its meaning. However, a vector can translate that data into something a machine can process. For example, a vector generated from text can represent relationships and meaning between words, making it possible for a machine to compare and understand their context.
-Vector databases allow you to understand the **context** or **conceptual similarity** of unstructured data by representing them as **vectors**, enabling advanced analysis and retrieval based on data similarity.
+Three key elements define a vector in a vector database: the **ID**, the **dimensions**, and the **payload**. Together, they form a **point**, which is the core unit of data stored and retrieved in a vector database.
-For example, in recommendation systems, vector databases can analyze user behavior and item characteristics to suggest products or content with a high degree of personal relevance.
+
-In search engines and research databases, they enhance the user experience by providing results that are **semantically** similar to the query. They do not rely solely on the exact words typed into the search bar.
+Each one of these parts plays an important role in how vectors are stored, retrieved, and interpreted. Let's see how.
-If you're new to the vector search space, this article explains the key concepts and relationships that you need to know.
+### 1. The ID: Your Vector’s Unique Identifier
-So let's get into it.
+Just like in a relational database, each vector in a vector database gets a unique ID. Think of it as your vector’s name tag, a **primary key** that ensures the vector can be easily found later. When a vector is added to the database, the ID is created automatically.
+While the ID itself doesn't play a part in the similarity search (which operates on the vector's numerical data), it is essential for associating the vector with its corresponding "real-world" data, whether that’s a document, an image, or a sound file.
-## What is Vector Data?
+After a search is performed and similar vectors are found, their IDs are returned. These can then be used to **fetch additional details or metadata** tied to the result.
-To understand vector databases, let's begin by defining what is a 'vector' or 'vector data'.
+### 2. The Dimensions: The Core Representation of the Data
-Vectors are a **numerical representation** of some type of complex information.
+At the core of every vector is a set of numbers, which together form a representation of the data in a **multi-dimensional** space.
-To represent textual data, for example, it will encapsulate the nuances of language, such as semantics and context.
+#### From Text to Vectors: How Does It Work?
-With an image, the vector data encapsulates aspects like color, texture, and shape. The **dimensions** relate to the complexity and the amount of information each image contains.
+These numbers are generated by **embedding models**, such as deep learning algorithms, and capture the essential patterns or relationships within the data. That's why the term **embedding** is often used interchangeably with vector when referring to the output of these models.
-Each pixel in an image can be seen as one dimension, as it holds data (like color intensity values for red, green, and blue channels in a color image). So even a small image with thousands of pixels translates to thousands of dimensions.
+To represent textual data, for example, an embedding will encapsulate the nuances of language, such as semantics and context within its dimensions.
-So from now on, when we talk about high-dimensional data, we mean that the data contains a large number of data points (pixels, features, semantics, syntax).
+
-The **creation** of vector data (so we can store this high-dimensional data on our vector database) is primarily done through **embeddings**.
+For that reason, when comparing two similar sentences, their embeddings will turn out to be very similar, because they have similar **linguistic elements**.
-![](/articles_data/what-is-a-vector-database/Vector-Data.jpg)
+
-### How do Embeddings Work?
+That’s the beauty of embeddings. The complexity of the data is distilled into something that can be compared across a multi-dimensional space.
-[Embeddings](https://qdrant.tech/articles/what-are-embeddings/) translate this high-dimensional data into a more manageable, **lower-dimensional** vector form that's more suitable for machine learning and data processing applications, typically through **neural network models**.
+### 3. The Payload: Adding Context with Metadata
-In creating dimensions for text, for example, the process involves analyzing the text to capture its linguistic elements.
+Sometimes you're going to need more than just numbers to fully understand or refine a search. While the dimensions capture the essence of the data, the payload holds **metadata** for structured information.
-Transformer-based neural networks like **BERT** (Bidirectional Encoder Representations from Transformers) and **GPT** (Generative Pre-trained Transformer), are widely used for creating text embeddings.
+It could be textual data like descriptions, tags, categories, or it could be numerical values like dates or prices. This extra information is vital when you want to filter or rank search results based on criteria that aren’t directly encoded in the vector.
-Each layer extracts different levels of features, such as context, semantics, and syntax.
+> This metadata is invaluable when you need to apply additional **filters** or **sorting** criteria.
-![](/articles_data/what-is-a-vector-database/How-Do-Embeddings-Work_.jpg)
+For example, if you’re searching for a picture of a dog, the vector helps the database find images that are visually similar. But let's say you want results showing only images taken within the last year, or those tagged with “vacation.”
+
-The final layers of the network condense this information into a vector that is a compact, lower-dimensional representation of the image but still retains the essential information.
+The payload can help you narrow down those results by ignoring vectors that don't match your filtering criteria. If you want the full picture of how filtering works in Qdrant, check out our [Complete Guide to Filtering](https://qdrant.tech/articles/vector-search-filtering/).
+## The Architecture of a Vector Database
+
+A vector database is made of multiple different entities and relations. Let's understand a bit of what's happening here:
+
+
+### Collections
+
+A [collection](https://qdrant.tech/documentation/concepts/collections/) is essentially a group of **vectors** (or “[points](https://qdrant.tech/documentation/concepts/points/)”) that are logically grouped together **based on similarity or a specific task**. Every vector within a collection shares the same dimensionality and can be compared using a single metric. Avoid creating multiple collections unless necessary; instead, consider techniques like **sharding** for scaling across nodes or **multitenancy** for handling different use cases within the same infrastructure.
+
+### Distance Metrics
+
+Distance metrics define how similarity between vectors is calculated. You choose the metric when creating a collection, and the right choice depends on the type of data you’re working with and how the vectors were created. Here are the three most common distance metrics:
+
+- **Euclidean Distance:** The straight-line path. It’s like measuring the physical distance between two points in space. Pick this one when the actual distance (like spatial data) matters.
+
+- **Cosine Similarity:** This one is about the angle, not the length. It measures how closely two vectors point in the same direction, so it works well for text or documents when you care more about meaning than magnitude. A score near 1 means two things are *similar*, a score near -1 means they are *opposite*, and a score near 0 means they are *unrelated*.
+
+
+
+- **Dot Product:** This looks at how much two vectors align. It’s popular in recommendation systems where you're interested in how much two things “agree” with each other.
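
The three metrics above can be sketched in a few lines of plain Python. This is only an illustration of the math, not how Qdrant computes distances internally; the function names and the sample vectors are made up for the example.

```python
import math


def dot(a, b):
    """Dot product: sum of the element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Cosine of the angle between two vectors, always in [-1, 1]."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical 2-dimensional vectors chosen to make the results easy to read
a = [1.0, 0.0]
same_direction = [3.0, 0.0]   # longer, but points the same way
opposite = [-2.0, 0.0]        # points the other way
orthogonal = [0.0, 5.0]       # at a right angle to `a`

print(cosine(a, same_direction), cosine(a, opposite), cosine(a, orthogonal))
print(euclidean(a, same_direction), dot(a, same_direction))
```

Note that cosine similarity ignores the length of `same_direction`, while Euclidean distance and dot product both change with it.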
+
+### RAM-Based and Memmap Storage
+
+By default, Qdrant stores vectors in RAM, delivering incredibly fast access for datasets that fit comfortably in memory. But when your dataset exceeds RAM capacity, Qdrant offers Memmap as an alternative.
+
+Memmap allows you to store vectors **on disk**, yet still access them efficiently by mapping the data directly into memory if you have enough RAM. To enable it, you only need to set `"on_disk": true` when you are **creating a collection:**
+
+```python
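+from qdrant_client import QdrantClient, models
+
+# Assumes a Qdrant instance is reachable at this URL
+client = QdrantClient(url="http://localhost:6333")
+
+client.create_collection(
+    collection_name="{collection_name}",
+    vectors_config=models.VectorParams(
+        size=768, distance=models.Distance.COSINE, on_disk=True
+    ),
+)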
+```
+
+For other configurations like `hnsw_config.on_disk` or `memmap_threshold_kb`, see the Qdrant documentation for [Storage.](https://qdrant.tech/documentation/concepts/storage/)
+
+### SDKs
+
+Qdrant offers a range of SDKs. You can use the programming language you're most comfortable with, whether you're coding in [Python](https://github.com/qdrant/qdrant-client), [Go](https://github.com/qdrant/go-client), [Rust](https://github.com/qdrant/rust-client), or [JavaScript/TypeScript](https://github.com/qdrant/qdrant-js).
## The Core Functionalities of Vector Databases
-### Vector Database Indexing
+![vector-database-functions](/articles_data/what-is-a-vector-database/vector-database-3.jpeg)
+
+When you think of a traditional database, the operations are familiar: you **create**, **read**, **update**, and **delete** records. These are the fundamentals. And guess what? In many ways, vector databases work the same way, but the operations are translated for the complexity of vectors.
+
+### 1. Indexing: HNSW Index and Sending Data to Qdrant
+
+Indexing your vectors is like creating an entry in a traditional database. But for vector databases, this step is very important. Vectors need to be indexed in a way that makes them easy to search later on.
+
+**HNSW** (Hierarchical Navigable Small World) is a powerful indexing algorithm that most vector databases rely on to organize vectors for fast and efficient search.
+
+It builds a multi-layered graph, where each vector is a node and connections represent similarity. The higher layers connect broadly similar vectors, while lower layers link vectors that are closely related, making searches progressively more refined as they go deeper.
+
+
+
+When you run a search, HNSW starts at the top, quickly narrowing down the search by hopping between layers. It focuses only on relevant vectors as it goes deeper, refining the search with each step.
+
+### 1.1 Payload Indexing
+
+In Qdrant, indexing is modular. You can configure indexes for **both vectors and payloads independently**. The payload index is responsible for optimizing filtering based on metadata. Each payload index is built for a specific field and allows you to quickly filter vectors based on specific conditions.
+
+
+
+You need to build the payload index for **each field** you'd like to search. The magic here is in the combination: HNSW finds similar vectors, and the payload index makes sure only the ones that fit your criteria come through. Learn more about Qdrant's [Filtrable HNSW](https://qdrant.tech/articles/filtrable-hnsw/) and why it was built like this.
+
+> Combining [full-text search](https://qdrant.tech/documentation/concepts/indexing/#full-text-index) with vector-based search gives you even more versatility. You can simultaneously search for conceptually similar documents while ensuring specific keywords are present, all within the same query.
+
+### 2. Searching: Approximate Nearest Neighbors (ANN) Search
+
+Similarity search allows you to search by **meaning**. This way you can run searches such as finding songs that evoke the same mood, finding images that match your artistic vision, or even exploring emotional patterns in text.
+
+
+
+When a user queries the database, the query is also converted into a vector. Starting from the top layer of the HNSW graph, the algorithm quickly identifies the area likely to contain the vectors closest to the **query vector**.
+
+
+
+The search then moves down through the layers, progressively narrowing to more closely related and relevant vectors. Once the closest vectors are identified at the bottom layer, these points are translated back to actual data, representing your **top-scored documents**.
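
To make the goal of the search concrete, here is a minimal sketch of the exact, brute-force version of the same task: score every stored point against the query vector and keep the best ones. This is deliberately not HNSW. An ANN index exists to approximate this result without comparing the query against every point. The point IDs, vectors, and function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_search(query, points, limit=2):
    """Compare the query with every point and return the top `limit` (id, score) pairs."""
    scored = [(point["id"], cosine(query, point["vector"])) for point in points]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Hypothetical stored points
points = [
    {"id": 1, "vector": [1.0, 0.0]},
    {"id": 2, "vector": [0.8, 0.6]},
    {"id": 3, "vector": [0.0, 1.0]},
    {"id": 4, "vector": [-1.0, 0.0]},
]

query_vector = [1.0, 0.1]
results = exact_search(query_vector, points, limit=2)
print(results)
```

The cost of this approach grows linearly with the number of points, which is why large collections rely on an index instead.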
+
+Here's a high-level overview of this process:
+
+
+
+### 3. Updating Vectors: Real-Time and Bulk Adjustments
+
+Data isn't static, and neither are vectors. Keeping your vectors up to date is crucial for maintaining relevance in your searches.
+
+Vector updates don’t always need to happen instantly, but when they do, Qdrant handles real-time modifications efficiently with a simple API call:
+
+```python
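+from qdrant_client.models import PointStruct
+
+# `qdrant_client` is assumed to be an already-initialized QdrantClient instance
+qdrant_client.upsert(
+    collection_name='product_collection',
+    points=[PointStruct(id=product_id, vector=new_vector, payload=new_payload)]
+)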
+```
+
+For large-scale changes, like re-indexing vectors after a model update, batch updating allows you to update multiple vectors in one operation without impacting search performance:
+
+```python
+batch_of_updates = [
+    PointStruct(id=product_id_1, vector=updated_vector_1, payload=new_payload_1),
+    PointStruct(id=product_id_2, vector=updated_vector_2, payload=new_payload_2),
+    # Add more points...
+]
+
+qdrant_client.upsert(
+    collection_name='product_collection',
+    points=batch_of_updates
+)
+```
+
+### 4. Deleting Vectors: Managing Outdated and Duplicate Data
+
+Efficient vector management is key to keeping your searches accurate and your database lean. Deleting vectors that represent outdated or irrelevant data, such as expired products, old news articles, or archived profiles, helps maintain both performance and relevance.
+
+In Qdrant, removing vectors is straightforward, requiring only the vector IDs to be specified:
+
+```python
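+from qdrant_client.models import PointIdsList
+
+qdrant_client.delete(
+    collection_name='data_collection',
+    # PointIdsList takes the IDs through the `points` keyword argument
+    points_selector=PointIdsList(points=[vector_id_1, vector_id_2])
+)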
+```
+You can use deletion to remove outdated data, clean up duplicates, and manage the lifecycle of vectors by automatically deleting them after a set period to keep your dataset relevant and focused.
+
+## Dense vs. Sparse Vectors
+
+![vector-database-dense-sparse](/articles_data/what-is-a-vector-database/vector-database-4.jpeg)
+
+Now that you understand what vectors are and how they are created, let's learn more about the two possible types of vectors you can use: **dense** or **sparse**. The main differences between the two are:
+
+### 1. Dense Vectors
+
+Dense vectors are, quite literally, dense with information. Every element in the vector contributes to the **semantic meaning**, **relationships** and **nuances** of the data. A dense vector representation of this sentence might look like this:
+
+
+
+Each number holds weight. Together, they convey the overall meaning of the sentence, and are better for identifying contextually similar items, even if the words don’t match exactly.
+
+### 2. Sparse Vectors
+
+Sparse vectors operate differently. They focus only on the essentials. In most sparse vectors, a large number of elements are zeros. When a feature or token is present, it’s marked—otherwise, zero.
+
+In the image, you can see a sentence, *“I love Vector Similarity,”* broken down into tokens like *“i,” “love,” “vector”* through tokenization. Each token is assigned a unique `ID` from a large vocabulary. For example, *“i”* becomes `193`, and *“vector”* becomes `15012`.
+
+
+
+Sparse vectors are used for **exact matching** and specific token-based identification. The values on the right, such as `193: 0.04` and `9182: 0.12`, are the scores or weights for each token, showing how relevant or important each token is in the context. The final result is a sparse vector:
+
+```yaml
+{
+ 193: 0.04,
+ 9182: 0.12,
+ 15012: 0.73,
+ 6731: 0.69,
+ 454: 0.21
+}
+```
-Have you ever tried to find a specific face in a massive crowd photo? Well, vector databases face a similar challenge when dealing with tons of high-dimensional vectors.
+Everything else in the vector space is assumed to be zero.
-Now, imagine dividing the crowd into smaller groups based on hair color, then eye color, then clothing style. Each layer gets you closer to who you’re looking for. Vector databases use similar **multi-layered** structures called indexes to organize vectors based on their "likeness."
+Sparse vectors are ideal for tasks like **keyword search** or **metadata filtering**, where you need to check for the presence of specific tokens without needing to capture the full meaning or context. They are suited for exact matches within the **data itself**, rather than relying on external metadata, which is handled by payload filtering.
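
Because every missing entry is zero, a sparse vector can be held as a mapping from token ID to weight, and two sparse vectors only need to be compared on the token IDs they share. The sketch below illustrates that idea with the sparse vector from above. The query vectors and their weights are hypothetical, and this is not a description of Qdrant's internal scoring.

```python
def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {token_id: weight} dicts."""
    # Iterate over the smaller dict; tokens missing from either side contribute zero
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[token_id] for token_id, weight in a.items() if token_id in b)


# The sparse vector from the example above
document = {193: 0.04, 9182: 0.12, 15012: 0.73, 6731: 0.69, 454: 0.21}

# Hypothetical queries: one shares two token IDs with the document, one shares none
matching_query = {15012: 1.0, 6731: 1.0}
unrelated_query = {42: 1.0}

print(sparse_dot(matching_query, document))
print(sparse_dot(unrelated_query, document))
```

A query that shares no tokens with the document scores zero, which is what makes sparse vectors a good fit for keyword-style matching.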
-This way, finding similar images becomes a quick hop across related groups, instead of scanning every picture one by one.
+## Benefits of Hybrid Search
+![vector-database-get-started](/articles_data/what-is-a-vector-database/vector-database-5.jpeg)
-![](/articles_data/what-is-a-vector-database/Indexing.jpg)
+Sometimes context alone isn’t enough. Sometimes you need precision, too. Dense vectors are fantastic when you need to retrieve results based on the context or meaning behind the data. Sparse vectors are useful when you also need **keyword or specific attribute matching**.
+> With hybrid search, you don’t have to choose one over the other. You can use both to get searches that are more **relevant** and **filtered**.
-Different indexing methods exist, each with its strengths. [HNSW](/articles/filtrable-hnsw/) balances speed and accuracy like a well-connected network of shortcuts in the crowd. Others, like IVF or Product Quantization, focus on specific tasks or memory efficiency.
+To achieve this balance, Qdrant uses **normalization** and **fusion** techniques to blend results from multiple search methods, such as dense and sparse vector search. One common approach is **Reciprocal Rank Fusion (RRF)**, where results from different methods are merged, giving higher importance to items ranked highly by both methods. This ensures that the best candidates, whether identified through dense or sparse vectors, appear at the top of the results.
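
As a rough illustration of Reciprocal Rank Fusion, each result list contributes `1 / (k + rank)` to a document's fused score, so documents ranked highly in several lists rise to the top. The sketch below assumes `k = 60`, a commonly used constant that is not necessarily the value Qdrant uses, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, point_id in enumerate(ranking, start=1):
            # Each appearance adds 1 / (k + rank) to the fused score
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda point_id: scores[point_id], reverse=True)


# Hypothetical result lists from a dense and a sparse search
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

Only ranks are used here, not raw scores, which is why RRF can combine methods whose scores are on different scales.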
-### Binary Quantization
+
-Quantization is a technique used for reducing the total size of the database. It works by compressing vectors into a more compact representation at the cost of accuracy.
+### How to Use Hybrid Search in Qdrant
+Qdrant makes it easy to implement hybrid search through its Query API. Here’s how you can make it happen in your own project:
-[Binary Quantization](/articles/binary-quantization/) is a fast indexing and data compression method used by Qdrant. It supports vector comparisons, which can dramatically speed up query processing times (up to 40x faster!).
+
-Think of each data point as a ruler. Binary quantization splits this ruler in half at a certain point, marking everything above as "1" and everything below as "0". This [binarization](https://deepai.org/machine-learning-glossary-and-terms/binarization) process results in a string of bits, representing the original vector.
+**Example Hybrid Query:** Let’s say a researcher is looking for papers on NLP, but the paper must specifically mention "transformers" in the content:
+```python
+search_query = {
+    "vector": query_vector,  # Dense vector for semantic search
+    "filter": {  # Payload filter for specific terms
+        "must": [
+            # Full-text match on the `text` payload field
+            {"key": "text", "match": {"text": "transformers"}}
+        ]
+    }
+}
+```
+In this query, the dense vector search finds papers related to the broad topic of NLP, and the filter ensures that the papers specifically mention “transformers”.
+This is just a simple example and there's so much more you can do with it. See our complete [article on Hybrid Search](https://qdrant.tech/articles/hybrid-search/) to learn what's happening behind the scenes and explore the possibilities when building a hybrid search system.
-![](/articles_data/what-is-a-vector-database/Binary-Quant.png)
+## Quantization: Get 40x Faster Results
+![vector-database-architecture](/articles_data/what-is-a-vector-database/vector-database-2.jpeg)
-This "quantized" code is much smaller and easier to compare. Especially for OpenAI embeddings, this type of quantization has proven to achieve a massive performance improvement at a lower cost of accuracy.
+As your vector dataset grows larger, so do the computational demands of searching through it.
+Quantized vectors are much smaller and easier to compare. With methods like [**Binary Quantization**](https://qdrant.tech/articles/binary-quantization/), you can see **search speeds improve by up to 40x while memory usage decreases by 32x**. These improvements can be decisive when dealing with large datasets or needing low-latency results.
-### Similarity Search
+It works by converting high-dimensional vectors, which typically use `4 bytes` per dimension, into binary representations, using just `1 bit` per dimension. Values above zero become "1", and everything else becomes "0".
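
The conversion described above can be sketched as follows. Comparing the resulting bit lists with Hamming distance is one common way to compare binary codes and is an assumption for this example, not a statement about Qdrant's internals. The sample vectors are hypothetical.

```python
def binarize(vector):
    """Map each dimension to 1 if its value is above zero, otherwise 0."""
    return [1 if value > 0 else 0 for value in vector]


def hamming_distance(a, b):
    """Count the positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 6-dimensional vectors
original = [0.12, -0.53, 0.0, 0.91, -0.07, 0.33]
other = [0.5, 0.5, -1.0, 0.2, 0.3, 0.1]

bits = binarize(original)
other_bits = binarize(other)

# 4 bytes = 32 bits per dimension before, 1 bit per dimension after
compression_ratio = (4 * 8) / 1

print(bits, other_bits, hamming_distance(bits, other_bits), compression_ratio)
```

The 32x memory reduction follows directly from the arithmetic: 32 bits per dimension become 1 bit per dimension.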
-[Similarity search](/documentation/concepts/search/) allows you to search not by keywords but by meaning. This way you can do searches such as similar songs that evoke the same mood, finding images that match your artistic vision, or even exploring emotional patterns in text.
+
-The way it works is, when the user queries the database, this query is also converted into a vector (the query vector). The [vector search](/documentation/overview/vector-search/) starts at the top layer of the HNSW index, where the algorithm quickly identifies the area of the graph likely to contain vectors closest to the query vector. The algorithm compares your query vector to all the others, using metrics like "distance" or "similarity" to gauge how close they are.
+Quantization reduces data precision, and yes, this does lead to some loss of accuracy. However, with binary quantization, **OpenAI embeddings** achieve this performance improvement at a cost of only 5% in accuracy. If you apply techniques like **oversampling** and **rescoring**, this loss can be brought down even further.
-The search then moves down progressively narrowing down to more closely related vectors. The goal is to narrow down the dataset to the most relevant items. The image below illustrates this.
+However, binary quantization isn’t the only available option. Techniques like [**Scalar Quantization**](https://qdrant.tech/documentation/guides/quantization/#scalar-quantization) and [**Product Quantization**](https://qdrant.tech/documentation/guides/quantization/#product-quantization) are also popular alternatives when optimizing vector compression.
+You can set up your chosen quantization method using the `quantization_config` parameter when creating a new collection:
-![](/articles_data/what-is-a-vector-database/Similarity-Search-and-Retrieval.jpg)
+```python
+client.create_collection(
+    collection_name="{collection_name}",
+    vectors_config=models.VectorParams(
+        size=1536,
+        distance=models.Distance.COSINE
+    ),
+    # Choose your preferred quantization method
+    quantization_config=models.BinaryQuantization(
+        binary=models.BinaryQuantizationConfig(
+            always_ram=True,  # Store the quantized vectors in RAM for faster access
+        ),
+    ),
+)
+```
+You can store the original vectors on disk by setting `on_disk=True` within the `vectors_config` to save RAM space, while keeping the quantized vectors in RAM for faster access.
-Once the closest vectors are identified at the bottom layer, these points translate back to actual data, like images or music, representing your search results.
+We recommend checking out our [Vector Quantization guide](https://qdrant.tech/articles/what-is-vector-quantization/) for a full breakdown of methods and tips on **optimizing performance** for your specific use case.
+## Distributed Deployment
-### Scalability
+When thinking about scaling, the key factors to consider are **fault tolerance**, **load balancing**, and **availability**. One node, no matter how powerful, can only take you so far. Eventually, you'll need to spread the workload across multiple machines to ensure the system remains fast and stable.
-[Vector databases](https://qdrant.tech/qdrant-vector-database/) often deal with datasets that comprise billions of high-dimensional vectors. This data isn't just large in volume but also complex in nature, requiring more computing power and memory to process. Scalable systems can handle this increased complexity without performance degradation. This is achieved through a combination of a **distributed architecture**, **dynamic resource allocation**, **data partitioning**, **load balancing**, and **optimization techniques**.
+### Sharding: Distributing Data Across Nodes
-Systems like Qdrant exemplify scalability in vector databases. It [leverages Rust's efficiency](https://qdrant.tech/articles/why-rust/) in **memory management** and **performance**, which allows the handling of large-scale data with optimized resource usage.
+In a distributed Qdrant cluster, data is split into smaller units called **shards**, which are distributed across different nodes. This helps balance the load and ensures that queries can be processed in parallel.
+Each collection, a group of related data points, can be split into non-overlapping subsets, which are then managed by different nodes.
-### Efficient Query Processing
+
-The key to efficient query processing in these databases is linked to their **indexing methods**, which enable quick navigation through complex data structures. By mapping and accessing the high-dimensional vector space, HNSW and similar indexing techniques significantly reduce the time needed to locate and retrieve relevant data.
+**Raft Consensus** ensures that all nodes stay in sync and share a consistent view of the data. Each node knows where every shard is located, so if one node fails, the others know where the missing data lives and can take over.
+By default, the number of shards in your Qdrant system matches the number of nodes in your cluster. But if you need more control, you can choose the `shard_number` manually when creating a collection.
+```python
+client.create_collection(
+ collection_name="{collection_name}",
+ vectors_config=models.VectorParams(size=300, distance=models.Distance.COSINE),
+ shard_number=4, # Custom number of shards
+)
+```
-![](/articles_data/what-is-a-vector-database/search-query.jpg)
+There are two main types of sharding:
+1. **Automatic Sharding:** Points (vectors) are automatically distributed across shards using consistent hashing. Each shard contains non-overlapping subsets of the data.
+2. **User-defined Sharding:** Specify how points are distributed, enabling more control over your data organization, especially for use cases like **multitenancy**, where each tenant (a user, client, or organization) has their own isolated data.
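To make the idea of consistent hashing concrete, here is a minimal sketch of a hash ring that maps point IDs to shards. It is an illustration only, not Qdrant's internal implementation: the `HashRing` class, the `virtual_nodes` parameter, and the choice of SHA-256 are all assumptions made for this example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash of a string key."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical hash ring: each shard owns several positions on the ring."""

    def __init__(self, shard_ids, virtual_nodes=64):
        self._ring = sorted(
            (_hash(f"shard-{shard_id}-vnode-{v}"), shard_id)
            for shard_id in shard_ids
            for v in range(virtual_nodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    def shard_for(self, point_id) -> int:
        """Walk clockwise from the point's hash to the next shard position."""
        idx = bisect.bisect_right(self._positions, _hash(str(point_id)))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(shard_ids=range(4))
assignment = {pid: ring.shard_for(pid) for pid in range(1000)}

# Adding a shard only moves points onto the new shard; nothing else is reshuffled.
bigger_ring = HashRing(shard_ids=range(5))
moved = [pid for pid in assignment if bigger_ring.shard_for(pid) != assignment[pid]]
```

Each point lands in exactly one shard, so the subsets never overlap, and growing the ring relocates only the points that the new shard takes over.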
-Other techniques like **handling computational load** and **parallel processing** are used for performance, especially when managing multiple simultaneous queries. Complementing them, **strategic caching** is also employed to store frequently accessed data, facilitating a quicker retrieval for subsequent queries.
+Each shard is divided into **segments**. Segments are smaller storage units within a shard, each storing a subset of vectors and their associated payloads (metadata). When a query is executed, it targets only the relevant segments, processing them in parallel.
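The parallel search over segments can be sketched as follows. This is a simplified illustration rather than Qdrant's actual code: it assumes every segment is relevant to the query, uses brute-force dot-product scoring instead of an index, and the names `search_segment` and `search_shard` are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def dot(a, b):
    """Dot product used as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


def search_segment(segment, query, limit):
    """Return the top `limit` (score, point_id) pairs from one segment."""
    scored = ((dot(vector, query), point_id) for point_id, vector in segment.items())
    return heapq.nlargest(limit, scored)


def search_shard(segments, query, limit):
    """Search all segments in parallel, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda seg: search_segment(seg, query, limit), segments)
        return heapq.nlargest(limit, (hit for part in partials for hit in part))


# Two toy segments, each mapping point IDs to vectors.
segments = [
    {1: [1.0, 0.0], 2: [0.0, 1.0]},
    {3: [0.9, 0.1], 4: [0.5, 0.5]},
]
top_hits = search_shard(segments, query=[1.0, 0.0], limit=2)
```

Each segment only needs to return its own top candidates, because the global top results are always contained in the union of the per-segment top results.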
+
-### Using Metadata and Filters
+### Replication: High Availability and Data Integrity
-Filters use metadata to refine search queries within the database. For example, in a database containing text documents, a user might want to search for documents not only based on textual similarity but also filter the results by publication date or author.
+You don’t want a single failure to take down your system, right? Replication keeps multiple copies of the same data across different nodes to ensure **high availability**.
-When a query is made, the system can use **both** the vector data and the metadata to process the query. In other words, the database doesn’t just look for the closest vectors. It also considers the additional criteria set by the metadata filters, creating a more customizable search experience.
+In Qdrant, **Replica Sets** manage these copies of shards across different nodes. If one replica becomes unavailable, others are there to take over and keep the system running. Whether the data is local or remote is mainly influenced by how you've configured the cluster.
+
-![](/articles_data/what-is-a-vector-database/metadata.jpg)
+When a query is made, if the relevant data is stored locally, the local shard handles the operation. If the data is on a remote shard, it’s retrieved via gRPC.
+You can control how many copies you want with the `replication_factor`. For example, creating a collection with 4 shards and a replication factor of 2 will result in 8 physical shards distributed across the cluster:
+```python
+client.create_collection(
+ collection_name="{collection_name}",
+ vectors_config=models.VectorParams(size=300, distance=models.Distance.COSINE),
+ shard_number=4,
+ replication_factor=2,
+)
+```
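The arithmetic behind that number can be checked with a small sketch. The round-robin placement below is an assumption made purely for illustration (Qdrant decides the real placement itself), and `place_replicas` is a hypothetical helper, not part of the client library.

```python
def place_replicas(shard_number, replication_factor, node_count):
    """Spread every replica of every shard across nodes, round-robin."""
    if replication_factor > node_count:
        raise ValueError("replication_factor cannot exceed the number of nodes")
    placement = {node: [] for node in range(node_count)}
    for shard in range(shard_number):
        for replica in range(replication_factor):
            # Consecutive replicas of a shard go to consecutive nodes,
            # so no node holds two copies of the same shard.
            node = (shard + replica) % node_count
            placement[node].append(shard)
    return placement


placement = place_replicas(shard_number=4, replication_factor=2, node_count=4)
total_physical_shards = sum(len(shards) for shards in placement.values())
```

With 4 shards and a replication factor of 2, `total_physical_shards` comes out to 4 × 2 = 8, and in this layout any single node can fail while every shard still has a surviving copy.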
-### Data Security and Access Control
+We recommend using sharding and replication together so that your data is both split across nodes and replicated for availability.
-Vector databases often store sensitive information. This could include personal data in customer databases, confidential images, or proprietary text documents. Ensuring data security means protecting this information from unauthorized access, breaches, and other forms of cyber threats.
+For more details on features like **user-defined sharding, node failure recovery**, and **consistency guarantees**, see our guide on [Distributed Deployment](https://qdrant.tech/documentation/guides/distributed_deployment/).
-At Qdrant, this includes mechanisms such as:
+## Multitenancy: Data Isolation for Multi-Tenant Architectures
- - User authentication
- - Encryption for data at rest and in transit
- - Keeping audit trails
- - Advanced database monitoring and anomaly detection
+![vector-database-get-started](/articles_data/what-is-a-vector-database/vector-database-6.png)
+Sharding efficiently distributes data across nodes, while replication guarantees redundancy and fault tolerance. But what happens when you’ve got multiple clients or user groups, and you need to keep their data isolated within the same infrastructure?
-## What is the Architecture of a Vector Database?
+**Multitenancy** allows you to keep data for different tenants (users, clients, or organizations) isolated within a single cluster. Instead of creating separate collections for `Tenant 1` and `Tenant 2`, you store their data in the same collection but tag each vector with a `group_id` to identify which tenant it belongs to.
-A vector database is made of multiple different entities and relations. Here's a high-level overview of Qdrant's terminologies and how they fit into the larger picture:
+
+In the backend, Qdrant can store `Tenant 1`’s data in Shard 1 located in Canada, while `Tenant 2`’s data is stored in Shard 2 located in Germany (perhaps for compliance reasons like GDPR). The data will be physically separated but still within the same infrastructure.
-![](/articles_data/what-is-a-vector-database/Architecture-of-a-Vector-Database.jpg)
+To implement this, you tag each vector with a tenant-specific `group_id` during the upsert operation:
+```python
+client.upsert(
+ collection_name="tenant_data",
+ points=[models.PointStruct(
+ id=2,
+ payload={"group_id": "tenant_1"},
+ vector=[0.1, 0.9, 0.1]
+ )],
+ shard_key_selector="canada"
+)
+```
-**Collections**: [Collections](/documentation/concepts/collections/) are a named set of data points, where each point is a vector with an associated payload. All vectors within a collection must have the same dimensionality and be comparable using a single metric.
+Each tenant’s data remains isolated while still benefiting from the shared infrastructure. This setup optimizes for data privacy, compliance with local regulations, and scalability, without the need to create excessive collections or maintain separate clusters for each tenant.
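On the read side, isolation means that a tenant's query only ever considers points carrying that tenant's `group_id`. The sketch below shows the principle in plain Python with brute-force scoring. It is not the Qdrant client API (in Qdrant you would express this as a payload filter on `group_id`), and `search_for_tenant` is a hypothetical helper.

```python
def dot(a, b):
    """Dot product used as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


def search_for_tenant(points, tenant_id, query, limit=3):
    """Rank only the points that belong to `tenant_id`."""
    candidates = [p for p in points if p["payload"].get("group_id") == tenant_id]
    ranked = sorted(candidates, key=lambda p: dot(p["vector"], query), reverse=True)
    return ranked[:limit]


# Toy collection shared by two tenants.
points = [
    {"id": 1, "payload": {"group_id": "tenant_1"}, "vector": [0.9, 0.1, 0.1]},
    {"id": 2, "payload": {"group_id": "tenant_1"}, "vector": [0.1, 0.9, 0.1]},
    {"id": 3, "payload": {"group_id": "tenant_2"}, "vector": [0.1, 0.9, 0.2]},
]
hits = search_for_tenant(points, tenant_id="tenant_1", query=[0.0, 1.0, 0.0])
```

Point 3 is just as close to the query as point 2, but it never appears in the results because it belongs to a different tenant.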
-**Distance Metrics**: These metrics are used to measure the similarity between vectors. The choice of distance metric is made when creating a collection. It depends on the nature of the vectors and how they were generated, considering the neural network used for the encoding.
+If you want to learn more about working with a multitenant setup in Qdrant, you can check out our dedicated guide on [Multitenancy and Custom Sharding](https://qdrant.tech/articles/multitenancy/).
-**Points**: Each [point](/documentation/concepts/points/) consists of a **vector** and can also include an optional **identifier** (ID) and **[payload](/documentation/concepts/payload/)**. The vector represents the high-dimensional data and the payload carries metadata information in a JSON format, giving the data point more context or attributes.
+## Data Security and Access Control
-**Storage Options**: There are two primary storage options. The in-memory storage option keeps all vectors in RAM, which allows for the highest speed in data access since disk access is only required for persistence.
+A common security risk in vector databases is the possibility of **embedding inversion attacks**, where attackers could reconstruct the original data from embeddings. There are several layers of protection you can use to secure your instance, and it is important to put them in place before taking your vector database into production.
-Alternatively, the Memmap storage option creates a virtual address space linked with the file on disk, giving a balance between memory usage and access speed.
+For quick security in simpler use cases, you can use **API key authentication**. To enable it, set the API key in the configuration file or as an environment variable.
-**Clients**: Qdrant supports various programming languages for client interaction, such as Python, Go, Rust, and Typescript. This way developers can connect to and interact with Qdrant using the programming language they prefer.
+```yaml
+service:
+ api_key: your_secret_api_key_here
+ enable_tls: true # Make sure to enable TLS to protect the API key from being exposed
+```
+Once this is set up, remember to include the API key in all your requests:
-## Vector Database Use Cases
+```python
+from qdrant_client import QdrantClient
+
+client = QdrantClient(
+ url="https://localhost:6333",
+ api_key="your_secret_api_key_here"
+)
+```
-If we had to summarize the [use cases for vector databases](https://qdrant.tech/use-cases/) into a single word, it would be "match". They are great at finding non-obvious ways to correspond or “match” data with a given query. Whether it's through similarity in images, text, user preferences, or patterns in data.
+In more advanced setups, Qdrant uses **JWT (JSON Web Tokens)** to enforce **Role-Based Access Control (RBAC)**.
-Here are some examples of how to take advantage of using vector databases:
+RBAC defines roles and assigns permissions, while JWT securely encodes these roles into tokens. Each request is validated against the user's JWT, ensuring they can only access or modify data based on their assigned permissions.
-[Personalized recommendation systems](https://qdrant.tech/recommendations/) to analyze and interpret complex user data, such as preferences, behaviors, and interactions. For example, on Spotify, if a user frequently listens to the same song or skips it, the recommendation engine takes note of this to personalize future suggestions.
+You can easily set up your access tokens and secure access to sensitive data through the **Qdrant Web UI:**
-[Semantic search](https://qdrant.tech/documentation/tutorials/search-beginners/) allows for systems to be able to capture the deeper semantic meaning of words and text. In modern search engines, if someone searches for "tips for planting in spring," it tries to understand the intent and contextual meaning behind the query. It doesn’t try just matching the words themselves.
+
-Here’s an example of a [vector search engine for Startups](https://demo.qdrant.tech/) made with Qdrant:
+By default, Qdrant instances are **unsecured**, so it's important to configure security measures before moving to production. To learn more about how to configure security for your Qdrant instance and other advanced options, please check out the [official Qdrant documentation on security](https://qdrant.tech/documentation/guides/security/).
+## Time to Experiment
-![](/articles_data/what-is-a-vector-database/semantic-search.png)
+As we've seen in this article, a vector database is definitely not **just** a database as we traditionally know it. It opens up a world of possibilities, from advanced similarity search to hybrid search that allows content retrieval with both context and precision.
-There are many other use cases like for **fraud detection and anomaly analysis** used in sectors like finance and cybersecurity, to detect anomalies and potential fraud. And **Content-Based Image Retrieval (CBIR)** for images by comparing vector representations rather than metadata or tags.
+But there’s no better way to learn than by doing. Try building a [semantic search engine](https://qdrant.tech/documentation/tutorials/search-beginners/) or experiment with deploying a [hybrid search service](https://qdrant.tech/documentation/tutorials/hybrid-search-fastembed/) from scratch. You'll realize there are endless ways you can take advantage of vectors.
-Those are just a few examples. The ability of vector databases to “match” data with queries makes them essential for multiple types of applications. Here are some more [use cases examples](/use-cases/) you can take a look at.
+| **Use Case** | **How It Works** | **Examples** |
+|-----------------------------------|------------------------------------------------------------------------------------------------------|-----------------------------------------------------------|
+| **Similarity Search** | Finds similar data points using vector distances | Find similar product images, retrieve documents based on themes, discover related topics |
+| **Anomaly Detection** | Identifies outliers based on deviations in vector space | Detect unusual user behavior in banking, spot irregular patterns |
+| **Recommendation Systems** | Uses vector embeddings to learn and model user preferences | Personalized movie or music recommendations, e-commerce product suggestions |
+| **RAG (Retrieval-Augmented Generation)** | Combines vector search with large language models (LLMs) for contextually relevant answers | Customer support, auto-generate summaries of documents, research reports |
+| **Multimodal Search**             | Searches across different types of data like text, images, and audio in a single query               | Search for products with a description and image, retrieve images based on audio or text |
+| **Voice & Audio Recognition** | Uses vector representations to recognize and retrieve audio content | Speech-to-text transcription, voice-controlled smart devices, identify and categorize sounds |
+| **Knowledge Graph Augmentation** | Links unstructured data to concepts in knowledge graphs using vectors | Link research papers to related studies, connect customer reviews to product features, organize patents by innovation trends|
-### Get Started With Qdrant’s Vector Database Today
+You can also watch our video tutorial and get started with Qdrant to generate semantic search results and recommendations from a sample dataset.
-Now that you're familiar with the core concepts around vector databases, it’s time to get your hands dirty. [Start by building your own semantic search engine](/documentation/tutorials/search-beginners/) for science fiction books in just about 5 minutes with the help of Qdrant. You can also watch our [video tutorial](https://www.youtube.com/watch?v=AASiqmtKo54).
+
-Feeling ready to dive into a more complex project? Take the next step and get started building an actual [Neural Search Service with a complete API and a dataset](/documentation/tutorials/neural-search/).
+Phew! I hope you found some of the concepts here useful. If you have any questions, feel free to post them in our [Discord Community](https://discord.com/invite/qdrant), where our team will be more than happy to help you out!
-Let’s get into action!
+> Remember, don't get lost in vector space! 🚀
\ No newline at end of file
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/Architecture-of-a-Vector-Database.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/Architecture-of-a-Vector-Database.jpg
deleted file mode 100644
index 1bfdf52a5..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/Architecture-of-a-Vector-Database.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/Binary-Quant.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/Binary-Quant.png
deleted file mode 100644
index 903272fe5..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/Binary-Quant.png and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/How-Do-Embeddings-Work_.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/How-Do-Embeddings-Work_.jpg
deleted file mode 100644
index a953f9471..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/How-Do-Embeddings-Work_.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/Indexing.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/Indexing.jpg
deleted file mode 100644
index ff30c9a5b..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/Indexing.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/Similarity-Search-and-Retrieval.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/Similarity-Search-and-Retrieval.jpg
deleted file mode 100644
index 437a05cbe..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/Similarity-Search-and-Retrieval.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/Vector-Data.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/Vector-Data.jpg
deleted file mode 100644
index a7a6c68c6..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/Vector-Data.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/Why-Use-Vector-Database.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/Why-Use-Vector-Database.jpg
deleted file mode 100644
index a24f57829..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/Why-Use-Vector-Database.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/ann-search.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/ann-search.png
new file mode 100644
index 000000000..c4e8a285a
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/ann-search.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/architecture-vector-db.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/architecture-vector-db.png
new file mode 100644
index 000000000..9771f6afe
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/architecture-vector-db.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/binary-quantization.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/binary-quantization.png
new file mode 100644
index 000000000..be3c4b8f0
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/binary-quantization.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/cosine-similarity.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/cosine-similarity.png
new file mode 100644
index 000000000..8c40bf1da
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/cosine-similarity.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/dense-1.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/dense-1.png
new file mode 100644
index 000000000..9b42ac856
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/dense-1.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/embedding-model.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/embedding-model.png
new file mode 100644
index 000000000..9ae48fb11
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/embedding-model.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/filtering-example.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/filtering-example.png
new file mode 100644
index 000000000..d488a6309
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/filtering-example.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/filtering.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/filtering.png
new file mode 100644
index 000000000..3c10fe716
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/filtering.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/hnsw-search.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/hnsw-search.png
new file mode 100644
index 000000000..b4f5c2d1b
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/hnsw-search.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/hnsw.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/hnsw.png
new file mode 100644
index 000000000..53b00b951
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/hnsw.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/hybrid-query-1.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/hybrid-query-1.png
new file mode 100644
index 000000000..1212b97a6
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/hybrid-query-1.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/hybrid-search-2.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/hybrid-search-2.png
new file mode 100644
index 000000000..cedbf10db
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/hybrid-search-2.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/i-love-similarity.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/i-love-similarity.png
deleted file mode 100644
index 4c9dd36dc..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/i-love-similarity.png and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/icon.svg b/qdrant-landing/static/articles_data/what-is-a-vector-database/icon.svg
deleted file mode 100644
index 8b3694fd7..000000000
--- a/qdrant-landing/static/articles_data/what-is-a-vector-database/icon.svg
+++ /dev/null
@@ -1,6 +0,0 @@
-
-
\ No newline at end of file
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/jwt-web-ui.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/jwt-web-ui.png
new file mode 100644
index 000000000..4120b0128
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/jwt-web-ui.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/metadata.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/metadata.jpg
deleted file mode 100644
index b125ac093..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/metadata.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/multitenancy-1.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/multitenancy-1.png
new file mode 100644
index 000000000..b87ea23e0
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/multitenancy-1.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/oltp-and-olap.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/oltp-and-olap.png
new file mode 100644
index 000000000..cc8c6f880
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/oltp-and-olap.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/point.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/point.png
new file mode 100644
index 000000000..35f2bb9ba
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/point.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/preview/social-preview.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/preview/social-preview.jpg
deleted file mode 100644
index 72cc220a9..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/preview/social-preview.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/preview/social_preview.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/preview/social_preview.png
new file mode 100644
index 000000000..0b5bafe36
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/preview/social_preview.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/replication.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/replication.png
new file mode 100644
index 000000000..519e7c1a3
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/replication.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/search-query.jpg b/qdrant-landing/static/articles_data/what-is-a-vector-database/search-query.jpg
deleted file mode 100644
index 29b9bc4cc..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/search-query.jpg and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/segments.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/segments.png
new file mode 100644
index 000000000..7ca6673ed
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/segments.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/semantic-search.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/semantic-search.png
deleted file mode 100644
index 6bac2164a..000000000
Binary files a/qdrant-landing/static/articles_data/what-is-a-vector-database/semantic-search.png and /dev/null differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/sharding-raft.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/sharding-raft.png
new file mode 100644
index 000000000..b403a1273
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/sharding-raft.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/similarity.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/similarity.png
new file mode 100644
index 000000000..45db0aed5
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/similarity.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/simple-arquitecture.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/simple-arquitecture.png
new file mode 100644
index 000000000..7960477c5
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/simple-arquitecture.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/sparse.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/sparse.png
new file mode 100644
index 000000000..337947388
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/sparse.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/two-similar-vectors.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/two-similar-vectors.png
new file mode 100644
index 000000000..695d924a8
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/two-similar-vectors.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-1.jpeg b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-1.jpeg
new file mode 100644
index 000000000..5a6573ec7
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-1.jpeg differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-2.jpeg b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-2.jpeg
new file mode 100644
index 000000000..86c4568d9
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-2.jpeg differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-3.jpeg b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-3.jpeg
new file mode 100644
index 000000000..d92f53ac0
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-3.jpeg differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-4.jpeg b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-4.jpeg
new file mode 100644
index 000000000..b430e4130
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-4.jpeg differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-5.jpeg b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-5.jpeg
new file mode 100644
index 000000000..0fb02aec6
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-5.jpeg differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-6.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-6.png
new file mode 100644
index 000000000..03f7d2248
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-6.png differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-7.jpeg b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-7.jpeg
new file mode 100644
index 000000000..ac05b0579
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-database-7.jpeg differ
diff --git a/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-db-structure.png b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-db-structure.png
new file mode 100644
index 000000000..856776afa
Binary files /dev/null and b/qdrant-landing/static/articles_data/what-is-a-vector-database/vector-db-structure.png differ