Signed-off-by: Anush008 <[email protected]>
Showing 18 changed files with 197 additions and 39 deletions.
47 changes: 24 additions & 23 deletions
qdrant-landing/content/documentation/embeddings/_index.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,36 +1,37 @@ | ||
|
||
--- | ||
title: Embeddings | ||
weight: 19 | ||
partition: build | ||
--- | ||
|
||
# Supported Embedding Providers & Models | ||
|
||
Qdrant supports all available text and multimodal dense vector embedding models as well as vector embedding services without any limitations. | ||
Qdrant supports all available text and multimodal dense vector embedding models as well as vector embedding services without any limitations. | ||
|
||
## Some of the Embeddings you can use with Qdrant: | ||
## Some of the Embeddings you can use with Qdrant | ||
|
||
SentenceTransformers, BERT, SBERT, Clip, OpenClip, Open AI, Vertex AI, Azure AI, AWS Bedrock, Jina AI, Upstage AI, Mistral AI, Cohere AI, Voyage AI, Aleph Alpha, Baidu Qianfan, BGE, Instruct, Watsonx Embeddings, Snowflake Embeddings, NVIDIA NeMo, Nomic, OCI Embeddings, Ollama Embeddings, MixedBread, Together AI, Clarifai, Databricks Embeddings, GPT4All Embeddings, John Snow Labs Embeddings. | ||
|
||
Additionally, [any open-source embeddings from HuggingFace](https://huggingface.co/spaces/mteb/leaderboard) can be used with Qdrant. | ||
Additionally, [any open-source embeddings from HuggingFace](https://huggingface.co/spaces/mteb/leaderboard) can be used with Qdrant. | ||
|
||
## Code samples: | ||
## Code samples | ||
|
||
| Embeddings Providers | Description | | ||
| ----------------------------- | ----------- | | ||
| [Aleph Alpha](/documentation/embeddings/aleph-alpha/) | Multilingual embeddings focused on European languages. | | ||
| [Bedrock](/documentation/embeddings/bedrock/) | AWS managed service for foundation models and embeddings. | | ||
| [Cohere](/documentation/embeddings/cohere/) | Language model embeddings for NLP tasks. | | ||
| [Gemini](/documentation/embeddings/gemini/) | Google’s multimodal embeddings for text and vision. | ||
| [Jina AI](/documentation/embeddings/jina-embeddings/) | Customizable embeddings for neural search. | | ||
| [Mistral](/documentation/embeddings/mistral/) | Open-source, efficient language model embeddings. | | ||
| [MixedBread](/documentation/embeddings/mixedbread/) | Lightweight embeddings for constrained environments. | | ||
| [Mixpeek](/documentation/embeddings/mixpeek/) | Managed SDK for video chunking, embedding, and post-processing. | | ||
| [Nomic](/documentation/embeddings/nomic/) | Embeddings for data visualization. | | ||
| [Nvidia](/documentation/embeddings/nvidia/) | GPU-optimized embeddings from Nvidia. | | ||
| [Ollama](/documentation/embeddings/ollama/) | Embeddings for conversational AI. | | ||
| [OpenAI](/documentation/embeddings/openai/) | Industry-leading embeddings for NLP. | | ||
| [Prem AI](/documentation/embeddings/premai/) | Precise language embeddings. | | ||
| [Snowflake](/documentation/embeddings/snowflake/) | Scalable embeddings for big data. | | ||
| [Upstage](/documentation/embeddings/upstage/) | Embeddings for speech and language tasks. | | ||
| [Voyage AI](/documentation/embeddings/voyage/) | Navigation and spatial understanding embeddings. | | ||
| Embeddings Providers | Description | | ||
| ----------------------------------------------------- | ---------------------------------------------------------------- | | ||
| [Aleph Alpha](/documentation/embeddings/aleph-alpha/) | Multilingual embeddings focused on European languages. | | ||
| [Bedrock](/documentation/embeddings/bedrock/) | AWS managed service for foundation models and embeddings. | | ||
| [Cohere](/documentation/embeddings/cohere/) | Language model embeddings for NLP tasks. | | ||
| [Gemini](/documentation/embeddings/gemini/) | Google’s multimodal embeddings for text and vision. | | ||
| [Jina AI](/documentation/embeddings/jina-embeddings/) | Customizable embeddings for neural search. | | ||
| [Mistral](/documentation/embeddings/mistral/) | Open-source, efficient language model embeddings. | | ||
| [MixedBread](/documentation/embeddings/mixedbread/) | Lightweight embeddings for constrained environments. | | ||
| [Mixpeek](/documentation/embeddings/mixpeek/) | Managed SDK for video chunking, embedding, and post-processing. | | ||
| [Nomic](/documentation/embeddings/nomic/) | Embeddings for data visualization. | | ||
| [Nvidia](/documentation/embeddings/nvidia/) | GPU-optimized embeddings from Nvidia. | | ||
| [Ollama](/documentation/embeddings/ollama/) | Embeddings for conversational AI. | | ||
| [OpenAI](/documentation/embeddings/openai/) | Industry-leading embeddings for NLP. | | ||
| [Prem AI](/documentation/embeddings/premai/) | Precise language embeddings. | | ||
| [Twelve Labs](/documentation/embeddings/twelvelabs/) | Multimodal embeddings from Twelve Labs. | ||
| [Snowflake](/documentation/embeddings/snowflake/) | Scalable embeddings for big data. | | ||
| [Upstage](/documentation/embeddings/upstage/) | Embeddings for speech and language tasks. | | ||
| [Voyage AI](/documentation/embeddings/voyage/) | Navigation and spatial understanding embeddings. | |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: AWS Bedrock | ||
weight: 1000 | ||
--- | ||
|
||
# Bedrock Embeddings | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: Cohere | ||
weight: 1400 | ||
aliases: [ ../integrations/cohere/ ] | ||
--- | ||
|
||
|
1 change: 0 additions & 1 deletion
qdrant-landing/content/documentation/embeddings/jina-embeddings.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: MixedBread | ||
weight: 2200 | ||
--- | ||
|
||
# Using MixedBread with Qdrant | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: Mixpeek | ||
weight: 2250 | ||
--- | ||
|
||
# Mixpeek Video Embeddings | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: "Nomic" | ||
weight: 2300 | ||
--- | ||
|
||
# Nomic | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: Nvidia | ||
weight: 2400 | ||
--- | ||
|
||
# Nvidia | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: Ollama | ||
weight: 2600 | ||
--- | ||
|
||
# Using Ollama with Qdrant | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: OpenAI | ||
weight: 2700 | ||
aliases: [ ../integrations/openai/ ] | ||
--- | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: Prem AI | ||
weight: 2800 | ||
--- | ||
|
||
# Prem AI | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: Snowflake Models | ||
weight: 2900 | ||
--- | ||
|
||
# Snowflake | ||
|
173 changes: 173 additions & 0 deletions
qdrant-landing/content/documentation/embeddings/twelvelabs.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,173 @@ | ||
--- | ||
title: Twelve Labs | ||
--- | ||
|
||
# Twelve Labs | ||
|
||
The [Twelve Labs](https://twelvelabs.io) Embed API provides embeddings that represent video, text, images, and audio in a unified vector space. This shared space enables any-to-any search across content types. | ||
|
||
By natively processing all modalities, it captures interactions like visual expressions, speech, and context, enabling advanced applications such as sentiment analysis, anomaly detection, and recommendation systems with precision and efficiency. | ||
|
||
We'll look at how to work with Twelve Labs embeddings in Qdrant via the Python and Node SDKs. | ||
|
||
### Installing the SDKs | ||
|
||
```bash | ||
$ pip install twelvelabs qdrant-client | ||
``` | ||
|
||
```bash | ||
$ npm install twelvelabs-js @qdrant/js-client-rest | ||
``` | ||
|
||
### Setting up the clients | ||
|
||
```python | ||
from twelvelabs import TwelveLabs | ||
from qdrant_client import QdrantClient | ||
|
||
# Get your API keys from: | ||
# https://playground.twelvelabs.io/dashboard/api-key | ||
TL_API_KEY = "<YOUR_TWELVE_LABS_API_KEY>" | ||
|
||
twelvelabs_client = TwelveLabs(api_key=TL_API_KEY) | ||
qdrant_client = QdrantClient(url="http://localhost:6333/") | ||
``` | ||
|
||
```typescript | ||
import { QdrantClient } from '@qdrant/js-client-rest'; | ||
import { TwelveLabs, EmbeddingsTask, SegmentEmbedding } from 'twelvelabs-js'; | ||
|
||
// Get your API keys from: | ||
// https://playground.twelvelabs.io/dashboard/api-key | ||
const TL_API_KEY = "<YOUR_TWELVE_LABS_API_KEY>" | ||
|
||
const twelveLabsClient = new TwelveLabs({ apiKey: TL_API_KEY }); | ||
const qdrantClient = new QdrantClient({ url: 'http://localhost:6333' }); | ||
``` | ||
|
||
The following example uses the `"Marengo-retrieval-2.6"` engine to embed a video. It generates 1024-dimensional vector embeddings that are compared using cosine similarity. | ||
|
||
You can use the same engine to embed audio, text, and images into a common vector space, which enables cross-modality searches. | ||
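
To make the cosine-similarity comparison concrete, here is a minimal sketch in plain Python. It assumes two embedding vectors have already been computed; the `cosine_similarity` helper and the toy 4-dimensional vectors are illustrative only and are not part of either SDK (vectors from this engine have 1024 dimensions).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a video-segment embedding and two query embeddings.
video_vec = [0.1, 0.3, 0.5, 0.1]
text_vec = [0.2, 0.6, 1.0, 0.2]    # same direction as video_vec, twice the length
other_vec = [0.5, -0.1, 0.0, 0.3]  # points in a different direction

print(cosine_similarity(video_vec, text_vec))
print(cosine_similarity(video_vec, other_vec))
```

Because cosine similarity depends only on direction, scaling a vector does not change its score, which is why `text_vec` matches `video_vec` more closely than `other_vec` does.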
|
||
### Embedding videos | ||
|
||
```python | ||
task = twelvelabs_client.embed.task.create( | ||
    engine_name="Marengo-retrieval-2.6", | ||
    video_url="https://sample-videos.com/video321/mp4/720/big_buck_bunny_720p_2mb.mp4" | ||
) | ||
|
||
task.wait_for_done(sleep_interval=3) | ||
|
||
task_result = twelvelabs_client.embed.task.retrieve(task.id) | ||
``` | ||
|
||
```typescript | ||
const task = await twelveLabsClient.embed.task.create("Marengo-retrieval-2.6", { | ||
  url: "https://sample-videos.com/video321/mp4/720/big_buck_bunny_720p_2mb.mp4" | ||
}) | ||
|
||
await task.waitForDone(3) | ||
|
||
const taskResult = await twelveLabsClient.embed.task.retrieve(task.id) | ||
``` | ||
|
||
### Converting the model outputs to Qdrant points | ||
|
||
```python | ||
from qdrant_client.models import PointStruct | ||
|
||
points = [ | ||
    PointStruct( | ||
        id=idx, | ||
        vector=v.embeddings_float, | ||
        payload={ | ||
            "start_offset_sec": v.start_offset_sec, | ||
            "end_offset_sec": v.end_offset_sec, | ||
            "embedding_scope": v.embedding_scope, | ||
        }, | ||
    ) | ||
    for idx, v in enumerate(task_result.video_embedding.segments) | ||
] | ||
``` | ||
|
||
```typescript | ||
let points = taskResult.videoEmbedding.segments.map((data, i) => { | ||
  return { | ||
    id: i, | ||
    vector: data.embeddingsFloat, | ||
    payload: { | ||
      startOffsetSec: data.startOffsetSec, | ||
      endOffsetSec: data.endOffsetSec, | ||
      embeddingScope: data.embeddingScope | ||
    } | ||
  } | ||
}) | ||
``` | ||
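
Before inserting the points, it can be worth checking that every vector has the dimensionality the collection will expect (1024 for this engine). The following is a minimal sketch that assumes points are represented as plain dictionaries with `id` and `vector` keys; `check_dimensions` is a hypothetical helper, not part of qdrant-client or the Twelve Labs SDK.

```python
EXPECTED_DIM = 1024  # dimensionality stated for the Marengo-retrieval-2.6 engine


def check_dimensions(points, expected_dim=EXPECTED_DIM):
    """Return the ids of points whose vector length differs from expected_dim."""
    return [p["id"] for p in points if len(p["vector"]) != expected_dim]


# Stand-in data: one well-formed point and one with a deliberately wrong size.
sample_points = [
    {"id": 0, "vector": [0.0] * 1024},
    {"id": 1, "vector": [0.0] * 512},
]

bad_ids = check_dimensions(sample_points)
print(bad_ids)
```

Catching a mismatch at this stage gives a clearer error than a rejected upsert later on.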
|
||
### Creating a collection to insert the vectors | ||
|
||
```python | ||
from qdrant_client.models import VectorParams, Distance | ||
|
||
collection_name = "twelve_labs_collection" | ||
|
||
qdrant_client.create_collection( | ||
    collection_name, | ||
    vectors_config=VectorParams( | ||
        size=1024, | ||
        distance=Distance.COSINE, | ||
    ), | ||
) | ||
qdrant_client.upsert(collection_name, points) | ||
``` | ||
|
||
```typescript | ||
const COLLECTION_NAME = "twelve_labs_collection" | ||
|
||
await qdrantClient.createCollection(COLLECTION_NAME, { | ||
  vectors: { | ||
    size: 1024, | ||
    distance: 'Cosine', | ||
  } | ||
}); | ||
|
||
await qdrantClient.upsert(COLLECTION_NAME, { | ||
  wait: true, | ||
  points | ||
}) | ||
``` | ||
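
A long video can yield many segment embeddings. One common approach, sketched here under stated assumptions, is to split the point list into fixed-size batches and upsert each batch separately. The `batched` helper and the batch size of 64 are illustrative choices, and the commented-out upsert call assumes the `qdrant_client` and `collection_name` objects from the examples above.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield consecutive lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a list of 150 points; real code would pass the points list.
fake_points = list(range(150))

batch_sizes = []
for batch in batched(fake_points, 64):
    # In real code (assumption): qdrant_client.upsert(collection_name, batch)
    batch_sizes.append(len(batch))

print(batch_sizes)
```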
|
||
## Perform a search | ||
|
||
Once the vectors are added, you can run semantic searches across different modalities. Let's try text. | ||
|
||
```python | ||
segment = twelvelabs_client.embed.create( | ||
    engine_name="Marengo-retrieval-2.6", | ||
    text="<YOUR_QUERY_TEXT>", | ||
).text_embedding.segments[0] | ||
|
||
|
||
qdrant_client.query_points( | ||
    collection_name=collection_name, | ||
    query=segment.embeddings_float, | ||
) | ||
``` | ||
|
||
```typescript | ||
const segment = (await twelveLabsClient.embed.create({ | ||
  engineName: "Marengo-retrieval-2.6", | ||
  text: "<YOUR_QUERY_TEXT>" | ||
})).textEmbedding.segments[0] | ||
|
||
await qdrantClient.query(COLLECTION_NAME, { | ||
  query: segment.embeddingsFloat, | ||
}); | ||
``` | ||
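
Each stored point carries the segment offsets in its payload, so search hits can be turned into playable time ranges. This is a minimal sketch that assumes the hits have already been reduced to dictionaries with `score` and `payload` keys (the actual response object returned by the client is richer); `format_timestamp` and `to_time_ranges` are hypothetical helpers.

```python
def format_timestamp(seconds):
    """Format a number of seconds as M:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


def to_time_ranges(hits):
    """Return (start, end, score) tuples, best-scoring hit first."""
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
    return [
        (
            format_timestamp(h["payload"]["start_offset_sec"]),
            format_timestamp(h["payload"]["end_offset_sec"]),
            h["score"],
        )
        for h in ranked
    ]


# Stand-in hits using the payload keys from the Python example above.
sample_hits = [
    {"score": 0.71, "payload": {"start_offset_sec": 6.0, "end_offset_sec": 12.0}},
    {"score": 0.83, "payload": {"start_offset_sec": 66.0, "end_offset_sec": 72.0}},
]

ranges = to_time_ranges(sample_hits)
for start, end, score in ranges:
    print(f"{start}-{end} (score {score})")
```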
|
||
## Further Reading | ||
|
||
- [Twelve Labs Documentation](https://docs.twelvelabs.io/) | ||
- [Twelve Labs Examples](https://docs.twelvelabs.io/docs/sample-applications) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: Upstage | ||
weight: 3100 | ||
--- | ||
|
||
# Upstage | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
--- | ||
title: Voyage AI | ||
weight: 3200 | ||
--- | ||
|
||
# Voyage AI | ||
|