🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Updated Jan 29, 2025 - Python
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
MTEB: Massive Text Embedding Benchmark
Study guides for MIT's 15.003 Data Science Tools
Fast, accurate, lightweight Python library for generating state-of-the-art embeddings
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
A realtime serving engine for Data-Intensive Generative AI Applications
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.
SGPT: GPT Sentence Embeddings for Semantic Search
Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
Epsilla is a high performance Vector Database Management System
Grounded search engine (i.e. with source reference) based on LLM / ChatGPT / OpenAI API. It supports web search, file content search etc.
My personal notes on local and global descriptors
Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in Pytorch