Stars
Entropy Based Sampling and Parallel CoT Decoding
A high-throughput and memory-efficient inference and serving engine for LLMs
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
Inspect: A framework for large language model evaluations
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
DSPy: The framework for programming—not prompting—language models
Superfast AI decision making and intelligent processing of multi-modal data.
Convert PDF to markdown + JSON quickly with high accuracy
Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
Efficient Attention for Long Sequence Processing
Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph
Vision utilities for web interaction agents 👀
2D Positional Embeddings for Webpage Structural Understanding 🦙👀
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
Search for words, documents, images, videos, news and maps using the Brave search engine, and download files and images to a local hard drive.
HAAS = Hierarchical Autonomous Agent Swarm - "Resistance is futile!"
Blazing fast fuzzy text search for Python.
Generative Agents: Interactive Simulacra of Human Behavior
LLM-based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.