Stars
A Datacenter Scale Distributed Inference Serving Framework
Recipes to train the self-rewarding reasoning LLMs.
Introduction to Modular Forms: A Chinese textbook about modular forms
Code for the paper "UVDoc: Neural Grid-based Document Unwarping" - Dataset capture and creation
[ICLR 2025] On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning
verl: Volcano Engine Reinforcement Learning for LLMs
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Server models.
SGLang is a fast serving framework for large language models and vision language models.
FastVideo is a lightweight framework for accelerating large video diffusion models.
nndeploy is an end-to-end model inference and deployment framework. It aims to provide users with a powerful, easy-to-use, high-performance, and mainstream framework compatible model inference and …
Deep Learning Deployment Framework: Supports tf/torch/trt/trtllm/vllm and other NN frameworks. Supports dynamic batching and streaming modes. It is dual-language compatible with Python and C++, off…
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Muon optimizer: +>30% sample efficiency with <3% wallclock overhead
Higher performance OpenAI LLM service than vLLM serve: A pure C++ high-performance OpenAI LLM service implemented with GPRS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function call, AI agents…
HunyuanVideo: A Systematic Framework For Large Video Generation Model
ScholArxiv is an open-source, aesthetic, minimal, and AI-powered app that allows users to search, read, bookmark, share, download and view summaries of academic papers from the arXiv repository.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
PyTorch implementation of MIMO, Controllable Character Video Synthesis with Spatial Decomposed Modeling, from Alibaba Intelligence Group
A project to map out the relations between different equational theories of Magmas.