BentoML

The easiest way to run AI Inference in the cloud

Welcome to BentoML 👋

What is BentoML? 👩‍🍳

BentoML is an open-source model serving library for building model inference APIs and multi-model serving systems with any open-source or custom AI model. It comes with everything you need for serving optimization and model packaging, and it simplifies production deployment via ☁️ BentoCloud.

Get in touch 💬

👉 Join our Slack community!

👀 Follow us on X @bentomlai and LinkedIn

📖 Read our blog

Pinned

  1. BentoML Public

    The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!

    Python 6.8k 767

  2. OpenLLM Public

    Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.

    Python 9.2k 588

Repositories

Showing 10 of 80 repositories
  • BentoTRTLLM
    Python 2 1 0 0 Updated Jul 2, 2024
  • BentoML Public

    The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!

    Python 6,772 Apache-2.0 767 216 12 Updated Jul 2, 2024
  • llm-bench
    Python 14 1 2 1 Updated Jul 2, 2024
  • OpenLLM Public

    Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.

    Python 9,233 Apache-2.0 588 60 3 Updated Jul 1, 2024
  • asynq Public Forked from hibiken/asynq

    Simple, reliable, and efficient distributed task queue in Go

    Go 0 MIT 698 0 1 Updated Jul 1, 2024
  • BentoVLLM Public

    Self-host LLMs with vLLM and BentoML

    Python 37 9 3 0 Updated Jun 25, 2024
  • BentoLMDeploy Public

    Self-host LLMs with LMDeploy and BentoML

    Python 7 1 1 0 Updated Jun 25, 2024
  • bentocloud-homepage-news
    1 1 0 0 Updated Jun 21, 2024
  • openllm-repo
    HTML 0 0 0 0 Updated Jun 21, 2024
  • chatgpt-lite Public Forked from blrchen/chatgpt-lite

    Fast ChatGPT UI with support for both OpenAI and Azure OpenAI.

    TypeScript 0 MIT 78 0 1 Updated Jun 20, 2024