    Repositories list

    • nndeploy

      Public
      nndeploy is an end-to-end model inference and deployment framework. It aims to provide a powerful, easy-to-use, high-performance model inference and deployment experience that is compatible with mainstream frameworks.
      C++
      Apache License 2.0
      10470170 Updated Mar 1, 2025
    • .github

      Public
      0000 Updated Feb 5, 2025
    • Header-only safetensors loader and saver in C++
      C++
      MIT License
      11000 Updated Nov 19, 2024
    • onnx-llm

      Public
      LLM deployment project based on ONNX.
      C++
      Apache License 2.0
      7000 Updated Oct 9, 2024
    • Universal cross-platform tokenizers binding to HF and sentencepiece
      C++
      Apache License 2.0
      74100 Updated Jun 3, 2024
    • 💻 A small collection for Awesome LLM Inference [Papers|Blogs|Docs] with code, covering TensorRT-LLM, streaming-llm, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
      GNU General Public License v3.0
      243200 Updated Dec 3, 2023
    • Simplify your ONNX model
      Python
      Apache License 2.0
      389100 Updated Apr 27, 2022