email: [email protected]
- AWS Amplify Customizable Auth Components
- Event Search and Recommendation App
- Sorting Algorithms Visualizer
Ray is an AI compute engine: a core distributed runtime plus a set of AI libraries for accelerating ML workloads.
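A minimal sketch of Ray's core task API, assuming a local `ray.init()` runtime; the `square` function and the range of inputs are just illustrative:

```python
import ray

ray.init()  # start a local Ray runtime (connects to a cluster if one is configured)

@ray.remote
def square(x):
    # Runs as a Ray task, scheduled across the available workers.
    return x * x

# Launch tasks in parallel and gather the results.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```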
vLLM: a high-throughput and memory-efficient inference and serving engine for LLMs.
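A minimal offline-inference sketch with vLLM's `LLM` API; the model name, prompts, and sampling settings below are illustrative placeholders, not part of the original listing:

```python
from vllm import LLM, SamplingParams

# Illustrative checkpoint; any Hugging Face causal LM supported by vLLM would do.
llm = LLM(model="facebook/opt-125m")
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = ["The capital of France is", "Large language models are"]
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```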
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
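A hedged sketch of querying a running Triton server over HTTP with the `tritonclient` package; the server address, model name (`my_model`), and tensor names (`INPUT0`/`OUTPUT0`) are placeholders that must match the deployed model's `config.pbtxt`:

```python
import numpy as np
import tritonclient.http as httpclient

# Assumes a Triton Inference Server is already running on localhost:8000.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Placeholder input shape and tensor names for illustration only.
data = np.random.rand(1, 16).astype(np.float32)
infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)

result = client.infer(
    model_name="my_model",
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("OUTPUT0")],
)
print(result.as_numpy("OUTPUT0"))
```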
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
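A minimal sketch of the high-level `LLM` API available in newer TensorRT-LLM releases (older releases use an explicit build-then-run workflow instead); the checkpoint name is an illustrative placeholder:

```python
from tensorrt_llm import LLM, SamplingParams

# Illustrative checkpoint; the TensorRT engine is built on first use.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

outputs = llm.generate(["Hello, my name is"], sampling_params)
for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```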
Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs.
The official repository of Qwen2-Audio, the chat and pretrained large audio-language model proposed by Alibaba Cloud.
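A minimal loading sketch via the Hugging Face Transformers integration; the `Qwen/Qwen2-Audio-7B-Instruct` checkpoint name is assumed from the public model hub, and the full audio-chat flow is left to the repo's usage examples:

```python
from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration

model_id = "Qwen/Qwen2-Audio-7B-Instruct"  # assumed public checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2AudioForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# Audio-conditioned chat then goes through the processor's chat template and
# model.generate(...), as shown in the repo's usage examples.
```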