zhentaoyu

🎯 Focusing

5 followers · 17 following

intel
Shanghai
densecollections.top

Pinned

intel/neural-speed (Public archive)

An innovative library for efficient LLM inference via low-bit quantization

C++ · 351 stars · 38 forks

intel/intel-extension-for-transformers (Public archive)

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python · 2.2k stars · 211 forks

ggml-org/llama.cpp (Public)

LLM inference in C/C++

C++ · 75.8k stars · 11k forks

leejet/stable-diffusion.cpp (Public)

Stable Diffusion and Flux in pure C/C++

C++ · 3.9k stars · 349 forks

vllm-fork (Public)

Forked from HabanaAI/vllm-fork

A high-throughput and memory-efficient inference and serving engine for LLMs

Python

intel/neural-compressor (Public)

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python · 2.3k stars · 263 forks