Yuankun Shi (yuankuns)

FlashMLA-fork Public

Forked from deepseek-ai/FlashMLA

C++

vllm Public

Forked from Wanzizhu/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python

flash-attention Public

Forked from Dao-AILab/flash-attention

Fast and memory-efficient exact attention

Python

test_xetla_paged_attention Public

Forked from baodii/flash_attention_factory

C++

test_xetla_group_gemm Public

C++

cutlass-sycl Public

Forked from intel/sycl-tla

A CUTLASS implementation using SYCL

C++