I am a first-year PhD student in the CS Dept. at Tsinghua University, focusing on efficient training and inference of large models.
- 🏠 My Homepage.
I am a first-year PhD student in the CS Dept. at Tsinghua University, focusing on efficient training and inference of large models.
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without lossing end-to-end metrics across various models.
SpargeAttention: A training-free sparse attention that can accelerate any model inference.
Day of Week | April Apr | May May | June Jun | July Jul | August Aug | September Sep | October Oct | November Nov | December Dec | January Jan | February Feb | March Mar | |||||||||||||||||||||||||||||||||||||||||
Sunday Sun | |||||||||||||||||||||||||||||||||||||||||||||||||||||
Monday Mon | |||||||||||||||||||||||||||||||||||||||||||||||||||||
Tuesday Tue | |||||||||||||||||||||||||||||||||||||||||||||||||||||
Wednesday Wed | |||||||||||||||||||||||||||||||||||||||||||||||||||||
Thursday Thu | |||||||||||||||||||||||||||||||||||||||||||||||||||||
Friday Fri | |||||||||||||||||||||||||||||||||||||||||||||||||||||
Saturday Sat |