Update demo at README.md
hodlen committed Dec 16, 2023
1 parent 7b699b0 commit 6949bb8
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -1,11 +1,13 @@
# PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
---

-*Demo* 🔥
+## Demo 🔥

https://github.com/SJTU-IPADS/PowerInfer/assets/34213478/d26ae05b-d0cf-40b6-8788-bda3fe447e28

-<sub>PowerInfer v.s. llama.cpp on a single RTX 4090(24G) running Falcon(ReLU)-40B-FP16 with a 11x speedup!</sub>
+PowerInfer v.s. llama.cpp on a single RTX 4090(24G) running Falcon(ReLU)-40B-FP16 with a 11x speedup!
+
+<sub>Both PowerInfer and llama.cpp were running on the same hardware and fully utilized VRAM on RTX 4090.</sub>

---
## Abstract
