This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

Commit

Merge branch 'hengguo/h2o' of https://github.com/intel/intel-extension-for-transformers into hengguo/h2o
n1ck-guo committed Jul 15, 2024
2 parents 3723158 + 0c547c5 commit 7da0cf5
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion examples/huggingface/pytorch/text-generation/h2o/README.md
@@ -1,6 +1,6 @@
# H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

-**Heavy-Hitter Oracal (H2O)** is a novel approach for implementing the KV cache wihich significantly reduces memory footprint.
+**Heavy-Hitter Oracal (H2O)** is a novel approach for implementing the KV cache which significantly reduces memory footprint.

This methods base on the fact that the accumulated attention scores of all tokens in attention blocks adhere to a power-law distribution. It suggests that there exists a small set of influential tokens that are critical during generation, named heavy-hitters (H2). H2 provides an opportunity to step away from the combinatorial search problem and identify an eviction policy that maintains accuracy.
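
To make the eviction idea in the paragraph above concrete, here is a minimal pure-Python sketch, not the implementation in this repository. It assumes each cached token already has an accumulated attention score, and it keeps the highest-scoring older tokens (the heavy-hitters) plus a window of the most recent tokens. The function name `select_kept_positions`, the `heavy_budget` and `recent_budget` parameters, and the recent-token window itself are assumptions introduced for illustration; the README excerpt only states that heavy-hitters are identified from accumulated attention scores.

```python
def select_kept_positions(acc_scores, heavy_budget, recent_budget):
    """Return the sorted cache positions to keep; all others would be evicted.

    acc_scores: accumulated attention score per cached token (hypothetical input).
    heavy_budget: number of heavy-hitter tokens to keep (hypothetical parameter).
    recent_budget: number of most recent tokens to keep (assumed local window).
    """
    n = len(acc_scores)
    # Always keep the most recent tokens.
    recent_start = max(0, n - recent_budget)
    recent = set(range(recent_start, n))
    # Among the older tokens, keep those with the largest accumulated scores.
    older = range(recent_start)
    heavy = sorted(older, key=lambda i: acc_scores[i], reverse=True)[:heavy_budget]
    return sorted(recent | set(heavy))


# Toy scores with a skewed distribution: positions 0 and 3 dominate.
scores = [5.0, 0.1, 0.2, 3.0, 0.1, 0.3, 0.2, 0.4]
kept = select_kept_positions(scores, heavy_budget=2, recent_budget=3)
print(kept)  # [0, 3, 5, 6, 7]
```

With a cache budget of five entries, the two dominant older tokens survive alongside the three newest ones, and the three low-scoring tokens at positions 1, 2, and 4 are dropped.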

