Commit

Updated site
jonah-ramponi committed Mar 30, 2024
1 parent 67caa9c commit 4a65ce9
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions posts/sparse_attention/index.html
@@ -92,10 +92,11 @@ <h1 class="title">Sparse Attention</h1>
 \text{attention}(Q,K,V, S_i) &amp;= \text{softmax}\Big( \frac{(Q_{S_i}) K^T_{S_i}}{\sqrt{d_k}} \Big) V_{S_i}.
 \end{align}</p>
 <p>Here, we have defined</p>
+<p>$$ Q_{S_i} = (W_q \vec{x}<em>j )</em>{j \in S_i} $$</p>
 <p>\begin{align*}
-Q_{S_i} &amp;=(W_q \vec{x}<em>j )</em>{j \in S_i}, \\
-K_{S_i} &amp;=(W_k \vec{x}<em>j )</em>{j \in S_i}, \\
-V_{S_i} &amp;= (W_v\vec{x}<em>j )</em>{j \in S_i}.
+Q_{S_i} &amp;= (W_q \vec{x}<em>j )</em>{j \in S_i}, \\
+K_{S_i} &amp;= (W_k \vec{x}<em>j )</em>{j \in S_i}, \\
+V_{S_i} &amp;= (W_v\ vec{x}<em>j )</em>{j \in S_i}.
 \end{align*}</p>
 <p>So how do we define the set of connectivity patterns $S$? Formally, we let $S_i = A_i^{h}$ for head $h$ where $A_i^{h} \subset {j : j \leq i}$. It is still no clearer how we pick which indices we should take for a given $S_i$. The original authors consider two key criteria initially:</p>
 <h4 id="criteria-1">Criteria 1</h4>
@@ -115,7 +116,7 @@ <h4 id="criteria-2">Criteria 2</h4>
 A^{(2)}_i &amp;= \Big{ j : j \mod l \in { t, t + 1, \ldots, l } \Big}, \text{ where } t = l - c \text{ and } c \text{ is a hyperparameter.}
 \end{align*}</p>
 <p>These are best understood visually in my opinion. In the image below, $A_i^{(1)}$ is responsible for the dark blue shading and $A_i^{(2)}$ for the light blue shading. If we take stride, $l$ = 128 and $c=8$, then all positions greater than 128 can attend to positions $120-128$. The authors find choosing $c \in {8,16,32}$ worked well.</p>
-<p><img src="/img/sparse_attention.png" alt="my alt text">{: style=&ldquo;width: 85%; border: none;&rdquo;}</p>
+<p><img src="/img/sparse_attention.png" alt="Sparse Attention Matrix"></p>
 
 </section>
