diff --git a/_posts/2024-11-12-copilot-arena.md b/_posts/2024-11-12-copilot-arena.md
index 574d34e..b8ea075 100644
--- a/_posts/2024-11-12-copilot-arena.md
+++ b/_posts/2024-11-12-copilot-arena.md
@@ -144,8 +144,8 @@ Most current Copilot Arena users code in Python, followed by javascript/typescri
 **What kind of context lengths are we looking at?**  
 The mean context length is 1002 tokens and the median is 560 tokens. This is much longer than tasks considered in existing static benchmarks. For example, human eval has a median length of ~100 tokens.
 
-<img src="/assets/img/blog/copilot_arena/filetype_dist.png" alt="Copilot Arena filetype distribution" style="display:block; margin-top: auto; margin-left: auto; margin-right: auto; margin-bottom: auto; width: 90%">
-<p style="color:gray; text-align: center;">Figure 3. Filetypes requested in Copilot Arena. Filetypes are determined based on file extension.</p>
+<img src="/assets/img/blog/copilot_arena/context_length_dist.png" alt="Copilot Arena context length distribution" style="display:block; margin-top: auto; margin-left: auto; margin-right: auto; margin-bottom: auto; width: 90%">
+<p style="color:gray; text-align: center;">Figure 3. Distribution of context lengths (in tokens) across Copilot Arena requests.</p>
 
 **Are people biased towards the top completion?** Yes. In fact, 82% of accepted completions were the top completion. We are still analyzing our data, but here are some of our insights.
 
diff --git a/assets/img/blog/copilot_arena/leaderboard_pfp.png b/assets/img/blog/copilot_arena/leaderboard_pfp.png
index ed10bdb..973c252 100644
Binary files a/assets/img/blog/copilot_arena/leaderboard_pfp.png and b/assets/img/blog/copilot_arena/leaderboard_pfp.png differ