diff --git a/_posts/2024-11-12-copilot-arena.md b/_posts/2024-11-12-copilot-arena.md
index 574d34e..b8ea075 100644
--- a/_posts/2024-11-12-copilot-arena.md
+++ b/_posts/2024-11-12-copilot-arena.md
@@ -144,8 +144,8 @@ Most current Copilot Arena users code in Python, followed by javascript/typescri

 **What kind of context lengths are we looking at?** The mean context length is 1002 tokens and the median is 560 tokens. This is much longer than tasks considered in existing static benchmarks. For example, human eval has a median length of ~100 tokens.

-Copilot Arena filetype distribution
-Figure 3. Filetypes requested in Copilot Arena. Filetypes are determined based on file extension.

+Copilot Arena filetype distribution
+Figure 3. Context length of files requested in Copilot Arena.

 **Are people biased towards the top completion?** Yes. In fact, 82% of accepted completions were the top completion. We are still analyzing our data, but here are some of our insights.
diff --git a/assets/img/blog/copilot_arena/leaderboard_pfp.png b/assets/img/blog/copilot_arena/leaderboard_pfp.png
index ed10bdb..973c252 100644
Binary files a/assets/img/blog/copilot_arena/leaderboard_pfp.png and b/assets/img/blog/copilot_arena/leaderboard_pfp.png differ