From 6bc18b9591a6160d17560a22e24a77138a8997c0 Mon Sep 17 00:00:00 2001
From: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Date: Mon, 6 May 2024 17:15:26 -0700
Subject: [PATCH 1/3] Update index.md

---
 docs/leaderboards/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/leaderboards/index.md b/docs/leaderboards/index.md
index d251f9285..3fb6627c3 100644
--- a/docs/leaderboards/index.md
+++ b/docs/leaderboards/index.md
@@ -8,7 +8,7 @@ highlight_image: /assets/leaderboard.jpg
 Aider works best with LLMs which are good at *editing* code, not just good at writing
 code.
 To evaluate an LLM's editing skill, aider uses a pair of benchmarks that
-assess their ability to consistently follow the system instructions
+assess a model's ability to consistently follow the system instructions
 to successfully edit code.
 The leaderboards below report the results from a number of popular LLMs.
 

From 18761b70be66063fbff23f3099258c02d5a09fc2 Mon Sep 17 00:00:00 2001
From: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Date: Mon, 6 May 2024 17:16:29 -0700
Subject: [PATCH 2/3] Update index.md

---
 docs/leaderboards/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/leaderboards/index.md b/docs/leaderboards/index.md
index 3fb6627c3..998dc78e8 100644
--- a/docs/leaderboards/index.md
+++ b/docs/leaderboards/index.md
@@ -8,7 +8,7 @@ highlight_image: /assets/leaderboard.jpg
 Aider works best with LLMs which are good at *editing* code, not just good at writing
 code.
 To evaluate an LLM's editing skill, aider uses a pair of benchmarks that
-assess a model's ability to consistently follow the system instructions
+assess a model's ability to consistently follow the system prompt
 to successfully edit code.
 The leaderboards below report the results from a number of popular LLMs.
 

From 6fb4983d88e945832094bb81aa6ae7f53c80c024 Mon Sep 17 00:00:00 2001
From: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Date: Mon, 6 May 2024 20:42:23 -0700
Subject: [PATCH 3/3] Update edit_leaderboard.yml

---
 _data/edit_leaderboard.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_data/edit_leaderboard.yml b/_data/edit_leaderboard.yml
index 51ec96298..9c34f9c36 100644
--- a/_data/edit_leaderboard.yml
+++ b/_data/edit_leaderboard.yml
@@ -42,7 +42,7 @@
   total_cost: 0.0000
 - dirname: 2024-04-29-19-17-28--deepseek-coder-whole
   test_cases: 132
-  model: openai/deepseek-coder
+  model: deepseek-coder
   edit_format: whole
   commit_hash: c07f793-dirty
   pass_rate_1: 47.0