From a58bc98f51b85c713231da6e9ab0292edf2c4fb9 Mon Sep 17 00:00:00 2001
From: Paul Gauthier
Date: Mon, 6 May 2024 12:42:14 -0700
Subject: [PATCH] copy

---
 docs/leaderboards/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/leaderboards/index.md b/docs/leaderboards/index.md
index c9a01359d..68a6a4ca7 100644
--- a/docs/leaderboards/index.md
+++ b/docs/leaderboards/index.md
@@ -175,7 +175,7 @@ Therefore, results are available for fewer models.
 
 The key benchmarking results are:
 
 - **Percent completed correctly** - Measures what percentage of the coding tasks that the LLM completed successfully. To complete a task, the LLM must solve the programming assignment *and* edit the code to implement that solution.
-- **Percent using correct edit format** - Measures the percent of coding tasks where the LLM complied with the edit format specified in the system prompt. If the LLM makes edit mistakes, aider will give it feedback and ask for a fixed copy of the edit. But the best models can reliably conform to the edit format, without making errors.
+- **Percent using correct edit format** - Measures the percent of coding tasks where the LLM complied with the edit format specified in the system prompt. If the LLM makes edit mistakes, aider will give it feedback and ask for a fixed copy of the edit. The best models can reliably conform to the edit format, without making errors.
 
 ## Notes on the edit format