From ffcced814478f4b09af0750e02c4434065df512c Mon Sep 17 00:00:00 2001
From: Paul Gauthier
Date: Thu, 25 Jul 2024 09:48:16 +0200
Subject: [PATCH] copy

---
 aider/website/docs/leaderboards/index.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/aider/website/docs/leaderboards/index.md b/aider/website/docs/leaderboards/index.md
index 43b324708..935bbc68d 100644
--- a/aider/website/docs/leaderboards/index.md
+++ b/aider/website/docs/leaderboards/index.md
@@ -5,6 +5,7 @@ description: Quantitative benchmarks of LLM code editing skill.
 ---
 
 # Aider LLM Leaderboards
+{: .no_toc }
 
 Aider works best with LLMs which are good at *editing* code, not just
 good at writing code.
@@ -16,10 +17,15 @@ The leaderboards below report the results from a number of popular LLMs.
 While [aider can connect to almost any LLM](/docs/llms.html),
 it works best with models that score well on the benchmarks.
 
+See the following sections for benchmark
+results and additional information:
+- TOC
+{:toc}
 
 ## Code editing leaderboard
 
-[Aider's code editing benchmark](/docs/benchmarks.html#the-benchmark) asks the LLM to edit python source files to complete 133 small coding exercises. This benchmark measures the LLM's coding ability, but also whether it can consistently emit code edits in the format specified in the system prompt.
+[Aider's code editing benchmark](/docs/benchmarks.html#the-benchmark) asks the LLM to edit python source files to complete 133 small coding exercises
+from Exercism. This benchmark measures the LLM's coding ability, but also whether it can consistently emit code edits in the format specified in the system prompt.