mirror of
https://github.com/Aider-AI/aider.git
synced 2025-06-04 03:35:00 +00:00
Docs: Make leaderboards intro text more concise
This commit is contained in:
parent
49e4af4fab
commit
19c7c7a9dc
1 changed file with 5 additions and 22 deletions
@@ -8,32 +8,15 @@ has_children: true
 
 # Aider LLM Leaderboards
 
-Aider works best with LLMs which are good at *editing* code, not just good at writing
-code.
-To evaluate an LLM's editing skill, aider uses benchmarks that
-assess a model's ability to consistently follow the system prompt
-to successfully edit code.
-
-The leaderboards report the results from a number of popular LLMs.
-While [aider can connect to almost any LLM](/docs/llms.html),
-it works best with models that score well on the benchmarks.
+Aider excels with LLMs skilled at *editing* code, not just writing it.
+These benchmarks evaluate an LLM's ability to follow instructions and edit code successfully.
+The leaderboards show results for popular LLMs. Aider works best with high-scoring models, though it [can connect to almost any LLM](/docs/llms.html).
 
 
 ## Polyglot leaderboard
 
-[Aider's polyglot benchmark](https://aider.chat/2024/12/21/polyglot.html#the-polyglot-benchmark)
-asks the LLM to edit source files to complete 225 coding exercises
-from Exercism.
-It contains exercises in many popular programming languages:
-C++, Go, Java, JavaScript, Python and Rust.
-The 225 exercises were purposely selected to be the *hardest*
-that Exercism offered in those languages, to provide
-a strong coding challenge to LLMs.
-
-This benchmark measures the LLM's coding ability in popular languages,
-and whether it can
-write new code that integrates into existing code.
-The model also has to successfully apply all its changes to the source file without human intervention.
+[Aider's polyglot benchmark](https://aider.chat/2024/12/21/polyglot.html#the-polyglot-benchmark) tests LLMs on 225 challenging Exercism coding exercises across C++, Go, Java, JavaScript, Python, and Rust.
+It measures coding ability in multiple languages, integration with existing code, and successful application of changes without human intervention.
 
 <input type="text" id="editSearchInput" placeholder="Search..." style="width: 100%; max-width: 800px; margin: 10px auto; padding: 8px; display: block; border: 1px solid #ddd; border-radius: 4px;">
 