---
highlight_image: /assets/leaderboard.jpg
nav_order: 950
description: Quantitative benchmarks of LLM code editing skill.
has_children: true
---

# Aider LLM Leaderboards

Aider works best with LLMs that are good at *editing* code, not just good at writing code. To evaluate an LLM's editing skill, aider uses benchmarks that assess a model's ability to consistently follow the system prompt to successfully edit code.

The leaderboards report the results from a number of popular LLMs. While [aider can connect to almost any LLM](/docs/llms.html), it works best with models that score well on the benchmarks.

## Polyglot leaderboard

[Aider's polyglot benchmark](https://aider.chat/2024/12/21/polyglot.html#the-polyglot-benchmark) asks the LLM to edit source files to complete 225 coding exercises from Exercism. It contains exercises in many popular programming languages: C++, Go, Java, JavaScript, Python and Rust. The 225 exercises were purposely selected to be the *hardest* that Exercism offered in those languages, to provide a strong coding challenge to LLMs.

This benchmark measures the LLM's coding ability in popular languages, and whether it can write new code that integrates into existing code. The model also has to successfully apply all its changes to the source file without human intervention.
Model | Percent correct | Percent using correct edit format | Cost | Command | Edit format |
---|---|---|---|---|---|
{{ row.model }} | {{ row.pass_rate_2 }}% | {{ row.percent_cases_well_formed }}% | {% assign cost_percent = row.total_cost | times: 100.0 | divided_by: max_cost %}{% if row.total_cost == 0 %}?{% else %}${{ row.total_cost | times: 1.0 | round: 2 }}{% endif %} | {{ row.command }} | {{ row.edit_format }} |
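The cost cell in the table above is built by the Liquid template: a zero recorded cost is shown as `?` (unknown), otherwise the dollar amount is rounded to two places, and a percent-of-the-most-expensive-run is computed for display scaling. A minimal Python sketch of that same arithmetic, assuming `total_cost` and `max_cost` carry the values the template sees (the function name is illustrative, not part of the site's code):

```python
def cost_cell(total_cost: float, max_cost: float) -> tuple[str, float]:
    """Mirror the leaderboard table's cost column logic."""
    # Percent of the most expensive run, used by the page for bar scaling.
    cost_percent = total_cost * 100.0 / max_cost if max_cost else 0.0
    # A recorded cost of zero means the cost is unknown, shown as "?".
    cell = "?" if total_cost == 0 else f"${round(total_cost, 2)}"
    return cell, cost_percent


# For example, a $2.50 run against a $10 maximum renders as "$2.5" at 25%.
print(cost_cell(2.5, 10.0))
print(cost_cell(0.0, 10.0))
```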
By Paul Gauthier, last updated April 12, 2025.