mirror of https://github.com/Aider-AI/aider.git synced 2025-06-01 02:05:00 +00:00

Paul Gauthier (aider) cc1a984c7e feat: Replace clear selection button with select all checkbox

2025-04-13 08:35:37 -07:00


---
highlight_image: /assets/leaderboard.jpg
nav_order: 950
description: Quantitative benchmarks of LLM code editing skill.
has_children: true
---

Aider LLM Leaderboards

Aider excels with LLMs skilled at editing code, not just writing it. These benchmarks evaluate an LLM's ability to follow instructions and edit code successfully without human intervention. Aider works best with high-scoring models, though it can connect to almost any LLM.

Polyglot leaderboard

Aider's polyglot benchmark tests LLMs on 225 challenging Exercism coding exercises across C++, Go, Java, JavaScript, Python, and Rust.

{% assign max_cost = 0 %}
{% for row in site.data.polyglot_leaderboard %}
{% if row.total_cost > max_cost %}
{% assign max_cost = row.total_cost %}
{% endif %}
{% endfor %}
{% if max_cost == 0 %}{% assign max_cost = 1 %}{% endif %}
{% assign edit_sorted = site.data.polyglot_leaderboard | sort: 'pass_rate_2' | reverse %}
{% for row in edit_sorted %}
{% comment %} Add loop index for unique IDs {% endcomment %}
{% assign row_index = forloop.index0 %}
{% endfor %}

	Model	Percent correct	Cost (log scale)	Command
	{{ row.model }}	{{ row.pass_rate_2 }}%	{% if row.total_cost > 0 %} {% endif %} {% assign rounded_cost = row.total_cost \| times: 1.0 \| round: 2 %} {% if row.total_cost == 0 or rounded_cost == 0.00 %}?{% else %}${{ rounded_cost }}{% endif %}	`{{ row.command }}`
{% for pair in row %} {% if pair[1] != "" and pair[1] != nil %} {{ pair[0] \| replace: '_', ' ' \| capitalize }}: {% if pair[0] == 'command' %}`{{ pair[1] }}`{% else %}{{ pair[1] }}{% endif %} {% endif %} {% endfor %}
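The Liquid logic on this page (find the largest `total_cost`, sort rows by `pass_rate_2` in descending order, and print `?` when a cost is zero or rounds to zero) can be sketched in Python for readers who want to reproduce the table outside Jekyll. This is only an illustrative sketch: the function names and the sample rows are hypothetical, and only the field names `model`, `pass_rate_2`, and `total_cost` come from the template. Python's `round` may also differ from Liquid's `round` filter on exact ties.

```python
def max_total_cost(rows):
    """Largest total_cost across rows, falling back to 1 when every cost is 0."""
    max_cost = 0
    for row in rows:
        if row["total_cost"] > max_cost:
            max_cost = row["total_cost"]
    # Mirrors the template's guard: a zero maximum is replaced with 1.
    return 1 if max_cost == 0 else max_cost


def sort_by_pass_rate(rows):
    """Ascending sort on pass_rate_2, then reversed, like `sort | reverse`."""
    return sorted(rows, key=lambda row: row["pass_rate_2"])[::-1]


def format_cost(total_cost):
    """Render the cost cell: '?' if the cost is 0 or rounds to 0.00."""
    rounded = round(total_cost * 1.0, 2)
    if total_cost == 0 or rounded == 0.00:
        return "?"
    return f"${rounded}"


# Hypothetical sample rows; these are NOT real leaderboard results.
sample_rows = [
    {"model": "model-a", "pass_rate_2": 40.0, "total_cost": 0},
    {"model": "model-b", "pass_rate_2": 72.5, "total_cost": 12.5},
    {"model": "model-c", "pass_rate_2": 55.0, "total_cost": 0.004},
]

if __name__ == "__main__":
    for row in sort_by_pass_rate(sample_rows):
        print(row["model"], f'{row["pass_rate_2"]}%', format_cost(row["total_cost"]))
```

With the hypothetical rows above, the highest-scoring model is listed first, and both the zero-cost row and the row whose cost rounds to 0.00 show `?` in the cost column.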

By Paul Gauthier, last updated
