mirror of https://github.com/Aider-AI/aider.git (synced 2025-06-01 02:05:00 +00:00)
copy
parent fbc3f0cef5
commit 87a964355b
2 changed files with 4 additions and 4 deletions
@@ -78,7 +78,7 @@
 
 - dirname: 2024-12-21-19-23-03--polyglot-o1-hard-diff
   test_cases: 224
-  model: o1-2024-12-17
+  model: o1-2024-12-17 (high)
   edit_format: diff
   commit_hash: a755079-dirty
   pass_rate_1: 23.7
@@ -2,18 +2,18 @@
 # Aider benchmark harness
 
 Aider uses benchmarks to quantitatively measure how well it works
-various LLMs.
+with various LLMs.
 This directory holds the harness and tools needed to run the benchmarking suite.
 
 ## Background
 
 The benchmark is based on the [Exercism](https://github.com/exercism/python) coding exercises.
 This
-benchmark evaluates how effectively aider and GPT can translate a
+benchmark evaluates how effectively aider and LLMs can translate a
 natural language coding request into executable code saved into
 files that pass unit tests.
 It provides an end-to-end evaluation of not just
-GPT's coding ability, but also its capacity to *edit existing code*
+the LLM's coding ability, but also its capacity to *edit existing code*
 and *format those code edits* so that aider can save the
 edits to the local source files.
 