This commit is contained in:
Paul Gauthier 2024-09-12 14:12:48 -07:00
parent 297b51b997
commit 71f3f3a22b
3 changed files with 28 additions and 2 deletions

View file

@ -1085,4 +1085,27 @@
date: 2024-09-11
versions: 0.56.1.dev
seconds_per_case: 7.6
total_cost: 0.0000
total_cost: 0.0000
- dirname: 2024-09-12-19-57-35--o1-mini-whole
test_cases: 133
model: o1-mini
edit_format: whole
commit_hash: 36fa773-dirty, 291b456
pass_rate_1: 49.6
pass_rate_2: 70.7
percent_cases_well_formed: 90.0
error_outputs: 0
num_malformed_responses: 0
num_with_malformed_responses: 0
user_asks: 17
lazy_comments: 0
syntax_errors: 0
indentation_errors: 0
exhausted_context_windows: 0
test_timeouts: 1
command: (not currently supported)
date: 2024-09-12
versions: 0.56.1.dev
seconds_per_case: 103.0
total_cost: 5.3725

View file

@ -87,7 +87,7 @@
indentation_errors: 0
exhausted_context_windows: 0
test_timeouts: 1
command: aider --model openai/o1-mini
command: (not currently supported)
date: 2024-09-12
versions: 0.56.1.dev
seconds_per_case: 103.0

View file

@ -27,6 +27,9 @@ efficiently edit the source code.
The whole format requires the o1-mini to return a fresh copy of the entire file,
increasing costs and latency.
OpenAI o1-mini does not support system prompts or temperature settings,
so aider was modified to complete this benchmark.
The released aider will not yet work with o1-mini.
{: .note }
> These are *preliminiary* benchmark results, which will be updated as