Paul Gauthier 2024-09-12 15:17:32 -07:00
parent 84b1c1031a
commit 291d3509eb
2 changed files with 9 additions and 9 deletions

@@ -9,9 +9,9 @@ nav_exclude: true
# Benchmark results for OpenAI o1-mini
-OpenAI o1-mini is priced similarly to GPT-4o and Claude 3.5 Sonnet.
-o1-mini scored below those models
-when using the simple "whole" editing format.
+OpenAI o1-mini is priced similarly to GPT-4o and Claude 3.5 Sonnet,
+but scored below those models
+when using the "whole" editing format.
It was close enough to GPT-4o to be within the margin of error.
The o1-mini model had trouble following the very simple whole editing format.
@@ -21,8 +21,8 @@ the response format.
Note that o1-mini's "whole" score is compared against GPT-4o and Sonnet
"diff" results.
-Using diff is more challenging for GPT-4o and Sonnet,
-but it allows them to return search/replace blocks to
+Using diff is more challenging,
+but allows the model to return search/replace blocks to
efficiently edit the source code.
The whole format requires the o1-mini to return a fresh copy of the entire file,
increasing costs and latency.
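
The trade-off between the two editing formats can be illustrated with a minimal sketch. This is not the benchmarked tool's actual implementation: the function names `apply_whole` and `apply_search_replace`, the sample file, and the use of character counts as a stand-in for response size are all assumptions made for illustration.

```python
# Hypothetical helpers; names and behavior are illustrative assumptions only.

def apply_whole(original: str, response: str) -> str:
    """'whole'-style edit: the response is the complete new file contents."""
    return response


def apply_search_replace(original: str, search: str, replace: str) -> str:
    """'diff'-style edit: swap one exact occurrence of `search` for `replace`."""
    if original.count(search) != 1:
        raise ValueError("search text must match exactly once")
    return original.replace(search, replace, 1)


source = (
    "def greet():\n"
    "    print('hello')\n"
    "\n"
    "def farewell():\n"
    "    print('bye')\n"
)
search = "    print('hello')\n"
replace = "    print('hello, world')\n"

# A 'whole' response must restate every line, including the unchanged ones.
whole_response = source.replace(search, replace)

edited_by_diff = apply_search_replace(source, search, replace)
edited_by_whole = apply_whole(source, whole_response)

# Rough proxy for how much text the model has to emit in each format.
diff_payload = len(search) + len(replace)
whole_payload = len(whole_response)
```

Both paths produce the same edited file, but the search/replace payload covers only the changed lines, while the whole-file payload grows with the size of the file. The exact-match requirement in `apply_search_replace` also hints at why the diff format is harder to follow: the search text has to reproduce the existing source exactly.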