2025-06-10 22:55:00 +00:00 · 2024-09-12 15:41:02 -07:00 · 2024-09-12 15:41:02 -07:00 · 72f52bdef0
commit 72f52bdef0
parent c00ac80909
1 changed files with 13 additions and 10 deletions
--- a/aider/website/_posts/2024-09-12-o1.md
+++ b/aider/website/_posts/2024-09-12-o1.md
@@ -9,6 +9,17 @@ nav_exclude: true
 # Benchmark results for OpenAI o1-mini
 <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
 {% assign edit_sorted = site.data.o1_results | sort: 'pass_rate_2' | reverse %}
 {% include leaderboard_graph.html
  chart_id="editChart" 
  data=edit_sorted 
  row_prefix="edit-row" 
  pass_rate_key="pass_rate_2"
 %}
 OpenAI o1-mini is priced similarly to GPT-4o and Claude 3.5 Sonnet,
 but scored below those models.
@@ -24,10 +35,10 @@ efficiently edit the source code, saving time and token costs.
 The o1-mini model had trouble conforming to both the whole and diff edit formats.
 Aider is extremely permissive and tries hard to accept anything close
 to the correct formats.
 It's possible that o1-mini would get better scores if aider prompted with
 more examples or was adapted to parse o1-mini's favorite ways to mangle
 the response formats.
 Over time it may be possible to better harness o1-mini's capabilities through
 different prompting and editing formats.
@@ -49,6 +60,7 @@ aider --model o1-preview
 > These are *preliminiary* benchmark results, which will be updated as
 > additional benchmark runs complete and rate limits open up.
 <table style="width: 100%; max-width: 800px; margin: auto; border-collapse: collapse; box-shadow: 0 2px 4px rgba(0,0,0,0.1); font-size: 14px;">
  <thead style="background-color: #f2f2f2;">
    <tr>
@@ -60,7 +72,6 @@ aider --model o1-preview
    </tr>
  </thead>
  <tbody>
    {% assign edit_sorted = site.data.o1_results | sort: 'pass_rate_2' | reverse %}
    {% for row in edit_sorted %}
      <tr style="border-bottom: 1px solid #ddd;">
        <td style="padding: 8px;">{{ row.model }}</td>
@@ -73,14 +84,6 @@ aider --model o1-preview
  </tbody>
 </table>
 <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
 {% include leaderboard_graph.html
  chart_id="editChart" 
  data=edit_sorted 
  row_prefix="edit-row" 
  pass_rate_key="pass_rate_2"
 %}
 <style>
  tr.selected {