From 87a964355b13b6fde6eb9ff2a75fcd8d17b62ae4 Mon Sep 17 00:00:00 2001
From: Paul Gauthier
Date: Mon, 23 Dec 2024 08:00:25 -0500
Subject: [PATCH] copy

---
 aider/website/_data/polyglot_leaderboard.yml | 2 +-
 benchmark/README.md                          | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/aider/website/_data/polyglot_leaderboard.yml b/aider/website/_data/polyglot_leaderboard.yml
index c2024d1dc..9badd7a85 100644
--- a/aider/website/_data/polyglot_leaderboard.yml
+++ b/aider/website/_data/polyglot_leaderboard.yml
@@ -78,7 +78,7 @@
 - dirname: 2024-12-21-19-23-03--polyglot-o1-hard-diff
   test_cases: 224
-  model: o1-2024-12-17
+  model: o1-2024-12-17 (high)
   edit_format: diff
   commit_hash: a755079-dirty
   pass_rate_1: 23.7
diff --git a/benchmark/README.md b/benchmark/README.md
index 6b20c3797..b9e1b1e43 100644
--- a/benchmark/README.md
+++ b/benchmark/README.md
@@ -2,18 +2,18 @@
 # Aider benchmark harness

 Aider uses benchmarks to quantitatively measure how well it works
-various LLMs.
+with various LLMs.

 This directory holds the harness and tools needed to run the benchmarking
 suite.

 ## Background

 The benchmark is based on the
 [Exercism](https://github.com/exercism/python) coding exercises. This
-benchmark evaluates how effectively aider and GPT can translate a
+benchmark evaluates how effectively aider and LLMs can translate a
 natural language coding request into executable code saved into
 files that pass unit tests. It provides an end-to-end evaluation of not just
-GPT's coding ability, but also its capacity to *edit existing code*
+the LLM's coding ability, but also its capacity to *edit existing code*
 and *format those code edits* so that aider can save the edits to the
 local source files.