From e6123624d87cd436108ef6442deceb5a8c6e4c59 Mon Sep 17 00:00:00 2001
From: Paul Gauthier
Date: Fri, 30 Jun 2023 13:44:58 -0700
Subject: [PATCH] copy

---
 docs/benchmarks.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/benchmarks.md b/docs/benchmarks.md
index f76c7dd3c..2d4fd235f 100644
--- a/docs/benchmarks.md
+++ b/docs/benchmarks.md
@@ -70,7 +70,7 @@ The goal is to read the instructions, implement the provided functions/class ske
 and pass all the unit tests. The benchmark measures what percentage of
 the 133 exercises are completed successfully, with all the associated unit tests passing.
 
-To run the test, aider sends GPT the Exercism instructions followed by:
+To complete an exercise, aider sends GPT the Exercism instructions followed by:
 
 ```
 Use the above instructions to modify the supplied files: {file_list}
@@ -90,7 +90,7 @@ Fix the code in {file_list} to resolve the errors.
 
 GPT gets this second chance to fix the implementation because
 many of the unit tests check for specifics that are not
-clearly called out in the instructions.
+called out in the instructions.
 For example, many tests want to see
 [specific phrases in ValueErrors](https://github.com/exercism/python/blob/f6caa44faa8fb7d0de9a54ddb5c6183e027429c6/exercises/practice/queen-attack/queen_attack_test.py#L31)
 raised by
@@ -190,7 +190,7 @@ format requests original/updated edits to be returned using the function call AP
 }
 ```
 
-## ChatGPT function calls
+## GPT-3.5 hallucinates function calls?
 
 GPT-3.5 was very prone to ignoring the JSON Schema that specified valid functions,
 and would often return a completely invalid `function_call` fragment with `name="python"`.