From a6f39f63a3e3a3b4dfa4fb12198ba146a227c05d Mon Sep 17 00:00:00 2001
From: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Date: Tue, 7 Nov 2023 11:38:16 -0800
Subject: [PATCH] Update benchmarks-1106.md

---
 docs/benchmarks-1106.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/benchmarks-1106.md b/docs/benchmarks-1106.md
index ef75fb35c..2af43bde6 100644
--- a/docs/benchmarks-1106.md
+++ b/docs/benchmarks-1106.md
@@ -47,7 +47,7 @@ This is the edit format that aider uses by default with gpt-4.
 - **It seems better at producing correct code on the first try**. It gets ~57% of the coding exercises correct, without needing to see errors from the test suite. Previous models only get 46-47% of the exercises correct on the first try.
 - The new model seems to perform similar (66%) to the old models (63-64%) after being given a second chance to correct bugs by reviewing test suite error output.
 
-**These results are preliminiary.**
+**These are preliminary results.**
 OpenAI is enforcing very low rate limits
 on the new GPT-4 model.
 The limits are so low, that I have only been able to attempt
@@ -71,4 +71,4 @@ The comments below only focus on comparing the `whole` edit format results:
 
 ### Updates
 
-I will update the results on this page as quickly my rate limit allows.
+I will update the results on this page as quickly as my rate limit allows.