From 5b103329e653597efb5e96cc24864d72fe4176ad Mon Sep 17 00:00:00 2001
From: Paul Gauthier
Date: Tue, 7 Nov 2023 05:25:03 -0800
Subject: [PATCH] copy

---
 docs/benchmarks-1106.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/benchmarks-1106.md b/docs/benchmarks-1106.md
index e6cd3fdd4..f08759734 100644
--- a/docs/benchmarks-1106.md
+++ b/docs/benchmarks-1106.md
@@ -33,10 +33,10 @@
 and a test suite to evaluate whether the coder has correctly solved the problem.
 
 The benchmark gives aider two tries to complete the task:
 
-1. On the first try, aider gives GPT the stub code file to edit and the natural language instructions that describe the problem. This reflects how you code with aider. You add your source code files to the chat and ask for changes, which are automatically applied.
+1. On the first try, aider gives GPT the stub code file to edit and the natural language instructions that describe the problem. This reflects how you code with aider. You add your source code files to the chat and ask for changes, which are automatically applied.
 2. If the test suite fails after the first try, aider gives GPT the test error output and asks it to fix the code. Aider supports this sort of interaction using a command like `/run pytest` to run and share pytest results in the chat with GPT. You can `/run` whatever tests/linters/etc make sense for your language/framework/situation.
 
-## Benchmark results on the new "1106" models
+## Benchmark results
 
 ### gpt-4-1106-preview