mirror of https://github.com/Aider-AI/aider.git (synced 2025-06-01 18:25:00 +00:00)
parent 26405457c5
commit 92a09f3deb
1 changed file with 4 additions and 4 deletions
@@ -105,7 +105,7 @@ Only use standard python libraries, don't suggest installing any packages.
 
 Aider updates the implementation file based on GPT's reply and runs
 the unit tests. If all tests pass, the exercise is considered
-complete. If some tests fail, Aider sends GPT a second message with
+complete. If some tests fail, aider sends GPT a second message with
 the test error output. It only sends the first 50 lines of test errors
 to try and avoid exceeding the context window of the smaller models. Aider
 also includes this final instruction:
@@ -119,7 +119,7 @@ Fix the code in <implementation file> to resolve the errors.
 Requiring GPT to fix its first implementation in response to test failures
 is another way in which this benchmark stresses code editing skill.
 This second chance is also important because it
-gives GPT a chance to adjust if the
+gives GPT the opportunity to adjust if the
 instructions were imprecise with respect to the
 specific requirements of the unit tests.
 Many of the exercises have multiple paragraphs of instructions,
@@ -135,14 +135,14 @@ the intermediate performance after the first coding attempt,
 without the benefit of the second try.
 
 It's worth noting that GPT never gets to see the source code of the
-unit tests during the benchmarking. It only sees the error output from
+unit tests during the benchmark. It only sees the error output from
 failed tests. Of course, all of this code was probably part of its
 original training data!
 
 In summary, passing an exercise means GPT was able to:
 
 - Write the required code (possibly after reviewing test error output),
-- Correctly package all of the code edits into the edit format so that Aider can process and save it to the implementation file.
+- Correctly package all of the code edits into the edit format so that aider can process and save it to the implementation file.
 
 Conversely, failing an exercise only requires a breakdown in one of
 those steps. In practice, GPT fails at different steps in different
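
The two-attempt flow described in the changed text can be sketched roughly as below. This is a minimal illustration, not aider's actual benchmark code: `truncate_errors`, `build_retry_message`, `run_exercise`, and the `ask_model` / `run_tests` callables are hypothetical names, and the exact layout of the second message is an assumption. Only the 50-line cap and the wording of the final instruction come from the text.

```python
from typing import Callable, Tuple

# The text says only the first 50 lines of test errors are sent.
MAX_ERROR_LINES = 50


def truncate_errors(test_output: str, max_lines: int = MAX_ERROR_LINES) -> str:
    """Keep only the first `max_lines` lines of the test error output."""
    return "\n".join(test_output.splitlines()[:max_lines])


def build_retry_message(test_output: str, implementation_file: str) -> str:
    """Assumed layout: truncated errors followed by the final instruction."""
    errors = truncate_errors(test_output)
    return f"{errors}\n\nFix the code in {implementation_file} to resolve the errors."


def run_exercise(
    first_prompt: str,
    implementation_file: str,
    ask_model: Callable[[str], None],          # hypothetical: sends a message, applies edits
    run_tests: Callable[[], Tuple[bool, str]],  # hypothetical: returns (passed, output)
) -> bool:
    """Return True if the tests pass on the first or the second attempt."""
    ask_model(first_prompt)
    passed, output = run_tests()
    if passed:
        return True
    # Second chance: the model sees only the truncated error output,
    # never the source code of the unit tests.
    ask_model(build_retry_message(output, implementation_file))
    passed, _ = run_tests()
    return passed
```

Passing the model and test runner in as callables keeps the sketch independent of any particular API client or test framework.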