Paul Gauthier 2023-12-19 15:30:15 -08:00
parent d36c18f9dc
commit 7028a533f1

@@ -3,7 +3,6 @@
![robot flowchart](../assets/benchmarks-udiff.svg)
Aider now asks GPT-4 Turbo to use
[unified diffs](#choose-a-familiar-editing-format)
to edit your code.
@@ -29,9 +28,10 @@ This new laziness benchmark produced the following results with `gpt-4-1106-prev
- **Aider's new unified diff edit format raised the score to 61%**. Using this format reduced laziness by 3X, with GPT-4 Turbo only using lazy comments on 4 of the tasks.
- **It's worse to add a prompt that the user is blind, has no hands, will tip $2000 and fears truncated code trauma.**
-The widely circulated "blind with no hands" type of folk remedies
-performed worse on the benchmark when added to the system prompt.
-The benchmark scores dropped
+These widely circulated "emotional appeal" folk remedies
+produced worse benchmark scores.
+Adding *all* of these claims to the system prompt
+resulted in worse benchmark scores
for the baseline SEARCH/REPLACE and new unified diff editing formats.
These prompts did somewhat reduce the amount of laziness when used
with the SEARCH/REPLACE edit format,