diff --git a/docs/unified-diffs.md b/docs/unified-diffs.md
index 3bd56835d..a6ab10235 100644
--- a/docs/unified-diffs.md
+++ b/docs/unified-diffs.md
@@ -28,9 +28,9 @@ This new laziness benchmark produced the following results with `gpt-4-1106-prev
 - **Aider's new unified diff edit format raised the score to 61%**. Using this format reduced laziness by 3X, with GPT-4 Turbo only using lazy comments on 4 of the tasks.
 - **It's worse to add a prompt that says the user is blind, has no hands, will tip $2000 and fears truncated code trauma.**
-These widely circulated "emotional appeal" folk remedies
+Widely circulated "emotional appeal" folk remedies
 produced worse benchmark scores.
-Adding *all* of these claims to the system prompt
+Adding *all* of the various emotional statements to the system prompt
 resulted in worse benchmark scores for the baseline SEARCH/REPLACE and new unified diff editing formats.
 These prompts did somewhat reduce the amount of laziness when used