From 7028a533f1837aff0d7fa35dec3c8b0cdac7858a Mon Sep 17 00:00:00 2001
From: Paul Gauthier
Date: Tue, 19 Dec 2023 15:30:15 -0800
Subject: [PATCH] copy

---
 docs/unified-diffs.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/unified-diffs.md b/docs/unified-diffs.md
index a44cded33..b4a2fb27b 100644
--- a/docs/unified-diffs.md
+++ b/docs/unified-diffs.md
@@ -3,7 +3,6 @@
 
 ![robot flowchart](../assets/benchmarks-udiff.svg)
 
-
 Aider now asks GPT-4 Turbo to use
 [unified diffs](#choose-a-familiar-editing-format)
 to edit your code.
@@ -29,9 +28,10 @@ This new laziness benchmark produced the following results with `gpt-4-1106-prev
 - **Aider's new unified diff edit format raised the score to 61%**. Using this format reduced laziness by 3X, with GPT-4 Turbo only using lazy comments on 4 of the tasks.
 - **It's worse to add a prompt that the user is blind, has no hands, will tip $2000 and fears truncated code trauma.**
 
-The widely circulated "blind with no hands" type of folk remedies
-performed worse on the benchmark when added to the system prompt.
-The benchmark scores dropped
+These widely circulated "emotional appeal" folk remedies
+produced worse benchmark scores.
+Adding *all* of these claims to the system prompt
+resulted in worse benchmark scores
 for the baseline SEARCH/REPLACE and new unified diff editing formats.
 
 These prompts did somewhat reduce the amount of laziness when used with the SEARCH/REPLACE edit format,