From fa097537e404d3a451898c0049d13e4c487858a0 Mon Sep 17 00:00:00 2001
From: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Date: Wed, 20 Dec 2023 17:58:41 -0600
Subject: [PATCH 1/2] Update unified-diffs.md

---
 docs/unified-diffs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/unified-diffs.md b/docs/unified-diffs.md
index be5b5b42e..00a95406f 100644
--- a/docs/unified-diffs.md
+++ b/docs/unified-diffs.md
@@ -1,5 +1,5 @@
 
-# Reducing GPT-4 Turbo laziness with unified diffs
+# Unified diffs make GPT-4 Turbo less lazy
 
 ![robot flowchart](../assets/benchmarks-udiff.svg)
 

From 5c4f8a4b764c77e12934019424d9993cca8cdc1c Mon Sep 17 00:00:00 2001
From: paul-gauthier <69695708+paul-gauthier@users.noreply.github.com>
Date: Wed, 20 Dec 2023 17:59:39 -0600
Subject: [PATCH 2/2] Update unified-diffs.md

---
 docs/unified-diffs.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/unified-diffs.md b/docs/unified-diffs.md
index 00a95406f..8ab14eddb 100644
--- a/docs/unified-diffs.md
+++ b/docs/unified-diffs.md
@@ -18,8 +18,7 @@ Aider's new "laziness" benchmark suite
 is designed to both provoke and quantify lazy coding.
 It consists of
 89 python refactoring tasks
-which tend to make GPT-4 Turbo lazy
-and write comments like
+which tend to make GPT-4 Turbo write lazy comments like
 "...include original method body...".
 
 This new laziness benchmark produced the following results with `gpt-4-1106-preview`: