From e5c831d1b689db187c923056eeb9c357e8c73456 Mon Sep 17 00:00:00 2001
From: Paul Gauthier
Date: Sat, 1 Jun 2024 19:01:47 -0700
Subject: [PATCH] copy

---
 _posts/2024-05-31-both-swe-bench.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/_posts/2024-05-31-both-swe-bench.md b/_posts/2024-05-31-both-swe-bench.md
index 7a3d95385..e58bae40c 100644
--- a/_posts/2024-05-31-both-swe-bench.md
+++ b/_posts/2024-05-31-both-swe-bench.md
@@ -22,8 +22,8 @@ This result on the main SWE Bench is in addition to
 [![SWE Bench results](/assets/swe_bench.svg)](https://aider.chat/assets/swe_bench.svg)
 
 Aider was benchmarked on the same
-[randomly selected 570](https://github.com/CognitionAI/devin-swebench-results/tree/main/output_diffs)
-of the 2,294 SWE Bench problems that were used in the
+[randomly selected 570 problems](https://github.com/CognitionAI/devin-swebench-results/tree/main/output_diffs)
+from SWE Bench that were used in the
 [Devin evaluation](https://www.cognition.ai/post/swe-bench-technical-report).
 Please see the [references](#references) for more details on the data presented in this chart.
 