diff --git a/_posts/2024-05-22-swe-bench-lite.md b/_posts/2024-05-22-swe-bench-lite.md
index 3bf0a08d2..dac61a3d4 100644
--- a/_posts/2024-05-22-swe-bench-lite.md
+++ b/_posts/2024-05-22-swe-bench-lite.md
@@ -67,6 +67,8 @@ The held out "acceptance tests" were *only* used
 after benchmarking to compute statistics on which problems aider
 correctly resolved.
 
+The [full harness to run aider on SWE Bench Lite is available on GitHub](https://github.com/paul-gauthier/aider-swe-bench).
+
 The benchmarking process was similar to how a developer might use aider to
 resolve a GitHub issue: