2025-06-01 02:05:00 +00:00 · 2024-06-03 11:23:22 -07:00 · 2024-06-03 11:23:22 -07:00 · 5d08c69ba0
commit 5d08c69ba0
parent a0025a42d9
1 changed files with 1 additions and 0 deletions
--- a/_posts/2024-06-02-main-swe-bench.md
+++ b/_posts/2024-06-02-main-swe-bench.md
@@ -238,6 +238,7 @@ The "aider agent" internally makes multiple "attempts" at solving the problem,
 but it picks and returns one single candidate solution.
 Only that one candidate solution is evaluated with the acceptance tests
 and contributes to the benchmark score.
+Thus it is a pass@1 result.

 This is contrast to a pass@N result for N>1, where N attempts are made
 and all N solutions are evaluated by the acceptance tests.