Paul Gauthier 2024-05-31 13:32:20 -07:00
parent 2c6b472946
commit 6a2d7e08c2
5 changed files with 207 additions and 173 deletions

@ -16,9 +16,9 @@ from Amazon Q Developer Agent.
The best result reported elsewhere seems to be The best result reported elsewhere seems to be
[13.9% from Devin](https://www.cognition.ai/post/swe-bench-technical-report). [13.9% from Devin](https://www.cognition.ai/post/swe-bench-technical-report).
This is in addition to This result on the main SWE Bench is in addition to
[aider's SOTA result on the easier SWE Bench Lite](https://aider.chat/2024/05/22/swe-bench-lite.html) [aider's SOTA result on the easier SWE Bench Lite](https://aider.chat/2024/05/22/swe-bench-lite.html)
that was reported last week. that was reported recently.
[![SWE Bench results](/assets/swe_bench.svg)](https://aider.chat/assets/swe_bench.svg) [![SWE Bench results](/assets/swe_bench.svg)](https://aider.chat/assets/swe_bench.svg)
@ -57,11 +57,10 @@ with the problem statement
submitted as the opening chat message from "the user". submitted as the opening chat message from "the user".
- After that aider ran as normal, except all of aider's - After that aider ran as normal, except all of aider's
suggestions were always accepted without user approval. suggestions were always accepted without user approval.
- A simple harness was used to retry the SWE Bench problem if aider produced code that wasn't *plausibly correct*. - A [simple harness](https://github.com/paul-gauthier/aider-swe-bench#the-aider-agent) was used to retry the SWE Bench problem if aider produced code that wasn't *plausibly correct*.
Plausibly correct means that aider reported that it had successfully edited the repo Plausibly correct means that aider reported that it had successfully edited the repo
without causing syntax errors or breaking any *pre-existing* tests. without causing syntax errors or breaking any *pre-existing* tests.
- If the solution from aider with GPT-4o isn't plausible, the harness launches aider to try again from scratch, - If the solution from aider with GPT-4o wasn't plausible, the harness launched aider to try again from scratch, this time using Claude 3 Opus.
this time using Claude 3 Opus.
- If no plausible solution is found after those two tries, the harness picks the "most plausible" solution with the fewest edit/lint/test problems. - If no plausible solution is found after those two tries, the harness picks the "most plausible" solution with the fewest edit/lint/test problems.
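The retry logic described in the bullets above can be sketched roughly as follows. This is only an illustration, not the actual harness code: the `Attempt` class, its error counters, and the `run_attempt` callback are hypothetical names introduced here, and breaking ties in favor of the earlier attempt is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Attempt:
    """Hypothetical record of one aider run on one problem."""
    model: str
    edit_errors: int  # outstanding edit errors reported by aider
    lint_errors: int  # outstanding lint/syntax errors
    test_errors: int  # pre-existing tests that no longer pass

    @property
    def problems(self) -> int:
        return self.edit_errors + self.lint_errors + self.test_errors

    @property
    def plausible(self) -> bool:
        # "Plausibly correct": no outstanding edit, lint or test problems.
        return self.problems == 0


def pick_solution(models: Sequence[str],
                  run_attempt: Callable[[str], Attempt]) -> Attempt:
    """Try each model in order and stop at the first plausible attempt.

    If no attempt is plausible, fall back to the "most plausible" one,
    meaning the attempt with the fewest problems. min() keeps the
    earliest attempt on ties, which is an assumption of this sketch.
    """
    attempts: List[Attempt] = []
    for model in models:
        attempt = run_attempt(model)  # each try starts from scratch
        if attempt.plausible:
            return attempt
        attempts.append(attempt)
    return min(attempts, key=lambda a: a.problems)
```

For the run described here, `models` would be something like `["gpt-4o", "opus"]`, giving at most two attempts per problem.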
It's important to be clear that It's important to be clear that
@ -73,20 +72,22 @@ correctly resolved.
This is the same methodology This is the same methodology
that was used for [aider's recent SOTA result on SWE Bench Lite](https://aider.chat/2024/05/22/swe-bench-lite.html). that was used for [aider's recent SOTA result on SWE Bench Lite](https://aider.chat/2024/05/22/swe-bench-lite.html).
The only difference is that for this result Aider alternated between GPT-4o and Opus for up to 6 total attempts
at most two tries were attempted instead of six, on the Lite benchmark.
due to the increased token costs involved in this benchmark. Due to the increased token costs involved in running
The SWE Bench problems are more difficult and involve edits to the main SWE Bench benchmark, aider was limited to 2 total attempts.
Problems from the main SWE Bench dataset
are more difficult and involve edits to
more than one source file, more than one source file,
which increased the cost of solving each problem. which increased the token costs of solving each problem.
Further, aider was benchmarked on 570 SWE Bench problems, Further, aider was benchmarked on 570 SWE Bench problems
versus only 300 Lite problems, versus only 300 Lite problems,
adding another factor of ~two to the costs. adding another factor of ~two to the costs.
For a detailed discussion of the methodology, please see the For a detailed discussion of the methodology, please see the
[article about aider's SWE Bench Lite results](https://aider.chat/2024/05/22/swe-bench-lite.html). [article about aider's SWE Bench Lite results](https://aider.chat/2024/05/22/swe-bench-lite.html).
The [aider SWE Bench repository on GitHub](https://github.com/paul-gauthier/aider-swe-bench) also contains The [aider SWE Bench repository on GitHub](https://github.com/paul-gauthier/aider-swe-bench) also contains
the harness and reporting code used for the benchmarks. the harness and analysis code used for the benchmarks.
The benchmarking process was similar to how a developer might use aider to The benchmarking process was similar to how a developer might use aider to
resolve a GitHub issue: resolve a GitHub issue:
@ -103,8 +104,7 @@ so it's always easy to revert AI changes that don't pan out.
## Aider with GPT-4o alone was SOTA ## Aider with GPT-4o alone was SOTA
Running the benchmark harness Using aider with GPT-4o to make a single attempt at solving each problem
only using aider with GPT-4o to find plausible solutions with a single attempt
achieved a score of 17.0%. achieved a score of 17.0%.
This was itself a state-of-the-art result, before being surpassed by the main This was itself a state-of-the-art result, before being surpassed by the main
result being reported here result being reported here
@ -112,13 +112,13 @@ that used aider with both GPT-4o & Opus.
## Aider with GPT-4o & Opus ## Aider with GPT-4o & Opus
The benchmark harness started by running aider with GPT-4o once to try The benchmark harness ran aider with GPT-4o to try
and solve the problem. If and solve the problem. If
no plausible solution was found, it then used aider with Opus no plausible solution was found, it ran aider with Opus
once to try and solve the problem. to try and solve the problem.
The table below breaks down the proposed solutions that The table below breaks down the proposed solutions that
were found for the 570 problems. were found from each attempt for the 570 problems.
A proposed solution is either: A proposed solution is either:
- A plausible solution where - A plausible solution where
@ -137,22 +137,55 @@ verified as correctly resolving their issue.
## Non-plausible but correct solutions? ## Non-plausible but correct solutions?
It's worth noting that the first row of the table above A solution doesn't have to be plausible in order to correctly resolve the issue.
only scored 15.3% on the benchmark, Recall that plausible is simply defined as aider
which differs from the 17.0% result reported above for aider with just GPT-4o. reporting that it successfully edited files,
This is because making additional attempts is not guaranteed to repaired and resolved any linting errors
monotonically increase the number of resolved issues. and repaired tests so that they all passed.
Later attempts may propose solutions which But there are lots of reasons why aider might fail to do those things
seem "more plausible" than prior attempts, and yet the solution is still a correct solution that will pass
but which are actually worse solutions. acceptance testing:
Luckily the later attempts usually provide a net increase in the overall
- There could be pre-existing failing tests in the repo,
before aider even starts working on the SWE Bench problem.
Aider may not resolve such issues, and yet they may turn out not to be
relevant to the acceptance testing.
The SWE Bench acceptance testing just confirms that tests pass or fail
in the same pattern as the "gold patch" developed by a human to solve the
problem.
Some tests may still fail, and that's ok as long as they fail for the gold
patch too.
- There could be pre-existing linting problems in the repo,
which are in code paths that are irrelevant to the problem being solved
and to acceptance testing.
If aider is unable to resolve them, the solution may still be valid
and pass acceptance testing.
- Aider may report editing errors because it doesn't think it was
able to successfully apply all the edits the LLM specified.
In this scenario, the LLM has specified edits in an invalid
format that doesn't comply with its
system prompt instructions.
So it may be that the LLM was asking for redundant or otherwise
irrelevant edits, such that outstanding edit errors are actually not fatal.
This is why the first row in the table above
shows GPT-4o accounting for 15.3% of the benchmark score,
which is different than the 17.0% result reported earlier
for aider with just GPT-4o.
The second attempt from Opus may propose solutions which
are "more plausible" than some of GPT-4o's non-plausible solutions,
but which are actually incorrect solutions.
These more plausible but incorrect solutions can
eclipse the earlier non-plausible correct
solution.
Luckily the full set of later attempts usually provides a net increase in the overall
number of resolved solutions, as is the case here. number of resolved solutions, as is the case here.
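A toy example may make the "eclipse" effect easier to see. The four problems below are made up purely for illustration and are not taken from the benchmark results. The `Solution` and `Problem` types and the `submitted` helper are hypothetical names, and the handling of the case where neither attempt is plausible is simplified.

```python
from typing import NamedTuple, Optional


class Solution(NamedTuple):
    plausible: bool
    resolved: bool


class Problem(NamedTuple):
    gpt4o: Solution
    opus: Optional[Solution]  # None when the harness stopped after GPT-4o


def submitted(problem: Problem) -> Solution:
    """Return the solution the harness would submit for this problem."""
    if problem.gpt4o.plausible or problem.opus is None:
        return problem.gpt4o
    if problem.opus.plausible:
        return problem.opus
    # Neither attempt is plausible. Simplification: keep the first attempt.
    # The harness described above picks the one with the fewest problems.
    return problem.gpt4o


# Made-up data, chosen only to show the effect.
problems = [
    Problem(Solution(True, True), None),                     # plausible and correct
    Problem(Solution(False, True), Solution(True, False)),   # correct, but eclipsed
    Problem(Solution(False, False), Solution(True, True)),   # rescued by Opus
    Problem(Solution(False, False), Solution(True, True)),   # rescued by Opus
]

gpt4o_alone = sum(p.gpt4o.resolved for p in problems)
combined = sum(submitted(p).resolved for p in problems)
gpt4o_credit = sum(
    submitted(p) is p.gpt4o and p.gpt4o.resolved for p in problems
)

print(gpt4o_alone, gpt4o_credit, combined)  # 2 1 3
```

In this toy data GPT-4o alone resolves 2 problems, but it is credited with only 1 in the combined run because its non-plausible correct solution was replaced by a plausible incorrect one. The combined total is still higher, at 3.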
This table breaks down the plausibility of each solution proposed by The table below breaks down the plausibility of each solution proposed by
aider with GPT-4o and with Opus, as well as whether it was actually aider with GPT-4o and with Opus, and indicates which were actually
a correct solution. correct solutions.
|Row|GPT-4o<br>solution<br>plausible?|GPT-4o<br>solution<br>resolved issue?|Opus<br>solution<br>plausible?|Opus<br>solution<br>resolved issue?|Count| |Row|Aider<br>w/GPT-4o<br>solution<br>plausible?|Aider<br>w/GPT-4o<br>solution<br>resolved<br>issue?|Aider<br>w/Opus<br>solution<br>plausible?|Aider<br>w/Opus<br>solution<br>resolved<br>issue?|Count|
|---:|--:|--:|--:|--:|--:| |---:|--:|--:|--:|--:|--:|
| 1 | plausible | resolved | n/a | n/a | 73 | | 1 | plausible | resolved | n/a | n/a | 73 |
| 2 | plausible | not resolved | n/a | n/a | 181 | | 2 | plausible | not resolved | n/a | n/a | 181 |
@ -173,16 +206,12 @@ at solving these problems, because the harness stopped once a
plausible solution was found. plausible solution was found.
The remaining rows consider cases where aider with GPT-4o The remaining rows consider cases where aider with GPT-4o
did not find a plausible solution, so Opus had a turn to try and solve. did not find a plausible solution, so Opus got a turn to try and solve.
Rows 3-6 are cases where GPT-4o's non-plausible solutions were Rows 3-6 are cases where GPT-4o's non-plausible solutions were
actually found to be correct in hindsight, actually found to be correct in hindsight,
but in rows 4 we can see that aider with Opus overrides but in row 4 we can see that aider with Opus overrides
2 of them with a plausible-but-incorrect 2 of them with a plausible-but-incorrect
solution. solution.
The original correct solutions from GPT-4o may not have been
plausible because of pre-existing or otherwise
unresolved editing, linting or testing errors which were unrelated
to the SWE Bench issue or which turned out to be non-fatal.
In rows 5-6 & 9-10 we can see that both GPT-4o and Opus In rows 5-6 & 9-10 we can see that both GPT-4o and Opus
produced non-plausible solutions, produced non-plausible solutions,

Binary file not shown.


@ -6,7 +6,7 @@
<rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://creativecommons.org/ns#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://creativecommons.org/ns#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<cc:Work> <cc:Work>
<dc:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/> <dc:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>
<dc:date>2024-05-31T11:28:28.622491</dc:date> <dc:date>2024-05-31T11:41:49.017547</dc:date>
<dc:format>image/svg+xml</dc:format> <dc:format>image/svg+xml</dc:format>
<dc:creator> <dc:creator>
<cc:Agent> <cc:Agent>
@ -30,8 +30,8 @@ z
</g> </g>
<g id="axes_1"> <g id="axes_1">
<g id="patch_2"> <g id="patch_2">
<path d="M 77 302.561 <path d="M 77 307.03625
L 690 302.561 L 690 307.03625
L 690 50.4 L 690 50.4
L 77 50.4 L 77 50.4
z z
@ -41,17 +41,17 @@ z
<g id="xtick_1"> <g id="xtick_1">
<g id="line2d_1"> <g id="line2d_1">
<defs> <defs>
<path id="ma801677d50" d="M 0 0 <path id="m3c08837b00" d="M 0 0
L 0 3.5 L 0 3.5
" style="stroke: #000000; stroke-width: 0.8"/> " style="stroke: #000000; stroke-width: 0.8"/>
</defs> </defs>
<g> <g>
<use xlink:href="#ma801677d50" x="137.644385" y="302.561" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m3c08837b00" x="137.644385" y="307.03625" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_1"> <g id="text_1">
<!-- SWE- --> <!-- SWE- -->
<g style="fill: #555555" transform="translate(115.451416 321.755844) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(116.756885 325.51375) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-53" d="M 894 1481 <path id="Helvetica-53" d="M 894 1481
Q 916 1091 1078 847 Q 916 1091 1078 847
@ -135,7 +135,7 @@ z
<use xlink:href="#Helvetica-2d" x="227.783203"/> <use xlink:href="#Helvetica-2d" x="227.783203"/>
</g> </g>
<!-- Agent --> <!-- Agent -->
<g style="fill: #555555" transform="translate(115.432823 339.933094) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(116.739385 342.62175) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-41" d="M 2844 1881 <path id="Helvetica-41" d="M 2844 1881
L 2147 3909 L 2147 3909
@ -275,7 +275,7 @@ z
<use xlink:href="#Helvetica-74" x="233.544922"/> <use xlink:href="#Helvetica-74" x="233.544922"/>
</g> </g>
<!-- + Opus --> <!-- + Opus -->
<g style="fill: #555555" transform="translate(110.003448 358.726594) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(111.629385 360.30975) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-2b" d="M 288 1369 <path id="Helvetica-2b" d="M 288 1369
L 288 1894 L 288 1894
@ -408,7 +408,7 @@ z
<use xlink:href="#Helvetica-73" x="275.195312"/> <use xlink:href="#Helvetica-73" x="275.195312"/>
</g> </g>
<!-- (2294) --> <!-- (2294) -->
<g style="fill: #555555" transform="translate(113.076729 377.053656) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(114.521885 377.55875) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-28" d="M 1894 4666 <path id="Helvetica-28" d="M 1894 4666
Q 1403 3713 1256 3263 Q 1403 3713 1256 3263
@ -527,12 +527,12 @@ z
<g id="xtick_2"> <g id="xtick_2">
<g id="line2d_2"> <g id="line2d_2">
<g> <g>
<use xlink:href="#ma801677d50" x="219.596257" y="302.561" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m3c08837b00" x="219.596257" y="307.03625" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_2"> <g id="text_2">
<!-- AutoCode --> <!-- Auto -->
<g style="fill: #555555" transform="translate(181.792507 321.755844) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(203.140007 325.51375) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-6f" d="M 1741 363 <path id="Helvetica-6f" d="M 1741 363
Q 2300 363 2508 786 Q 2300 363 2508 786
@ -558,6 +558,15 @@ M 1744 3428
L 1744 3428 L 1744 3428
z z
" transform="scale(0.015625)"/> " transform="scale(0.015625)"/>
</defs>
<use xlink:href="#Helvetica-41"/>
<use xlink:href="#Helvetica-75" x="66.699219"/>
<use xlink:href="#Helvetica-74" x="122.314453"/>
<use xlink:href="#Helvetica-6f" x="150.097656"/>
</g>
<!-- Code -->
<g style="fill: #555555" transform="translate(200.472507 342.62175) scale(0.16 -0.16)">
<defs>
<path id="Helvetica-43" d="M 2422 4716 <path id="Helvetica-43" d="M 2422 4716
Q 3294 4716 3775 4256 Q 3294 4716 3775 4256
Q 4256 3797 4309 3213 Q 4256 3797 4309 3213
@ -609,17 +618,13 @@ Q 922 3406 1616 3406
z z
" transform="scale(0.015625)"/> " transform="scale(0.015625)"/>
</defs> </defs>
<use xlink:href="#Helvetica-41"/> <use xlink:href="#Helvetica-43"/>
<use xlink:href="#Helvetica-75" x="66.699219"/> <use xlink:href="#Helvetica-6f" x="72.216797"/>
<use xlink:href="#Helvetica-74" x="122.314453"/> <use xlink:href="#Helvetica-64" x="127.832031"/>
<use xlink:href="#Helvetica-6f" x="150.097656"/> <use xlink:href="#Helvetica-65" x="183.447266"/>
<use xlink:href="#Helvetica-43" x="205.712891"/>
<use xlink:href="#Helvetica-6f" x="277.929688"/>
<use xlink:href="#Helvetica-64" x="333.544922"/>
<use xlink:href="#Helvetica-65" x="389.160156"/>
</g> </g>
<!-- Rover --> <!-- Rover -->
<g style="fill: #555555" transform="translate(196.923835 339.933094) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(198.257507 359.72975) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-52" d="M 2622 2488 <path id="Helvetica-52" d="M 2622 2488
Q 3059 2488 3314 2663 Q 3059 2488 3314 2663
@ -689,7 +694,7 @@ z
<use xlink:href="#Helvetica-72" x="233.447266"/> <use xlink:href="#Helvetica-72" x="233.447266"/>
</g> </g>
<!-- (2294) --> <!-- (2294) -->
<g style="fill: #555555" transform="translate(195.0286 358.260156) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(196.473757 376.97875) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-28"/> <use xlink:href="#Helvetica-28"/>
<use xlink:href="#Helvetica-32" x="33.300781"/> <use xlink:href="#Helvetica-32" x="33.300781"/>
<use xlink:href="#Helvetica-32" x="88.916016"/> <use xlink:href="#Helvetica-32" x="88.916016"/>
@ -702,19 +707,19 @@ z
<g id="xtick_3"> <g id="xtick_3">
<g id="line2d_3"> <g id="line2d_3">
<g> <g>
<use xlink:href="#ma801677d50" x="301.548128" y="302.561" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m3c08837b00" x="301.548128" y="307.03625" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_3"> <g id="text_3">
<!-- SWE- --> <!-- SWE- -->
<g style="fill: #555555" transform="translate(279.35516 321.755844) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(280.660628 325.51375) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-53"/> <use xlink:href="#Helvetica-53"/>
<use xlink:href="#Helvetica-57" x="66.699219"/> <use xlink:href="#Helvetica-57" x="66.699219"/>
<use xlink:href="#Helvetica-45" x="161.083984"/> <use xlink:href="#Helvetica-45" x="161.083984"/>
<use xlink:href="#Helvetica-2d" x="227.783203"/> <use xlink:href="#Helvetica-2d" x="227.783203"/>
</g> </g>
<!-- Agent --> <!-- Agent -->
<g style="fill: #555555" transform="translate(279.336566 339.933094) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(280.643128 342.62175) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-41"/> <use xlink:href="#Helvetica-41"/>
<use xlink:href="#Helvetica-67" x="66.699219"/> <use xlink:href="#Helvetica-67" x="66.699219"/>
<use xlink:href="#Helvetica-65" x="122.314453"/> <use xlink:href="#Helvetica-65" x="122.314453"/>
@ -722,7 +727,7 @@ z
<use xlink:href="#Helvetica-74" x="233.544922"/> <use xlink:href="#Helvetica-74" x="233.544922"/>
</g> </g>
<!-- + GPT-4 --> <!-- + GPT-4 -->
<g style="fill: #555555" transform="translate(269.192347 358.328156) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(271.095628 359.93475) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-47" d="M 2472 4709 <path id="Helvetica-47" d="M 2472 4709
Q 3119 4709 3591 4459 Q 3119 4709 3591 4459
@ -795,7 +800,7 @@ z
<use xlink:href="#Helvetica-34" x="325.048828"/> <use xlink:href="#Helvetica-34" x="325.048828"/>
</g> </g>
<!-- (2294) --> <!-- (2294) -->
<g style="fill: #555555" transform="translate(276.980472 376.655219) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(278.425628 377.18375) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-28"/> <use xlink:href="#Helvetica-28"/>
<use xlink:href="#Helvetica-32" x="33.300781"/> <use xlink:href="#Helvetica-32" x="33.300781"/>
<use xlink:href="#Helvetica-32" x="88.916016"/> <use xlink:href="#Helvetica-32" x="88.916016"/>
@ -808,12 +813,12 @@ z
<g id="xtick_4"> <g id="xtick_4">
<g id="line2d_4"> <g id="line2d_4">
<g> <g>
<use xlink:href="#ma801677d50" x="383.5" y="302.561" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m3c08837b00" x="383.5" y="307.03625" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_4"> <g id="text_4">
<!-- Amazon Q --> <!-- Amazon Q -->
<g style="fill: #555555" transform="translate(343.346797 321.755844) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(345.70875 325.51375) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-6d" d="M 413 3347 <path id="Helvetica-6d" d="M 413 3347
L 969 3347 L 969 3347
@ -949,7 +954,7 @@ z
<use xlink:href="#Helvetica-51" x="394.628906"/> <use xlink:href="#Helvetica-51" x="394.628906"/>
</g> </g>
<!-- Developer --> <!-- Developer -->
<g style="fill: #555555" transform="translate(344.758594 339.933094) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(347.0375 342.62175) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-44" d="M 2250 531 <path id="Helvetica-44" d="M 2250 531
Q 2566 531 2769 597 Q 2566 531 2769 597
@ -991,7 +996,7 @@ z
<use xlink:href="#Helvetica-72" x="422.509766"/> <use xlink:href="#Helvetica-72" x="422.509766"/>
</g> </g>
<!-- Agent --> <!-- Agent -->
<g style="fill: #555555" transform="translate(361.288437 358.110344) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(362.595 359.72975) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-41"/> <use xlink:href="#Helvetica-41"/>
<use xlink:href="#Helvetica-67" x="66.699219"/> <use xlink:href="#Helvetica-67" x="66.699219"/>
<use xlink:href="#Helvetica-65" x="122.314453"/> <use xlink:href="#Helvetica-65" x="122.314453"/>
@ -999,7 +1004,7 @@ z
<use xlink:href="#Helvetica-74" x="233.544922"/> <use xlink:href="#Helvetica-74" x="233.544922"/>
</g> </g>
<!-- (2294) --> <!-- (2294) -->
<g style="fill: #555555" transform="translate(358.932344 376.655219) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(360.3775 377.18375) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-28"/> <use xlink:href="#Helvetica-28"/>
<use xlink:href="#Helvetica-32" x="33.300781"/> <use xlink:href="#Helvetica-32" x="33.300781"/>
<use xlink:href="#Helvetica-32" x="88.916016"/> <use xlink:href="#Helvetica-32" x="88.916016"/>
@ -1012,12 +1017,12 @@ z
<g id="xtick_5"> <g id="xtick_5">
<g id="line2d_5"> <g id="line2d_5">
<g> <g>
<use xlink:href="#ma801677d50" x="465.451872" y="302.561" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m3c08837b00" x="465.451872" y="307.03625" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_5"> <g id="text_5">
<!-- Devin --> <!-- Devin -->
<g style="fill: #555555" transform="translate(443.72109 321.755844) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(444.999372 325.51375) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-69" d="M 413 3331 <path id="Helvetica-69" d="M 413 3331
L 984 3331 L 984 3331
@ -1040,7 +1045,7 @@ z
<use xlink:href="#Helvetica-6e" x="200.048828"/> <use xlink:href="#Helvetica-6e" x="200.048828"/>
</g> </g>
<!-- (570) --> <!-- (570) -->
<g style="fill: #555555" transform="translate(445.611012 340.082906) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(446.778122 342.76275) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-35" d="M 791 1141 <path id="Helvetica-35" d="M 791 1141
Q 847 659 1238 475 Q 847 659 1238 475
@ -1115,12 +1120,12 @@ z
<g id="xtick_6"> <g id="xtick_6">
<g id="line2d_6"> <g id="line2d_6">
<g> <g>
<use xlink:href="#ma801677d50" x="547.403743" y="302.561" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m3c08837b00" x="547.403743" y="307.03625" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_6"> <g id="text_6">
<!-- Aider --> <!-- Aider -->
<g style="fill: #555555" transform="translate(527.561556 321.755844) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(528.728743 325.51375) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-41"/> <use xlink:href="#Helvetica-41"/>
<use xlink:href="#Helvetica-69" x="66.699219"/> <use xlink:href="#Helvetica-69" x="66.699219"/>
<use xlink:href="#Helvetica-64" x="88.916016"/> <use xlink:href="#Helvetica-64" x="88.916016"/>
@ -1128,7 +1133,7 @@ z
<use xlink:href="#Helvetica-72" x="200.146484"/> <use xlink:href="#Helvetica-72" x="200.146484"/>
</g> </g>
<!-- GPT-4o --> <!-- GPT-4o -->
<g style="fill: #555555" transform="translate(517.647103 339.933094) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(519.397493 342.62175) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-47"/> <use xlink:href="#Helvetica-47"/>
<use xlink:href="#Helvetica-50" x="77.783203"/> <use xlink:href="#Helvetica-50" x="77.783203"/>
<use xlink:href="#Helvetica-54" x="144.482422"/> <use xlink:href="#Helvetica-54" x="144.482422"/>
@ -1137,7 +1142,7 @@ z
<use xlink:href="#Helvetica-6f" x="294.482422"/> <use xlink:href="#Helvetica-6f" x="294.482422"/>
</g> </g>
<!-- (570) --> <!-- (570) -->
<g style="fill: #555555" transform="translate(527.562884 358.260156) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(528.729993 359.87075) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-28"/> <use xlink:href="#Helvetica-28"/>
<use xlink:href="#Helvetica-35" x="33.300781"/> <use xlink:href="#Helvetica-35" x="33.300781"/>
<use xlink:href="#Helvetica-37" x="88.916016"/> <use xlink:href="#Helvetica-37" x="88.916016"/>
@ -1149,12 +1154,12 @@ z
<g id="xtick_7"> <g id="xtick_7">
<g id="line2d_7"> <g id="line2d_7">
<g> <g>
<use xlink:href="#ma801677d50" x="629.355615" y="302.561" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m3c08837b00" x="629.355615" y="307.03625" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_7"> <g id="text_7">
<!-- Aider --> <!-- Aider -->
<g style="fill: #555555" transform="translate(609.513427 321.755844) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(610.680615 325.51375) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-41"/> <use xlink:href="#Helvetica-41"/>
<use xlink:href="#Helvetica-69" x="66.699219"/> <use xlink:href="#Helvetica-69" x="66.699219"/>
<use xlink:href="#Helvetica-64" x="88.916016"/> <use xlink:href="#Helvetica-64" x="88.916016"/>
@ -1162,7 +1167,7 @@ z
<use xlink:href="#Helvetica-72" x="200.146484"/> <use xlink:href="#Helvetica-72" x="200.146484"/>
</g> </g>
<!-- GPT-4o --> <!-- GPT-4o -->
<g style="fill: #555555" transform="translate(599.598974 339.933094) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(601.349365 342.62175) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-47"/> <use xlink:href="#Helvetica-47"/>
<use xlink:href="#Helvetica-50" x="77.783203"/> <use xlink:href="#Helvetica-50" x="77.783203"/>
<use xlink:href="#Helvetica-54" x="144.482422"/> <use xlink:href="#Helvetica-54" x="144.482422"/>
@ -1171,7 +1176,7 @@ z
<use xlink:href="#Helvetica-6f" x="294.482422"/> <use xlink:href="#Helvetica-6f" x="294.482422"/>
</g> </g>
<!-- &amp; Opus --> <!-- &amp; Opus -->
<g style="fill: #555555" transform="translate(601.009443 358.508781) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(602.676865 360.10475) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-26" d="M 1828 2806 <path id="Helvetica-26" d="M 1828 2806
Q 2125 3016 2238 3147 Q 2125 3016 2238 3147
@ -1227,7 +1232,7 @@ z
<use xlink:href="#Helvetica-73" x="283.496094"/> <use xlink:href="#Helvetica-73" x="283.496094"/>
</g> </g>
<!-- (570) --> <!-- (570) -->
<g style="fill: #555555" transform="translate(609.514756 376.835844) scale(0.17 -0.17)"> <g style="fill: #555555" transform="translate(610.681865 377.35375) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-28"/> <use xlink:href="#Helvetica-28"/>
<use xlink:href="#Helvetica-35" x="33.300781"/> <use xlink:href="#Helvetica-35" x="33.300781"/>
<use xlink:href="#Helvetica-37" x="88.916016"/> <use xlink:href="#Helvetica-37" x="88.916016"/>
@ -1240,23 +1245,23 @@ z
<g id="matplotlib.axis_2"> <g id="matplotlib.axis_2">
<g id="ytick_1"> <g id="ytick_1">
<g id="line2d_8"> <g id="line2d_8">
<path d="M 77 302.561 <path d="M 77 307.03625
L 690 302.561 L 690 307.03625
" clip-path="url(#pf552f9dc48)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#p1ec2c53f8e)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_9"> <g id="line2d_9">
<defs> <defs>
<path id="m2a97a87f10" d="M 0 0 <path id="m167bb8a136" d="M 0 0
L -3.5 0 L -3.5 0
" style="stroke: #000000; stroke-width: 0.8"/> " style="stroke: #000000; stroke-width: 0.8"/>
</defs> </defs>
<g> <g>
<use xlink:href="#m2a97a87f10" x="77" y="302.561" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m167bb8a136" x="77" y="307.03625" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_8"> <g id="text_8">
<!-- 0.0 --> <!-- 0.0 -->
<g transform="translate(56.1 306.147719) scale(0.1 -0.1)"> <g transform="translate(56.1 310.622969) scale(0.1 -0.1)">
<defs> <defs>
<path id="Helvetica-2e" d="M 547 681 <path id="Helvetica-2e" d="M 547 681
L 1200 681 L 1200 681
@ -1274,18 +1279,18 @@ z
</g> </g>
<g id="ytick_2"> <g id="ytick_2">
<g id="line2d_10"> <g id="line2d_10">
<path d="M 77 270.625716 <path d="M 77 274.534192
L 690 270.625716 L 690 274.534192
" clip-path="url(#pf552f9dc48)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#p1ec2c53f8e)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_11"> <g id="line2d_11">
<g> <g>
<use xlink:href="#m2a97a87f10" x="77" y="270.625716" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m167bb8a136" x="77" y="274.534192" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_9"> <g id="text_9">
<!-- 2.5 --> <!-- 2.5 -->
<g transform="translate(56.1 274.212435) scale(0.1 -0.1)"> <g transform="translate(56.1 278.120911) scale(0.1 -0.1)">
<use xlink:href="#Helvetica-32"/> <use xlink:href="#Helvetica-32"/>
<use xlink:href="#Helvetica-2e" x="55.615234"/> <use xlink:href="#Helvetica-2e" x="55.615234"/>
<use xlink:href="#Helvetica-35" x="83.398438"/> <use xlink:href="#Helvetica-35" x="83.398438"/>
@ -1294,18 +1299,18 @@ L 690 270.625716
</g> </g>
<g id="ytick_3"> <g id="ytick_3">
<g id="line2d_12"> <g id="line2d_12">
<path d="M 77 238.690433 <path d="M 77 242.032134
L 690 238.690433 L 690 242.032134
" clip-path="url(#pf552f9dc48)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#p1ec2c53f8e)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_13"> <g id="line2d_13">
<g> <g>
<use xlink:href="#m2a97a87f10" x="77" y="238.690433" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m167bb8a136" x="77" y="242.032134" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_10"> <g id="text_10">
<!-- 5.0 --> <!-- 5.0 -->
<g transform="translate(56.1 242.277151) scale(0.1 -0.1)"> <g transform="translate(56.1 245.618853) scale(0.1 -0.1)">
<use xlink:href="#Helvetica-35"/> <use xlink:href="#Helvetica-35"/>
<use xlink:href="#Helvetica-2e" x="55.615234"/> <use xlink:href="#Helvetica-2e" x="55.615234"/>
<use xlink:href="#Helvetica-30" x="83.398438"/> <use xlink:href="#Helvetica-30" x="83.398438"/>
@ -1314,18 +1319,18 @@ L 690 238.690433
</g> </g>
<g id="ytick_4"> <g id="ytick_4">
<g id="line2d_14"> <g id="line2d_14">
<path d="M 77 206.755149 <path d="M 77 209.530076
L 690 206.755149 L 690 209.530076
" clip-path="url(#pf552f9dc48)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#p1ec2c53f8e)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_15"> <g id="line2d_15">
<g> <g>
<use xlink:href="#m2a97a87f10" x="77" y="206.755149" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m167bb8a136" x="77" y="209.530076" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_11"> <g id="text_11">
<!-- 7.5 --> <!-- 7.5 -->
<g transform="translate(56.1 210.341868) scale(0.1 -0.1)"> <g transform="translate(56.1 213.116795) scale(0.1 -0.1)">
<use xlink:href="#Helvetica-37"/> <use xlink:href="#Helvetica-37"/>
<use xlink:href="#Helvetica-2e" x="55.615234"/> <use xlink:href="#Helvetica-2e" x="55.615234"/>
<use xlink:href="#Helvetica-35" x="83.398438"/> <use xlink:href="#Helvetica-35" x="83.398438"/>
@ -1334,18 +1339,18 @@ L 690 206.755149
</g> </g>
<g id="ytick_5"> <g id="ytick_5">
<g id="line2d_16"> <g id="line2d_16">
<path d="M 77 174.819865 <path d="M 77 177.028018
L 690 174.819865 L 690 177.028018
" clip-path="url(#pf552f9dc48)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#p1ec2c53f8e)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_17"> <g id="line2d_17">
<g> <g>
<use xlink:href="#m2a97a87f10" x="77" y="174.819865" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m167bb8a136" x="77" y="177.028018" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_12"> <g id="text_12">
<!-- 10.0 --> <!-- 10.0 -->
<g transform="translate(50.539063 178.406584) scale(0.1 -0.1)"> <g transform="translate(50.539063 180.614737) scale(0.1 -0.1)">
<defs> <defs>
<path id="Helvetica-31" d="M 613 3169 <path id="Helvetica-31" d="M 613 3169
L 613 3600 L 613 3600
@ -1368,18 +1373,18 @@ z
</g> </g>
<g id="ytick_6"> <g id="ytick_6">
<g id="line2d_18"> <g id="line2d_18">
<path d="M 77 142.884582 <path d="M 77 144.52596
L 690 142.884582 L 690 144.52596
" clip-path="url(#pf552f9dc48)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#p1ec2c53f8e)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_19"> <g id="line2d_19">
<g> <g>
<use xlink:href="#m2a97a87f10" x="77" y="142.884582" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m167bb8a136" x="77" y="144.52596" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_13"> <g id="text_13">
<!-- 12.5 --> <!-- 12.5 -->
<g transform="translate(50.539063 146.4713) scale(0.1 -0.1)"> <g transform="translate(50.539063 148.112679) scale(0.1 -0.1)">
<use xlink:href="#Helvetica-31"/> <use xlink:href="#Helvetica-31"/>
<use xlink:href="#Helvetica-32" x="55.615234"/> <use xlink:href="#Helvetica-32" x="55.615234"/>
<use xlink:href="#Helvetica-2e" x="111.230469"/> <use xlink:href="#Helvetica-2e" x="111.230469"/>
@ -1389,18 +1394,18 @@ L 690 142.884582
</g> </g>
<g id="ytick_7"> <g id="ytick_7">
<g id="line2d_20"> <g id="line2d_20">
<path d="M 77 110.949298 <path d="M 77 112.023902
L 690 110.949298 L 690 112.023902
" clip-path="url(#pf552f9dc48)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#p1ec2c53f8e)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_21"> <g id="line2d_21">
<g> <g>
<use xlink:href="#m2a97a87f10" x="77" y="110.949298" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m167bb8a136" x="77" y="112.023902" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_14"> <g id="text_14">
<!-- 15.0 --> <!-- 15.0 -->
<g transform="translate(50.539063 114.536017) scale(0.1 -0.1)"> <g transform="translate(50.539063 115.610621) scale(0.1 -0.1)">
<use xlink:href="#Helvetica-31"/> <use xlink:href="#Helvetica-31"/>
<use xlink:href="#Helvetica-35" x="55.615234"/> <use xlink:href="#Helvetica-35" x="55.615234"/>
<use xlink:href="#Helvetica-2e" x="111.230469"/> <use xlink:href="#Helvetica-2e" x="111.230469"/>
@ -1410,18 +1415,18 @@ L 690 110.949298
</g> </g>
<g id="ytick_8"> <g id="ytick_8">
<g id="line2d_22"> <g id="line2d_22">
<path d="M 77 79.014014 <path d="M 77 79.521844
L 690 79.014014 L 690 79.521844
" clip-path="url(#pf552f9dc48)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#p1ec2c53f8e)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_23"> <g id="line2d_23">
<g> <g>
<use xlink:href="#m2a97a87f10" x="77" y="79.014014" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m167bb8a136" x="77" y="79.521844" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_15"> <g id="text_15">
<!-- 17.5 --> <!-- 17.5 -->
<g transform="translate(50.539063 82.600733) scale(0.1 -0.1)"> <g transform="translate(50.539063 83.108563) scale(0.1 -0.1)">
<use xlink:href="#Helvetica-31"/> <use xlink:href="#Helvetica-31"/>
<use xlink:href="#Helvetica-37" x="55.615234"/> <use xlink:href="#Helvetica-37" x="55.615234"/>
<use xlink:href="#Helvetica-2e" x="111.230469"/> <use xlink:href="#Helvetica-2e" x="111.230469"/>
@ -1431,7 +1436,7 @@ L 690 79.014014
</g> </g>
<g id="text_16"> <g id="text_16">
<!-- Instances resolved (%) --> <!-- Instances resolved (%) -->
<g style="fill: #555555" transform="translate(42.787188 268.013312) rotate(-90) scale(0.18 -0.18)"> <g style="fill: #555555" transform="translate(42.787188 270.250937) rotate(-90) scale(0.18 -0.18)">
<defs> <defs>
<path id="Helvetica-49" d="M 628 4591 <path id="Helvetica-49" d="M 628 4591
L 1256 4591 L 1256 4591
@ -1538,18 +1543,18 @@ z
</g> </g>
</g> </g>
<g id="patch_3"> <g id="patch_3">
<path d="M 77 302.561 <path d="M 77 307.03625
L 77 50.4 L 77 50.4
" style="fill: none; stroke: #dddddd; stroke-width: 0.5; stroke-linejoin: miter; stroke-linecap: square"/> " style="fill: none; stroke: #dddddd; stroke-width: 0.5; stroke-linejoin: miter; stroke-linecap: square"/>
</g> </g>
<g id="patch_4"> <g id="patch_4">
<path d="M 690 302.561 <path d="M 690 307.03625
L 690 50.4 L 690 50.4
" style="fill: none; stroke: #dddddd; stroke-width: 0.5; stroke-linejoin: miter; stroke-linecap: square"/> " style="fill: none; stroke: #dddddd; stroke-width: 0.5; stroke-linejoin: miter; stroke-linecap: square"/>
</g> </g>
<g id="patch_5"> <g id="patch_5">
<path d="M 77 302.561 <path d="M 77 307.03625
L 690 302.561 L 690 307.03625
" style="fill: none; stroke: #dddddd; stroke-width: 0.5; stroke-linejoin: miter; stroke-linecap: square"/> " style="fill: none; stroke: #dddddd; stroke-width: 0.5; stroke-linejoin: miter; stroke-linecap: square"/>
</g> </g>
<g id="patch_6"> <g id="patch_6">
@ -1558,64 +1563,64 @@ L 690 50.4
" style="fill: none; stroke: #dddddd; stroke-width: 0.5; stroke-linejoin: miter; stroke-linecap: square"/> " style="fill: none; stroke: #dddddd; stroke-width: 0.5; stroke-linejoin: miter; stroke-linecap: square"/>
</g> </g>
<g id="patch_7"> <g id="patch_7">
<path d="M 104.863636 302.561 <path d="M 104.863636 307.03625
L 170.425134 302.561 L 170.425134 307.03625
L 170.425134 168.432809 L 170.425134 170.527606
L 104.863636 168.432809 L 104.863636 170.527606
z z
" clip-path="url(#pf552f9dc48)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#p1ec2c53f8e)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_8"> <g id="patch_8">
<path d="M 186.815508 302.561 <path d="M 186.815508 307.03625
L 252.377005 302.561 L 252.377005 307.03625
L 252.377005 167.155397 L 252.377005 169.227524
L 186.815508 167.155397 L 186.815508 169.227524
z z
" clip-path="url(#pf552f9dc48)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#p1ec2c53f8e)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_9"> <g id="patch_9">
<path d="M 268.76738 302.561 <path d="M 268.76738 307.03625
L 334.328877 302.561 L 334.328877 307.03625
L 334.328877 142.884582 L 334.328877 144.52596
L 268.76738 142.884582 L 268.76738 144.52596
z z
" clip-path="url(#pf552f9dc48)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#p1ec2c53f8e)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_10"> <g id="patch_10">
<path d="M 350.719251 302.561 <path d="M 350.719251 307.03625
L 416.280749 302.561 L 416.280749 307.03625
L 416.280749 126.278234 L 416.280749 127.62489
L 350.719251 126.278234 L 350.719251 127.62489
z z
" clip-path="url(#pf552f9dc48)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#p1ec2c53f8e)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_11"> <g id="patch_11">
<path d="M 432.671123 302.561 <path d="M 432.671123 307.03625
L 498.23262 302.561 L 498.23262 307.03625
L 498.23262 125.000823 L 498.23262 126.324807
L 432.671123 125.000823 L 432.671123 126.324807
z z
" clip-path="url(#pf552f9dc48)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#p1ec2c53f8e)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_12"> <g id="patch_12">
<path d="M 514.622995 302.561 <path d="M 514.622995 307.03625
L 580.184492 302.561 L 580.184492 307.03625
L 580.184492 85.401071 L 580.184492 86.022256
L 514.622995 85.401071 L 514.622995 86.022256
z z
" clip-path="url(#pf552f9dc48)" style="fill: #17965a; opacity: 0.9"/> " clip-path="url(#p1ec2c53f8e)" style="fill: #17965a; opacity: 0.9"/>
</g> </g>
<g id="patch_13"> <g id="patch_13">
<path d="M 596.574866 302.561 <path d="M 596.574866 307.03625
L 662.136364 302.561 L 662.136364 307.03625
L 662.136364 62.407667 L 662.136364 62.620774
L 596.574866 62.407667 L 596.574866 62.620774
z z
" clip-path="url(#pf552f9dc48)" style="fill: #17965a; opacity: 0.9"/> " clip-path="url(#p1ec2c53f8e)" style="fill: #17965a; opacity: 0.9"/>
</g> </g>
<g id="text_17"> <g id="text_17">
<!-- 10.5% --> <!-- 10.5% -->
<g style="fill: #555555" transform="translate(114.961885 192.684422) scale(0.16 -0.16)"> <g style="fill: #555555" transform="translate(114.961885 195.00593) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-31"/> <use xlink:href="#Helvetica-31"/>
<use xlink:href="#Helvetica-30" x="55.615234"/> <use xlink:href="#Helvetica-30" x="55.615234"/>
<use xlink:href="#Helvetica-2e" x="111.230469"/> <use xlink:href="#Helvetica-2e" x="111.230469"/>
@ -1625,7 +1630,7 @@ z
</g> </g>
<g id="text_18"> <g id="text_18">
<!-- 10.6% --> <!-- 10.6% -->
<g style="fill: #555555" transform="translate(196.913757 191.407011) scale(0.16 -0.16)"> <g style="fill: #555555" transform="translate(196.913757 193.705847) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-36" d="M 1872 4494 <path id="Helvetica-36" d="M 1872 4494
Q 2622 4494 2917 4105 Q 2622 4494 2917 4105
@ -1667,7 +1672,7 @@ z
</g> </g>
<g id="text_19"> <g id="text_19">
<!-- 12.5% --> <!-- 12.5% -->
<g style="fill: #555555" transform="translate(278.865628 167.136195) scale(0.16 -0.16)"> <g style="fill: #555555" transform="translate(278.865628 169.004283) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-31"/> <use xlink:href="#Helvetica-31"/>
<use xlink:href="#Helvetica-32" x="55.615234"/> <use xlink:href="#Helvetica-32" x="55.615234"/>
<use xlink:href="#Helvetica-2e" x="111.230469"/> <use xlink:href="#Helvetica-2e" x="111.230469"/>
@ -1677,7 +1682,7 @@ z
</g> </g>
<g id="text_20"> <g id="text_20">
<!-- 13.8% --> <!-- 13.8% -->
<g style="fill: #555555" transform="translate(360.8175 150.529848) scale(0.16 -0.16)"> <g style="fill: #555555" transform="translate(360.8175 152.103213) scale(0.16 -0.16)">
<defs> <defs>
<path id="Helvetica-33" d="M 1663 -122 <path id="Helvetica-33" d="M 1663 -122
Q 869 -122 511 314 Q 869 -122 511 314
@ -1762,7 +1767,7 @@ z
</g> </g>
<g id="text_21"> <g id="text_21">
<!-- 13.9% --> <!-- 13.9% -->
<g style="fill: #555555" transform="translate(442.769372 149.252436) scale(0.16 -0.16)"> <g style="fill: #555555" transform="translate(442.769372 150.803131) scale(0.16 -0.16)">
<use xlink:href="#Helvetica-31"/> <use xlink:href="#Helvetica-31"/>
<use xlink:href="#Helvetica-33" x="55.615234"/> <use xlink:href="#Helvetica-33" x="55.615234"/>
<use xlink:href="#Helvetica-2e" x="111.230469"/> <use xlink:href="#Helvetica-2e" x="111.230469"/>
@ -1772,7 +1777,7 @@ z
</g> </g>
<g id="text_22"> <g id="text_22">
<!-- 17.0% --> <!-- 17.0% -->
<g style="fill: #eeeeee" transform="translate(519.649993 110.332684) scale(0.16 -0.16)"> <g style="fill: #eeeeee" transform="translate(519.649993 111.180579) scale(0.16 -0.16)">
<defs> <defs>
<path id="DejaVuSans-Bold-31" d="M 750 831 <path id="DejaVuSans-Bold-31" d="M 750 831
L 1813 831 L 1813 831
@ -1883,7 +1888,7 @@ z
</g> </g>
<g id="text_23"> <g id="text_23">
<!-- 18.8% --> <!-- 18.8% -->
<g style="fill: #eeeeee" transform="translate(601.601865 87.33928) scale(0.16 -0.16)"> <g style="fill: #eeeeee" transform="translate(601.601865 87.779097) scale(0.16 -0.16)">
<defs> <defs>
<path id="DejaVuSans-Bold-38" d="M 2228 2088 <path id="DejaVuSans-Bold-38" d="M 2228 2088
Q 1891 2088 1709 1903 Q 1891 2088 1709 1903
@ -2205,8 +2210,8 @@ z
</g> </g>
</g> </g>
<defs> <defs>
<clipPath id="pf552f9dc48"> <clipPath id="p1ec2c53f8e">
<rect x="77" y="50.4" width="613" height="252.161"/> <rect x="77" y="50.4" width="613" height="256.63625"/>
</clipPath> </clipPath>
</defs> </defs>
</svg> </svg>

Rendered image size: 56 KiB before, 57 KiB after.


View file

@ -3,5 +3,5 @@
13.9% Devin|(570) 13.9% Devin|(570)
13.8% Amazon Q|Developer|Agent|(2294) 13.8% Amazon Q|Developer|Agent|(2294)
12.5% SWE-|Agent|+ GPT-4|(2294) 12.5% SWE-|Agent|+ GPT-4|(2294)
10.6% AutoCode|Rover|(2294) 10.6% Auto|Code|Rover|(2294)
10.5% SWE-|Agent|+ Opus|(2294) 10.5% SWE-|Agent|+ Opus|(2294)

View file

@ -76,7 +76,7 @@ def plot_swe_bench_lite(data_file):
ax.set_title(title, fontsize=20) ax.set_title(title, fontsize=20)
# ax.set_ylim(0, 29.9) # ax.set_ylim(0, 29.9)
plt.xticks( plt.xticks(
fontsize=17, fontsize=16,
color=font_color, color=font_color,
) )