Paul Gauthier 2024-06-01 16:10:55 -07:00
parent 47a3cb8adf
commit 2cb9a8ddc8
4 changed files with 87 additions and 117 deletions

View file

@ -23,8 +23,8 @@ that was reported recently.
[![SWE Bench results](/assets/swe_bench.svg)](https://aider.chat/assets/swe_bench.svg) [![SWE Bench results](/assets/swe_bench.svg)](https://aider.chat/assets/swe_bench.svg)
Aider was benchmarked on the same Aider was benchmarked on the same
[random 570](https://github.com/CognitionAI/devin-swebench-results/tree/main/output_diffs) [randomly selected 570](https://github.com/CognitionAI/devin-swebench-results/tree/main/output_diffs)
of the 2294 SWE Bench problems that were used in the of the 2,294 SWE Bench problems that were used in the
[Devin evaluation](https://www.cognition.ai/post/swe-bench-technical-report). [Devin evaluation](https://www.cognition.ai/post/swe-bench-technical-report).
Please see the [references](#references) Please see the [references](#references)
for more details on the data presented in this chart. for more details on the data presented in this chart.
@ -187,68 +187,20 @@ are "more plausible" than some of GPT-4o's non-plausible solutions.
These more plausible, incorrect solutions can These more plausible, incorrect solutions can
eclipse some of eclipse some of
the earlier non-plausible correct solutions that GPT-4o generated. the earlier non-plausible correct solutions that GPT-4o generated.
This reduces GPT-4o's score in the table (15.3%) from the combined GPT-4o & Opus This is why GPT-4o's score in the table
benchmark, showing the combined GPT-4o & Opus results (15.3%)
as compared to the results from just one try using aider with GPT-4o (17.0%). is lower than the result from just one try using aider with GPT-4o (17.0%).
For these reasons, adding additional attempts is not guaranteed to monotonically For these reasons, adding additional attempts is not guaranteed to monotonically
increase the number of resolved problems. increase the number of resolved problems.
New solutions may resolve some new problems but they may also New solutions may resolve some new problems but they may also
eclipse and discard some of the previous non-plausible correct solutions. eclipse and discard some of the previous non-plausible correct solutions.
Luckily, additional attempts usually provide a net increase in the overall Luckily, additional attempts usually provide a net increase in the overall
number of resolved solutions. number of resolved solutions.
This was the case for both this main SWE Bench result and the This was the case for both this main SWE Bench result and the
earlier Lite result. earlier Lite result.
The table below breaks down the benchmark outcome of each problem,
showing whether aider with GPT-4o and with Opus
produced plausible and/or correct solutions.
|Row|Aider<br>w/GPT-4o<br>solution<br>plausible?|Aider<br>w/GPT-4o<br>solution<br>resolved<br>issue?|Aider<br>w/Opus<br>solution<br>plausible?|Aider<br>w/Opus<br>solution<br>resolved<br>issue?|Number of<br>problems<br>with this<br>outcome|Number of<br>problems<br>resolved|
|:--:|:--:|:--:|:--:|:--:|--:|--:|
| A | **plausible** | **resolved** | n/a | n/a | 73 | 73 |
| B | **plausible** | no | n/a | n/a | 181 | 0 |
| C | no | no | **plausible** | no | 53 | 0 |
| D | no | no | **plausible** | **resolved** | 12 | 12 |
| E | no | **resolved** | **plausible** | no | 2 | 0 |
| F | no | **resolved** | **plausible** | **resolved** | 1 | 1 |
| G | no | no | no | no | 216 | 0 |
| H | no | no | no | **resolved** | 4 | 2 |
| I | no | **resolved** | no | no | 4 | 3 |
| J | no | **resolved** | no | **resolved** | 17 | 17 |
| K | no | no | n/a | n/a | 7 | 0 |
|Total|||||570|108|
Rows A-B show the cases where
aider with GPT-4o found a plausible solution during the first attempt.
Of those, 73 went on to be deemed as resolving the issue,
while 181 were not in fact correct solutions.
The second attempt with Opus never happened,
because the harness stopped once a
plausible solution was found.
Rows C-F consider the straightforward cases where aider with GPT-4o
didn't find a plausible solution but Opus did.
So Opus' solutions were adopted and they
went on to be deemed correct for 13 problems
and incorrect for 55.
In that group, Row E is an interesting special case, where GPT-4o found 2
non-plausible but correct solutions.
We can see that Opus overrides
them with plausible-but-incorrect
solutions resulting in 0 resolved problems from that row.
Rows G-K cover the cases where neither model
produced plausible solutions.
Which solution was ultimately selected for each problem depends on
[details about which solution the harness considered "most plausible"](https://aider.chat/2024/05/22/swe-bench-lite.html#finding-a-plausible-solution).
Row K contains cases where Opus returned errors due to context window
exhaustion or other problems.
In these cases aider with Opus was unable to produce any solutions
so GPT-4o's solutions were adopted.
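As a quick sanity check on the removed table above, its per-row counts can be summed to confirm the totals row (570 problems, 108 resolved) and the 18.9% headline score. This is only an illustrative sketch: the `ROWS` list and the `totals` helper are hypothetical names, and the numbers are copied by hand from the table rather than taken from the benchmark harness.

```python
# (row label, problems with this outcome, problems resolved),
# transcribed by hand from the table in the removed lines.
ROWS = [
    ("A", 73, 73), ("B", 181, 0), ("C", 53, 0), ("D", 12, 12),
    ("E", 2, 0), ("F", 1, 1), ("G", 216, 0), ("H", 4, 2),
    ("I", 4, 3), ("J", 17, 17), ("K", 7, 0),
]


def totals(rows):
    """Return (total problems, total resolved, percent resolved)."""
    problems = sum(count for _, count, _ in rows)
    resolved = sum(done for _, _, done in rows)
    return problems, resolved, round(100 * resolved / problems, 1)


problems, resolved, percent = totals(ROWS)
print(f"{resolved} of {problems} resolved = {percent}%")

# Rows C-F are the cases where only Opus produced a plausible solution.
opus_rows = [row for row in ROWS if row[0] in "CDEF"]
opus_correct = sum(done for _, _, done in opus_rows)
opus_incorrect = sum(count - done for _, count, done in opus_rows)
print(f"Rows C-F: {opus_correct} correct, {opus_incorrect} incorrect")
```

The second print reproduces the "13 correct, 55 incorrect" split quoted for Rows C-F in the removed prose.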
## Computing the benchmark score ## Computing the benchmark score
The benchmark harness produced one proposed solution for each of The benchmark harness produced one proposed solution for each of
@ -289,11 +241,11 @@ making it faster, easier, and more reliable to run the acceptance tests.
Below are the references for the SWE-Bench results Below are the references for the SWE-Bench results
displayed in the graph at the beginning of this article. displayed in the graph at the beginning of this article.
- [13.9% Devin (benchmarked on 570 instances)](https://www.cognition.ai/post/swe-bench-technical-report) - [13.9% Devin, benchmarked on 570 instances.](https://www.cognition.ai/post/swe-bench-technical-report)
- [13.8% Amazon Q Developer Agent (benchmarked on 2294 instances)](https://www.swebench.com) - [13.8% Amazon Q Developer Agent, benchmarked on 2,294 instances.](https://www.swebench.com)
- [12.5% SWE- Agent + GPT-4 (benchmarked on 2294 instances)](https://www.swebench.com) - [12.5% SWE- Agent + GPT-4, benchmarked on 2,294 instances.](https://www.swebench.com)
- [10.6% AutoCode Rover (benchmarked on 2294 instances)](https://arxiv.org/pdf/2404.05427v2) - [10.6% AutoCode Rover, benchmarked on 2,294 instances.](https://arxiv.org/pdf/2404.05427v2)
- [10.5% SWE- Agent + Opus (benchmarked on 2294 instances)](https://www.swebench.com) - [10.5% SWE- Agent + Opus, benchmarked on 2,294 instances.](https://www.swebench.com)
The graph contains average pass@1 results for AutoCodeRover. The graph contains average pass@1 results for AutoCodeRover.
The [AutoCodeRover GitHub page](https://github.com/nus-apr/auto-code-rover) The [AutoCodeRover GitHub page](https://github.com/nus-apr/auto-code-rover)

Binary file not shown.


View file

@ -6,7 +6,7 @@
<rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://creativecommons.org/ns#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://creativecommons.org/ns#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<cc:Work> <cc:Work>
<dc:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/> <dc:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>
<dc:date>2024-06-01T14:55:22.797792</dc:date> <dc:date>2024-06-01T16:00:26.751322</dc:date>
<dc:format>image/svg+xml</dc:format> <dc:format>image/svg+xml</dc:format>
<dc:creator> <dc:creator>
<cc:Agent> <cc:Agent>
@ -41,12 +41,12 @@ z
<g id="xtick_1"> <g id="xtick_1">
<g id="line2d_1"> <g id="line2d_1">
<defs> <defs>
<path id="m8f37288b3b" d="M 0 0 <path id="m9698733033" d="M 0 0
L 0 3.5 L 0 3.5
" style="stroke: #000000; stroke-width: 0.8"/> " style="stroke: #000000; stroke-width: 0.8"/>
</defs> </defs>
<g> <g>
<use xlink:href="#m8f37288b3b" x="137.644385" y="307.664" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m9698733033" x="137.644385" y="307.664" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_1"> <g id="text_1">
@ -412,7 +412,7 @@ z
<g id="xtick_2"> <g id="xtick_2">
<g id="line2d_2"> <g id="line2d_2">
<g> <g>
<use xlink:href="#m8f37288b3b" x="219.596257" y="307.664" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m9698733033" x="219.596257" y="307.664" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_2"> <g id="text_2">
@ -583,7 +583,7 @@ z
<g id="xtick_3"> <g id="xtick_3">
<g id="line2d_3"> <g id="line2d_3">
<g> <g>
<use xlink:href="#m8f37288b3b" x="301.548128" y="307.664" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m9698733033" x="301.548128" y="307.664" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_3"> <g id="text_3">
@ -699,7 +699,7 @@ z
<g id="xtick_4"> <g id="xtick_4">
<g id="line2d_4"> <g id="line2d_4">
<g> <g>
<use xlink:href="#m8f37288b3b" x="383.5" y="307.664" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m9698733033" x="383.5" y="307.664" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_4"> <g id="text_4">
@ -894,7 +894,7 @@ z
<g id="xtick_5"> <g id="xtick_5">
<g id="line2d_5"> <g id="line2d_5">
<g> <g>
<use xlink:href="#m8f37288b3b" x="465.451872" y="307.664" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m9698733033" x="465.451872" y="307.664" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_5"> <g id="text_5">
@ -926,7 +926,7 @@ z
<g id="xtick_6"> <g id="xtick_6">
<g id="line2d_6"> <g id="line2d_6">
<g> <g>
<use xlink:href="#m8f37288b3b" x="547.403743" y="307.664" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m9698733033" x="547.403743" y="307.664" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_6"> <g id="text_6">
@ -1157,7 +1157,7 @@ z
<g id="xtick_7"> <g id="xtick_7">
<g id="line2d_7"> <g id="line2d_7">
<g> <g>
<use xlink:href="#m8f37288b3b" x="629.355615" y="307.664" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#m9698733033" x="629.355615" y="307.664" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_7"> <g id="text_7">
@ -1339,16 +1339,16 @@ z
<g id="line2d_8"> <g id="line2d_8">
<path d="M 77 307.664 <path d="M 77 307.664
L 690 307.664 L 690 307.664
" clip-path="url(#p8c34e9879c)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#pb8819c8324)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_9"> <g id="line2d_9">
<defs> <defs>
<path id="mc542aa3404" d="M 0 0 <path id="mc771fb8b0f" d="M 0 0
L -3.5 0 L -3.5 0
" style="stroke: #000000; stroke-width: 0.8"/> " style="stroke: #000000; stroke-width: 0.8"/>
</defs> </defs>
<g> <g>
<use xlink:href="#mc542aa3404" x="77" y="307.664" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#mc771fb8b0f" x="77" y="307.664" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_8"> <g id="text_8">
@ -1394,11 +1394,11 @@ z
<g id="line2d_10"> <g id="line2d_10">
<path d="M 77 275.254829 <path d="M 77 275.254829
L 690 275.254829 L 690 275.254829
" clip-path="url(#p8c34e9879c)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#pb8819c8324)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_11"> <g id="line2d_11">
<g> <g>
<use xlink:href="#mc542aa3404" x="77" y="275.254829" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#mc771fb8b0f" x="77" y="275.254829" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_9"> <g id="text_9">
@ -1467,11 +1467,11 @@ z
<g id="line2d_12"> <g id="line2d_12">
<path d="M 77 242.845658 <path d="M 77 242.845658
L 690 242.845658 L 690 242.845658
" clip-path="url(#p8c34e9879c)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#pb8819c8324)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_13"> <g id="line2d_13">
<g> <g>
<use xlink:href="#mc542aa3404" x="77" y="242.845658" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#mc771fb8b0f" x="77" y="242.845658" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_10"> <g id="text_10">
@ -1487,11 +1487,11 @@ L 690 242.845658
<g id="line2d_14"> <g id="line2d_14">
<path d="M 77 210.436487 <path d="M 77 210.436487
L 690 210.436487 L 690 210.436487
" clip-path="url(#p8c34e9879c)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#pb8819c8324)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_15"> <g id="line2d_15">
<g> <g>
<use xlink:href="#mc542aa3404" x="77" y="210.436487" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#mc771fb8b0f" x="77" y="210.436487" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_11"> <g id="text_11">
@ -1523,11 +1523,11 @@ z
<g id="line2d_16"> <g id="line2d_16">
<path d="M 77 178.027316 <path d="M 77 178.027316
L 690 178.027316 L 690 178.027316
" clip-path="url(#p8c34e9879c)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#pb8819c8324)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_17"> <g id="line2d_17">
<g> <g>
<use xlink:href="#mc542aa3404" x="77" y="178.027316" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#mc771fb8b0f" x="77" y="178.027316" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_12"> <g id="text_12">
@ -1557,11 +1557,11 @@ z
<g id="line2d_18"> <g id="line2d_18">
<path d="M 77 145.618145 <path d="M 77 145.618145
L 690 145.618145 L 690 145.618145
" clip-path="url(#p8c34e9879c)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#pb8819c8324)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_19"> <g id="line2d_19">
<g> <g>
<use xlink:href="#mc542aa3404" x="77" y="145.618145" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#mc771fb8b0f" x="77" y="145.618145" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_13"> <g id="text_13">
@ -1578,11 +1578,11 @@ L 690 145.618145
<g id="line2d_20"> <g id="line2d_20">
<path d="M 77 113.208974 <path d="M 77 113.208974
L 690 113.208974 L 690 113.208974
" clip-path="url(#p8c34e9879c)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#pb8819c8324)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_21"> <g id="line2d_21">
<g> <g>
<use xlink:href="#mc542aa3404" x="77" y="113.208974" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#mc771fb8b0f" x="77" y="113.208974" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_14"> <g id="text_14">
@ -1599,11 +1599,11 @@ L 690 113.208974
<g id="line2d_22"> <g id="line2d_22">
<path d="M 77 80.799802 <path d="M 77 80.799802
L 690 80.799802 L 690 80.799802
" clip-path="url(#p8c34e9879c)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/> " clip-path="url(#pb8819c8324)" style="fill: none; stroke: #b0b0b0; stroke-width: 0.2; stroke-linecap: square"/>
</g> </g>
<g id="line2d_23"> <g id="line2d_23">
<g> <g>
<use xlink:href="#mc542aa3404" x="77" y="80.799802" style="stroke: #000000; stroke-width: 0.8"/> <use xlink:href="#mc771fb8b0f" x="77" y="80.799802" style="stroke: #000000; stroke-width: 0.8"/>
</g> </g>
</g> </g>
<g id="text_15"> <g id="text_15">
@ -1780,7 +1780,7 @@ L 170.425134 307.664
L 170.425134 171.545481 L 170.425134 171.545481
L 104.863636 171.545481 L 104.863636 171.545481
z z
" clip-path="url(#p8c34e9879c)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#pb8819c8324)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_8"> <g id="patch_8">
<path d="M 186.815508 307.664 <path d="M 186.815508 307.664
@ -1788,7 +1788,7 @@ L 252.377005 307.664
L 252.377005 170.249115 L 252.377005 170.249115
L 186.815508 170.249115 L 186.815508 170.249115
z z
" clip-path="url(#p8c34e9879c)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#pb8819c8324)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_9"> <g id="patch_9">
<path d="M 268.76738 307.664 <path d="M 268.76738 307.664
@ -1796,7 +1796,7 @@ L 334.328877 307.664
L 334.328877 145.618145 L 334.328877 145.618145
L 268.76738 145.618145 L 268.76738 145.618145
z z
" clip-path="url(#p8c34e9879c)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#pb8819c8324)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_10"> <g id="patch_10">
<path d="M 350.719251 307.664 <path d="M 350.719251 307.664
@ -1804,7 +1804,7 @@ L 416.280749 307.664
L 416.280749 128.765376 L 416.280749 128.765376
L 350.719251 128.765376 L 350.719251 128.765376
z z
" clip-path="url(#p8c34e9879c)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#pb8819c8324)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_11"> <g id="patch_11">
<path d="M 432.671123 307.664 <path d="M 432.671123 307.664
@ -1812,7 +1812,7 @@ L 498.23262 307.664
L 498.23262 127.469009 L 498.23262 127.469009
L 432.671123 127.469009 L 432.671123 127.469009
z z
" clip-path="url(#p8c34e9879c)" style="fill: #b3d1e6; opacity: 0.3"/> " clip-path="url(#pb8819c8324)" style="fill: #b3d1e6; opacity: 0.3"/>
</g> </g>
<g id="patch_12"> <g id="patch_12">
<path d="M 514.622995 307.664 <path d="M 514.622995 307.664
@ -1820,7 +1820,7 @@ L 580.184492 307.664
L 580.184492 87.281637 L 580.184492 87.281637
L 514.622995 87.281637 L 514.622995 87.281637
z z
" clip-path="url(#p8c34e9879c)" style="fill: #1a75c2; opacity: 0.9"/> " clip-path="url(#pb8819c8324)" style="fill: #1a75c2; opacity: 0.9"/>
</g> </g>
<g id="patch_13"> <g id="patch_13">
<path d="M 596.574866 307.664 <path d="M 596.574866 307.664
@ -1828,7 +1828,7 @@ L 662.136364 307.664
L 662.136364 62.650667 L 662.136364 62.650667
L 596.574866 62.650667 L 596.574866 62.650667
z z
" clip-path="url(#p8c34e9879c)" style="fill: #1a75c2; opacity: 0.9"/> " clip-path="url(#pb8819c8324)" style="fill: #1a75c2; opacity: 0.9"/>
</g> </g>
<g id="text_17"> <g id="text_17">
<!-- 10.5% --> <!-- 10.5% -->
@ -2212,8 +2212,8 @@ z
</g> </g>
</g> </g>
<g id="text_24"> <g id="text_24">
<!-- of 2294 --> <!-- of 2,294 -->
<g style="fill: #555555" transform="translate(117.627823 212.562778) scale(0.12 -0.12)"> <g style="fill: #555555" transform="translate(115.960948 212.562778) scale(0.12 -0.12)">
<defs> <defs>
<path id="Helvetica-66" d="M 553 3856 <path id="Helvetica-66" d="M 553 3856
Q 566 4206 675 4369 Q 566 4206 675 4369
@ -2236,51 +2236,69 @@ L 88 3331
L 553 3331 L 553 3331
L 553 3856 L 553 3856
z z
" transform="scale(0.015625)"/>
<path id="Helvetica-2c" d="M 531 -653
Q 747 -616 834 -350
Q 881 -209 881 -78
Q 881 -56 879 -39
Q 878 -22 872 0
L 531 0
L 531 681
L 1200 681
L 1200 50
Q 1200 -322 1050 -603
Q 900 -884 531 -950
L 531 -653
z
" transform="scale(0.015625)"/> " transform="scale(0.015625)"/>
</defs> </defs>
<use xlink:href="#Helvetica-6f"/> <use xlink:href="#Helvetica-6f"/>
<use xlink:href="#Helvetica-66" x="55.615234"/> <use xlink:href="#Helvetica-66" x="55.615234"/>
<use xlink:href="#Helvetica-20" x="83.398438"/> <use xlink:href="#Helvetica-20" x="83.398438"/>
<use xlink:href="#Helvetica-32" x="111.181641"/> <use xlink:href="#Helvetica-32" x="111.181641"/>
<use xlink:href="#Helvetica-32" x="166.796875"/> <use xlink:href="#Helvetica-2c" x="166.796875"/>
<use xlink:href="#Helvetica-39" x="222.412109"/> <use xlink:href="#Helvetica-32" x="194.580078"/>
<use xlink:href="#Helvetica-34" x="278.027344"/> <use xlink:href="#Helvetica-39" x="250.195312"/>
<use xlink:href="#Helvetica-34" x="305.810547"/>
</g> </g>
</g> </g>
<g id="text_25"> <g id="text_25">
<!-- of 2294 --> <!-- of 2,294 -->
<g style="fill: #555555" transform="translate(199.579694 211.266411) scale(0.12 -0.12)"> <g style="fill: #555555" transform="translate(197.912819 211.266411) scale(0.12 -0.12)">
<use xlink:href="#Helvetica-6f"/> <use xlink:href="#Helvetica-6f"/>
<use xlink:href="#Helvetica-66" x="55.615234"/> <use xlink:href="#Helvetica-66" x="55.615234"/>
<use xlink:href="#Helvetica-20" x="83.398438"/> <use xlink:href="#Helvetica-20" x="83.398438"/>
<use xlink:href="#Helvetica-32" x="111.181641"/> <use xlink:href="#Helvetica-32" x="111.181641"/>
<use xlink:href="#Helvetica-32" x="166.796875"/> <use xlink:href="#Helvetica-2c" x="166.796875"/>
<use xlink:href="#Helvetica-39" x="222.412109"/> <use xlink:href="#Helvetica-32" x="194.580078"/>
<use xlink:href="#Helvetica-34" x="278.027344"/> <use xlink:href="#Helvetica-39" x="250.195312"/>
<use xlink:href="#Helvetica-34" x="305.810547"/>
</g> </g>
</g> </g>
<g id="text_26"> <g id="text_26">
<!-- of 2294 --> <!-- of 2,294 -->
<g style="fill: #555555" transform="translate(281.531566 186.635441) scale(0.12 -0.12)"> <g style="fill: #555555" transform="translate(279.864691 186.635441) scale(0.12 -0.12)">
<use xlink:href="#Helvetica-6f"/> <use xlink:href="#Helvetica-6f"/>
<use xlink:href="#Helvetica-66" x="55.615234"/> <use xlink:href="#Helvetica-66" x="55.615234"/>
<use xlink:href="#Helvetica-20" x="83.398438"/> <use xlink:href="#Helvetica-20" x="83.398438"/>
<use xlink:href="#Helvetica-32" x="111.181641"/> <use xlink:href="#Helvetica-32" x="111.181641"/>
<use xlink:href="#Helvetica-32" x="166.796875"/> <use xlink:href="#Helvetica-2c" x="166.796875"/>
<use xlink:href="#Helvetica-39" x="222.412109"/> <use xlink:href="#Helvetica-32" x="194.580078"/>
<use xlink:href="#Helvetica-34" x="278.027344"/> <use xlink:href="#Helvetica-39" x="250.195312"/>
<use xlink:href="#Helvetica-34" x="305.810547"/>
</g> </g>
</g> </g>
<g id="text_27"> <g id="text_27">
<!-- of 2294 --> <!-- of 2,294 -->
<g style="fill: #555555" transform="translate(363.483437 169.782672) scale(0.12 -0.12)"> <g style="fill: #555555" transform="translate(361.816562 169.782672) scale(0.12 -0.12)">
<use xlink:href="#Helvetica-6f"/> <use xlink:href="#Helvetica-6f"/>
<use xlink:href="#Helvetica-66" x="55.615234"/> <use xlink:href="#Helvetica-66" x="55.615234"/>
<use xlink:href="#Helvetica-20" x="83.398438"/> <use xlink:href="#Helvetica-20" x="83.398438"/>
<use xlink:href="#Helvetica-32" x="111.181641"/> <use xlink:href="#Helvetica-32" x="111.181641"/>
<use xlink:href="#Helvetica-32" x="166.796875"/> <use xlink:href="#Helvetica-2c" x="166.796875"/>
<use xlink:href="#Helvetica-39" x="222.412109"/> <use xlink:href="#Helvetica-32" x="194.580078"/>
<use xlink:href="#Helvetica-34" x="278.027344"/> <use xlink:href="#Helvetica-39" x="250.195312"/>
<use xlink:href="#Helvetica-34" x="305.810547"/>
</g> </g>
</g> </g>
<g id="text_28"> <g id="text_28">
@ -2386,7 +2404,7 @@ z
</g> </g>
</g> </g>
<defs> <defs>
<clipPath id="p8c34e9879c"> <clipPath id="pb8819c8324">
<rect x="77" y="50.4" width="613" height="257.264"/> <rect x="77" y="50.4" width="613" height="257.264"/>
</clipPath> </clipPath>
</defs> </defs>


View file

@ -1,7 +1,7 @@
18.9% Aider|GPT-4o|& Opus|(570) 18.9% Aider|GPT-4o|& Opus|(570)
17.0% Aider|GPT-4o|(570) 17.0% Aider|GPT-4o|(570)
13.9% Devin|(570) 13.9% Devin|(570)
13.8% Amazon Q|Developer|Agent|(2294) 13.8% Amazon Q|Developer|Agent|(2,294)
12.5% SWE-|Agent|+ GPT-4|(2294) 12.5% SWE-|Agent|+ GPT-4|(2,294)
10.6% Auto|Code|Rover|(2294) 10.6% Auto|Code|Rover|(2,294)
10.5% SWE-|Agent|+ Opus|(2294) 10.5% SWE-|Agent|+ Opus|(2,294)
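
For readers who want to work with the results file touched by this hunk, here is a minimal sketch of reading one of its lines. It assumes, without confirmation from the commit, that `|` marks line breaks inside the chart label and that the trailing parenthesized number is the instance count; `parse_result` is a hypothetical helper, not part of aider.

```python
import re


def parse_result(line):
    """Split e.g. '13.8% Amazon Q|Developer|Agent|(2,294)' into its parts.

    Returns (score as float, label with '|' replaced by spaces, instance count).
    """
    match = re.fullmatch(r"([\d.]+)% (.+)\|\(([\d,]+)\)", line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    score, label, count = match.groups()
    name = " ".join(label.split("|"))
    # Accept counts with or without a thousands separator (2294 or 2,294).
    return float(score), name, int(count.replace(",", ""))


score, name, count = parse_result("13.8% Amazon Q|Developer|Agent|(2,294)")
print(score, name, f"{count:,}")
```

Stripping the comma before `int()` lets the same parser accept both the old `(2294)` and the new `(2,294)` spellings, and the `:,` format spec is one standard way to render the separator again.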