mirror of
https://github.com/Aider-AI/aider.git
synced 2025-06-06 04:35:00 +00:00
copy
This commit is contained in:
parent
d5ea078f24
commit
acd7309b78
2 changed files with 13 additions and 10 deletions
|
@ -1,6 +1,6 @@
|
||||||
- dirname: 2025-05-08-03-20-24--qwen3-32b-default
|
- dirname: 2025-05-08-03-20-24--qwen3-32b-default
|
||||||
test_cases: 225
|
test_cases: 225
|
||||||
model: Qwen3 32B on OpenRouter, all providers, default settings (thinking)
|
model: Qwen3 32B diff on OpenRouter, all providers, default settings (thinking)
|
||||||
edit_format: diff
|
edit_format: diff
|
||||||
commit_hash: aaacee5-dirty, aeaf259
|
commit_hash: aaacee5-dirty, aeaf259
|
||||||
pass_rate_1: 14.2
|
pass_rate_1: 14.2
|
||||||
|
@ -28,7 +28,7 @@
|
||||||
|
|
||||||
- dirname: 2025-05-08-03-22-37--qwen3-235b-defaults
|
- dirname: 2025-05-08-03-22-37--qwen3-235b-defaults
|
||||||
test_cases: 225
|
test_cases: 225
|
||||||
model: Qwen3 235B A22B on OpenRouter, all providers, default settings (thinking)
|
model: Qwen3 235B A22B diff on OpenRouter, all providers, default settings (thinking)
|
||||||
edit_format: diff
|
edit_format: diff
|
||||||
commit_hash: aaacee5-dirty
|
commit_hash: aaacee5-dirty
|
||||||
pass_rate_1: 17.3
|
pass_rate_1: 17.3
|
||||||
|
@ -57,7 +57,7 @@
|
||||||
|
|
||||||
- dirname: 2025-05-08-17-39-14--qwen3-235b-or-together-only
|
- dirname: 2025-05-08-17-39-14--qwen3-235b-or-together-only
|
||||||
test_cases: 225
|
test_cases: 225
|
||||||
model: Qwen3 235B A22B on OpenRouter only TogetherAI, recommended /no_think settings
|
model: Qwen3 235B A22B diff on OpenRouter only TogetherAI, recommended /no_think settings
|
||||||
edit_format: diff
|
edit_format: diff
|
||||||
commit_hash: 328584e
|
commit_hash: 328584e
|
||||||
pass_rate_1: 28.0
|
pass_rate_1: 28.0
|
||||||
|
@ -86,7 +86,7 @@
|
||||||
|
|
||||||
- dirname: 2025-04-30-04-49-37--Qwen3-235B-A22B-whole-nothink
|
- dirname: 2025-04-30-04-49-37--Qwen3-235B-A22B-whole-nothink
|
||||||
test_cases: 225
|
test_cases: 225
|
||||||
model: Qwen3-235B-A22B with VLLM, bfloat16, recommended /no_think settings
|
model: Qwen3-235B-A22B whole with VLLM, bfloat16, recommended /no_think settings
|
||||||
edit_format: whole
|
edit_format: whole
|
||||||
commit_hash: 0c383df-dirty
|
commit_hash: 0c383df-dirty
|
||||||
pass_rate_1: 28.0
|
pass_rate_1: 28.0
|
||||||
|
@ -112,7 +112,7 @@
|
||||||
|
|
||||||
- dirname: 2025-04-30-04-49-50--Qwen3-235B-A22B-diff-nothink
|
- dirname: 2025-04-30-04-49-50--Qwen3-235B-A22B-diff-nothink
|
||||||
test_cases: 225
|
test_cases: 225
|
||||||
model: Qwen3-235B-A22B with VLLM, bfloat16, recommended /no_think settings
|
model: Qwen3-235B-A22B diff with VLLM, bfloat16, recommended /no_think settings
|
||||||
edit_format: diff
|
edit_format: diff
|
||||||
commit_hash: 0c383df-dirty
|
commit_hash: 0c383df-dirty
|
||||||
pass_rate_1: 29.8
|
pass_rate_1: 29.8
|
||||||
|
@ -138,7 +138,7 @@
|
||||||
|
|
||||||
- dirname: 2025-04-30-04-08-41--Qwen3-32B-whole-nothink
|
- dirname: 2025-04-30-04-08-41--Qwen3-32B-whole-nothink
|
||||||
test_cases: 225
|
test_cases: 225
|
||||||
model: Qwen3-32B with VLLM, bfloat16, recommended /no_think settings
|
model: Qwen3-32B whole with VLLM, bfloat16, recommended /no_think settings
|
||||||
edit_format: whole
|
edit_format: whole
|
||||||
commit_hash: 0c383df-dirty
|
commit_hash: 0c383df-dirty
|
||||||
pass_rate_1: 20.4
|
pass_rate_1: 20.4
|
||||||
|
@ -164,7 +164,7 @@
|
||||||
|
|
||||||
- dirname: 2025-04-30-04-08-51--Qwen3-32B-diff-nothink
|
- dirname: 2025-04-30-04-08-51--Qwen3-32B-diff-nothink
|
||||||
test_cases: 225
|
test_cases: 225
|
||||||
model: Qwen3-32B with VLLM, bfloat16, recommended /no_think settings
|
model: Qwen3-32B diff with VLLM, bfloat16, recommended /no_think settings
|
||||||
edit_format: diff
|
edit_format: diff
|
||||||
commit_hash: 0c383df-dirty
|
commit_hash: 0c383df-dirty
|
||||||
pass_rate_1: 20.4
|
pass_rate_1: 20.4
|
||||||
|
|
|
@ -14,10 +14,13 @@ Open source models are wonderful because anyone can serve them,
|
||||||
but API providers can use very different inference settings, quantizations, etc.
|
but API providers can use very different inference settings, quantizations, etc.
|
||||||
|
|
||||||
Below are collection of aider polyglot benchmark results for the new Qwen3 models.
|
Below are collection of aider polyglot benchmark results for the new Qwen3 models.
|
||||||
Results are presented with various settings against various API providers,
|
Results are presented using both "diff" and "whole"
|
||||||
with the hope of showcasing the strengths of these models and their providers.
|
[edit formats](https://aider.chat/docs/more/edit-formats.html),
|
||||||
|
with various models settings, against various API providers.
|
||||||
|
|
||||||
See details for configuring Qwen3 after the results table.
|
See details on the
|
||||||
|
[model settings](https://aider.chat/docs/config/adv-model-settings.html#model-settings)
|
||||||
|
used after the results table.
|
||||||
|
|
||||||
{: .note }
|
{: .note }
|
||||||
This article is being updated as new results become available.
|
This article is being updated as new results become available.
|
||||||
|
|
Loading…
Add table
Add a link
Reference in a new issue