mirror of https://github.com/Aider-AI/aider.git, synced 2025-05-29 16:54:59 +00:00
copy
parent c550422168
commit aee94a0584
2 changed files with 36 additions and 6 deletions
@@ -274,3 +274,26 @@
   versions: 0.64.2.dev
   seconds_per_case: 110.0
   total_cost: 0.1763
+
+- dirname: 2024-11-24-15-00-50--qwen25-32b-or-deepinfra
+  test_cases: 133
+  model: "Deepinfra via OpenRouter: BF16"
+  edit_format: diff
+  commit_hash: c2f184f
+  pass_rate_1: 57.1
+  pass_rate_2: 69.9
+  percent_cases_well_formed: 89.5
+  error_outputs: 35
+  num_malformed_responses: 31
+  num_with_malformed_responses: 14
+  user_asks: 11
+  lazy_comments: 0
+  syntax_errors: 1
+  indentation_errors: 1
+  exhausted_context_windows: 4
+  test_timeouts: 1
+  command: aider --model openrouter/qwen/qwen-2.5-coder-32b-instruct
+  date: 2024-11-24
+  versions: 0.64.2.dev
+  seconds_per_case: 28.5
+  total_cost: 0.1390
@@ -18,11 +18,6 @@ can impact code editing skill.
 Heavily quantized models are often used by cloud API providers
 and local model servers like Ollama or MLX.
 
-<canvas id="quantChart" width="800" height="500" style="margin: 20px 0"></canvas>
-<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
-<script>
-{% include quant-chart.js %}
-</script>
 
 The graph above compares different versions of the Qwen 2.5 Coder 32B Instruct model,
 served both locally and from cloud providers.
@@ -34,11 +29,23 @@ served both locally and from cloud providers.
 - Other API providers.
 
 The best version of the model rivals GPT-4o, while the worst performer
-is more like GPT-4 Turbo level.
+is worse than GPT-3.5 Turbo.
 
+Hyperbolic via OpenRouter in particular is confusing.
+Their direct API produces excellent results, but the performance
+through OpenRouter is very poor.
+It's unclear why this is happening to just this provider.
+The other providers available through OpenRouter perform similarly
+when their API is accessed directly.
+
 {: .note }
 This article is being updated as additional benchmark runs complete.
 
+<canvas id="quantChart" width="800" height="600" style="margin: 20px 0"></canvas>
+<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
+<script>
+{% include quant-chart.js %}
+</script>
 
 <input type="text" id="quantSearchInput" placeholder="Search..." style="width: 100%; max-width: 800px; margin: 10px auto; padding: 8px; display: block; border: 1px solid #ddd; border-radius: 4px;">
 