mirror of
https://github.com/Aider-AI/aider.git
synced 2025-05-31 01:35:00 +00:00
fix ollama models included in quant blog
parent dbd7f51f5c
commit ebba8f5110
2 changed files with 5 additions and 29 deletions
@@ -67,26 +67,3 @@
   versions: 0.64.2.dev
   seconds_per_case: 86.7
   total_cost: 0.0000
-
-- dirname: 2024-11-22-03-33-30--ollama-qwen25-coder-krith-instruct
-  test_cases: 133
-  model: ollama/krith/qwen2.5-coder-32b-instruct:IQ2_M
-  edit_format: diff
-  commit_hash: fbadfcf-dirty
-  pass_rate_1: 16.5
-  pass_rate_2: 21.1
-  percent_cases_well_formed: 60.9
-  error_outputs: 1169
-  num_malformed_responses: 148
-  num_with_malformed_responses: 52
-  user_asks: 58
-  lazy_comments: 0
-  syntax_errors: 3
-  indentation_errors: 1
-  exhausted_context_windows: 0
-  test_timeouts: 4
-  command: aider --model ollama/krith/qwen2.5-coder-32b-instruct:IQ2_M
-  date: 2024-11-22
-  versions: 0.64.2.dev
-  seconds_per_case: 169.7
-  total_cost: 0.00
@@ -11,7 +11,7 @@ nav_exclude: true
 
 # Quantization matters
 
-Open source models like Qwen 2.5 32B are performing very well on
+Open source models like Qwen 2.5 32B Instruct are performing very well on
 aider's code editing benchmark, rivaling closed source frontier models.
 But pay attention to how your model is being quantized, as it
 can strongly impact code editing skill.
@@ -24,16 +24,15 @@ and local model servers like Ollama.
 {% include quant-chart.js %}
 </script>
 
-The graph above compares 3 different versions of the Qwen 2.5 Coder 32B model,
+The graph above compares 3 different versions of the Qwen 2.5 Coder 32B Instruct model,
 served both locally and from cloud providers.
 
 - The [HuggingFace weights](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) served via [glhf.chat](https://glhf.chat).
 - The results from [OpenRouter's mix of providers](https://openrouter.ai/qwen/qwen-2.5-coder-32b-instruct/providers) which serve the model with different levels of quantization.
 - Ollama locally serving [qwen2.5-coder:32b-instruct-q4_K_M)](https://ollama.com/library/qwen2.5-coder:32b-instruct-q4_K_M), which has `Q4_K_M` quantization.
-- Ollama locally serving [krith/qwen2.5-coder-32b-instruct:IQ2_M](https://ollama.com/krith/qwen2.5-coder-32b-instruct), which has IQ2_M quantization.
 
-The best version of the model rivals GPT-4o, while the worst performers
-are more like GPT-3.5 Turbo level to completely useless.
+The best version of the model rivals GPT-4o, while the worst performer
+is more like GPT-3.5 Turbo level.
 
 ## Choosing providers with OpenRouter
 
@@ -44,4 +43,4 @@ undesirable providers.
 
 {: .note }
 The original version of this article included incorrect Ollama models
-that were not Qwen 2.5 Coder 32B.
+that were not Qwen 2.5 Coder 32B Instruct.