Mirror of https://github.com/Aider-AI/aider.git, synced 2025-05-30 17:24:59 +00:00
Article updated to reflect change in mlx test
This commit is contained in:
parent 3a331e55dc
commit 100744a952
1 changed file with 2 additions and 4 deletions
@@ -16,7 +16,7 @@ aider's code editing benchmark, rivaling closed source frontier models.
 But pay attention to how your model is being quantized, as it
 can strongly impact code editing skill.
 Heavily quantized models are often used by cloud API providers
-and local model servers like Ollama.
+and local model servers like Ollama or MLX.
 
 <canvas id="quantChart" width="800" height="600" style="margin: 20px 0"></canvas>
 <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
@@ -29,9 +29,7 @@ served both locally and from cloud providers.
 
 - The [HuggingFace BF16 weights](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) served via [glhf.chat](https://glhf.chat).
 - Hyperbolic labs API for [qwen2-5-coder-32b-instruct](https://app.hyperbolic.xyz/models/qwen2-5-coder-32b-instruct), which is using BF16. This result is probably within the expected variance of the HF result.
-- A [4bit quant for mlx](https://t.co/cwX3DYX35D).
-This is the only model which was benchmarked using the "whole" [edit format](https://aider.chat/docs/more/edit-formats.html).
-The rest were benchmarked with the much more practical and challenging "diff" edit format.
+- A [4bit quant for mlx](https://t.co/cwX3DYX35D).
 - The results from [OpenRouter's mix of providers](https://openrouter.ai/qwen/qwen-2.5-coder-32b-instruct/providers) which serve the model with different levels of quantization.
 - Ollama locally serving [qwen2.5-coder:32b-instruct-q4_K_M)](https://ollama.com/library/qwen2.5-coder:32b-instruct-q4_K_M), which has `Q4_K_M` quantization, with Ollama's default 2k context window.
 
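The diff above contrasts BF16 serving with a `Q4_K_M` quant of the same 32B model. For rough context on why quantization level matters so much in practice, here is back-of-envelope arithmetic for raw weight memory (illustrative only: real checkpoint files add overhead, and the ~4.85 bits/weight figure for `Q4_K_M` is an approximate, commonly cited value, not from this commit):

```python
# Back-of-envelope weight-memory arithmetic for the precisions the diff
# compares. Assumed figure: Q4_K_M averages roughly 4.85 bits per weight.
def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Raw weight storage in GiB at a given precision (no file overhead)."""
    return n_params * bits_per_weight / 8 / 2**30

n = 32e9  # 32B parameters
print(f"BF16:   {weights_gib(n, 16):.1f} GiB")    # ~59.6 GiB
print(f"Q4_K_M: {weights_gib(n, 4.85):.1f} GiB")  # ~18.1 GiB
```

The roughly 3x memory reduction is why heavily quantized variants dominate local serving, even when they measurably hurt code editing skill.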
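The last bullet in the diff notes that the Ollama result was measured with Ollama's default 2k context window. As a sketch of how one could raise that limit with a custom Modelfile (the model tag comes from the article; the 8192 value and the `qwen2.5-coder-8k` name are assumptions for illustration):

```shell
# Write a Modelfile that overrides Ollama's default 2k context window
# for the quant benchmarked in the article.
cat > Modelfile <<'EOF'
FROM qwen2.5-coder:32b-instruct-q4_K_M
PARAMETER num_ctx 8192
EOF
# Then build and run the derived model (requires a running Ollama daemon):
#   ollama create qwen2.5-coder-8k -f Modelfile
#   ollama run qwen2.5-coder-8k
```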