2025-05-30 17:24:59 +00:00 · 2024-11-24 07:56:12 -08:00 · 2024-11-24 07:56:12 -08:00 · dc8761763d
commit dc8761763d
parent 4894914db1
2 changed files with 10 additions and 2 deletions
--- a/aider/website/_posts/2024-11-21-quantization.md
+++ b/aider/website/_posts/2024-11-21-quantization.md
@@ -10,6 +10,7 @@ nav_exclude: true
 {% endif %}

 # Quantization matters
+{: .no_toc }

 Open source models like Qwen 2.5 32B Instruct are performing very well on
 aider's code editing benchmark, rivaling closed source frontier models.
@@ -18,8 +19,7 @@ can impact code editing skill.
 Heavily quantized models are often used by cloud API providers
 and local model servers like Ollama or MLX.

-
-The graph above compares different versions of the Qwen 2.5 Coder 32B Instruct model,
+The graph and table below compare different versions of the Qwen 2.5 Coder 32B Instruct model,
 served both locally and from cloud providers.

 - The [HuggingFace BF16 weights](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) served via [glhf.chat](https://glhf.chat).
@@ -38,9 +38,17 @@ It's unclear why this is happening to just this provider.
 The other providers available through OpenRouter perform similarly
 when their API is accessed directly.

+### Sections
+{: .no_toc }
+
+- TOC
+{:toc}
+
 {: .note }
 This article is being updated as additional benchmark runs complete.

+## Benchmark results
+
 <canvas id="quantChart" width="800" height="600" style="margin: 20px 0"></canvas>
 <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
 <script>
--- a/aider/website/assets/quantization.jpg
+++ b/aider/website/assets/quantization.jpg