mirror of https://github.com/Aider-AI/aider.git, synced 2025-05-28 16:25:00 +00:00

commit 8eda09533d ("init"), parent 6c42ee4edf
2 changed files with 27 additions and 0 deletions
27	aider/website/_posts/2024-11-21-quantization.md (new file)
@@ -0,0 +1,27 @@
---
title: Quantization matters
excerpt: Open source LLMs are becoming very powerful, but pay attention to how you (or your provider) are quantizing the model. It strongly affects code editing skill.
highlight_image: /assets/quantization.jpg
draft: false
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Quantization matters

Open source models like Qwen 2.5 32B are performing very well on
aider's code editing benchmark, rivaling closed source frontier models.
But pay attention to how your model is being quantized, as it
can strongly impact code editing skill.
Heavily quantized models are often used by cloud API providers
and local model servers like ollama.
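The `q8_0` tag on one of the Ollama models below refers to 8-bit quantization. As a rough illustration only (not Ollama's actual q8_0 scheme, which quantizes weights in blocks with per-block scales), here is a minimal sketch of symmetric 8-bit round-trip quantization and the error it introduces:

```python
import numpy as np

def quantize_q8(weights):
    """Symmetric 8-bit quantization: map floats into the int8 range
    using a single shared scale factor."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32) * 0.02
q, scale = quantize_q8(w)
w_hat = dequantize(q, scale)
# Rounding error is bounded by half the scale step
err = np.abs(w - w_hat).max()
print(f"max abs round-trip error: {err:.6f}")
```

Lower-bit schemes (4-bit, 3-bit) shrink the int range and so enlarge this per-weight error, which is one intuition for why heavier quantization can degrade a model's code editing skill.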

The graph below compares 4 different versions of the Qwen 2.5 32B model,
served both locally and from cloud providers:

- Qwen2.5-Coder-32B-Instruct
- ollama/qwen2.5:32b
- ollama/qwen2.5:32b-instruct-q8_0
- openrouter/qwen/qwen-2.5-coder-32b-instruct
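For reference, names like these are what you would pass to aider's `--model` switch. A sketch, assuming a local Ollama server is running and an OpenRouter API key is already configured in the environment:

```shell
# Local Ollama server; Ollama's default model tags are typically 4-bit quantized
aider --model ollama/qwen2.5:32b

# Local Ollama server with an explicit 8-bit quantization tag
aider --model ollama/qwen2.5:32b-instruct-q8_0

# Hosted via OpenRouter; quantization depends on the upstream provider
aider --model openrouter/qwen/qwen-2.5-coder-32b-instruct
```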
|
BIN	aider/website/assets/quantization.jpg (new file, 83 KiB; binary file not shown)