mirror of https://github.com/Aider-AI/aider.git
synced 2025-05-29 08:44:59 +00:00
Merge pull request #2440 from ivanfioravanti/main
MLX 4bit and 8bit diff added
commit 80f5b60e1d
2 changed files with 38 additions and 17 deletions
@@ -70,25 +70,48 @@
 
 - dirname: 2024-11-22-17-53-35--qwen25-coder-32b-Instruct-4bit
   test_cases: 133
-  model: mlx-community/Qwen2.5-Coder-32B-Instruct-4bit (whole)
-  edit_format: whole
-  commit_hash: 0ccf04a-dirty
-  pass_rate_1: 57.1
-  pass_rate_2: 69.2
-  percent_cases_well_formed: 100.0
-  error_outputs: 70
-  num_malformed_responses: 0
-  num_with_malformed_responses: 0
-  user_asks: 0
+  model: mlx-community/Qwen2.5-Coder-32B-Instruct-4bit
+  edit_format: diff
+  commit_hash: a16dcab-dirty
+  pass_rate_1: 60.2
+  pass_rate_2: 72.2
+  percent_cases_well_formed: 88.7
+  error_outputs: 31
+  num_malformed_responses: 30
+  num_with_malformed_responses: 15
+  user_asks: 6
   lazy_comments: 0
   syntax_errors: 0
   indentation_errors: 0
-  exhausted_context_windows: 0
+  exhausted_context_windows: 1
   test_timeouts: 0
   command: aider --model openai/mlx-community/Qwen2.5-Coder-32B-Instruct-4bit
-  date: 2024-11-22
+  date: 2024-11-23
   versions: 0.64.2.dev
-  seconds_per_case: 173.7
+  seconds_per_case: 53.4
   total_cost: 0.0000
 
+- dirname: 2024-11-23-15-07-20--qwen25-coder-32b-Instruct-8bit
+  test_cases: 133
+  model: mlx-community/Qwen2.5-Coder-32B-Instruct-8bit
+  edit_format: diff
+  commit_hash: a16dcab-dirty
+  pass_rate_1: 59.4
+  pass_rate_2: 72.2
+  percent_cases_well_formed: 92.5
+  error_outputs: 20
+  num_malformed_responses: 15
+  num_with_malformed_responses: 10
+  user_asks: 7
+  lazy_comments: 0
+  syntax_errors: 0
+  indentation_errors: 0
+  exhausted_context_windows: 5
+  test_timeouts: 2
+  command: aider --model openai/mlx-community/Qwen2.5-Coder-32B-Instruct-8bit
+  date: 2024-11-23
+  versions: 0.64.2.dev
+  seconds_per_case: 98.4
+  total_cost: 0.0000
+
 - dirname: 2024-11-20-15-17-37--qwen25-32b-or-diff
@@ -16,7 +16,7 @@ aider's code editing benchmark, rivaling closed source frontier models.
 But pay attention to how your model is being quantized, as it
 can strongly impact code editing skill.
 Heavily quantized models are often used by cloud API providers
-and local model servers like Ollama.
+and local model servers like Ollama or MLX.
 
 <canvas id="quantChart" width="800" height="600" style="margin: 20px 0"></canvas>
 <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
@@ -30,8 +30,6 @@ served both locally and from cloud providers.
 - The [HuggingFace BF16 weights](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) served via [glhf.chat](https://glhf.chat).
 - Hyperbolic labs API for [qwen2-5-coder-32b-instruct](https://app.hyperbolic.xyz/models/qwen2-5-coder-32b-instruct), which is using BF16. This result is probably within the expected variance of the HF result.
 - A [4bit quant for mlx](https://t.co/cwX3DYX35D).
-This is the only model which was benchmarked using the "whole" [edit format](https://aider.chat/docs/more/edit-formats.html).
-The rest were benchmarked with the much more practical and challenging "diff"edit format.
 - The results from [OpenRouter's mix of providers](https://openrouter.ai/qwen/qwen-2.5-coder-32b-instruct/providers) which serve the model with different levels of quantization.
 - Ollama locally serving [qwen2.5-coder:32b-instruct-q4_K_M)](https://ollama.com/library/qwen2.5-coder:32b-instruct-q4_K_M), which has `Q4_K_M` quantization, with Ollama's default 2k context window.
 