Mirror of https://github.com/Aider-AI/aider.git (synced 2025-05-30 01:04:59 +00:00)
copy
parent ddb02adbb4
commit ee5d72301a
3 changed files with 40 additions and 6 deletions
@@ -841,7 +841,7 @@ MODEL_SETTINGS = [
 use_repo_map=True,
 streaming=False,
 use_temperature=False,
-extra_params=dict(extra_body=dict(reasoning_effort="high")),
+# extra_params=dict(extra_body=dict(reasoning_effort="high")),
 ),
 ModelSettings(
 "openrouter/qwen/qwen-2.5-coder-32b-instruct",

@@ -23,7 +23,7 @@
 exhausted_context_windows: 0
 test_timeouts: 5
 total_tests: 225
-command: aider --model deepseek/deepseek-reasoner
+command: aider --architect --model r1 --editor-model sonnet
 date: 2025-01-23
 versions: 0.72.3.dev
 seconds_per_case: 251.6
@@ -49,7 +49,7 @@
 exhausted_context_windows: 1
 test_timeouts: 5
 total_tests: 225
-command: aider --model deepseek/deepseek-reasoner
+command: aider --model r1
 date: 2025-01-20
 versions: 0.71.2.dev
 seconds_per_case: 113.7
@@ -76,7 +76,7 @@
 exhausted_context_windows: 0
 test_timeouts: 2
 total_tests: 225
-command: aider --model openrouter/openai/o1
+command: aider --model o1
 date: 2024-12-21
 versions: 0.69.2.dev
 seconds_per_case: 133.2
@@ -103,7 +103,7 @@
 exhausted_context_windows: 0
 test_timeouts: 8
 total_tests: 225
-command: aider --model deepseek/deepseek-chat
+command: aider --model deepseek
 date: 2024-12-25
 versions: 0.69.2.dev
 seconds_per_case: 34.8
@@ -131,7 +131,7 @@
 exhausted_context_windows: 1
 test_timeouts: 8
 total_tests: 225
-command: aider --model claude-3-5-sonnet-20241022
+command: aider --model sonnet
 date: 2025-01-17
 versions: 0.71.2.dev
 seconds_per_case: 21.4

@@ -29,6 +29,40 @@ This is in contrast to the first wave of thinking models like o1-preview and o1-
 which improved when paired with many different editor models.
 
+
+## Try it
+
+Once you [install aider](https://aider.chat/docs/install.html),
+you can use aider, R1 and Sonnet like this:
+
+```bash
+export DEEPSEEK_API_KEY=<your-key>
+export ANTHROPIC_API_KEY=<your-key>
+
+aider --architect --model r1 --editor-model sonnet
+```
+
+Or if you have an [OpenRouter](https://openrouter.ai) account:
+
+```bash
+export OPENROUTER_API_KEY=<your-key>
+
+aider --architect --model openrouter/deepseek/deepseek-r1 --editor-model openrouter/anthropic/claude-3.5-sonnet
+```
+
+## Thinking output
+
+There has been
+[some recent discussion](https://github.com/Aider-AI/aider/pull/2973)
+about extracting the `<think>` tokens from R1's responses
+and feeding them to Sonnet.
+That was an interesting experiment, for sure.
+
+To be clear, the results above are *not* using R1's thinking tokens, just the normal
+final output.
+R1 is configured in aider's standard architect role with Sonnet as editor.
+The benchmark results that used the thinking tokens appear to be worse than
+the architect/editor results shared here.
 
 ## Results
 
 <table style="width: 100%; max-width: 800px; margin: auto; border-collapse: collapse; box-shadow: 0 2px 4px rgba(0,0,0,0.1); font-size: 14px;">
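
The "Thinking output" text added in the last hunk distinguishes R1's `<think>` tokens from its normal final output. The sketch below shows one way that separation might look. It assumes the raw response is a single string with the reasoning wrapped in `<think>...</think>` tags. The function name `split_thinking` and the sample text are hypothetical, and this is not aider's implementation.

```python
import re

# Assumption: reasoning appears inline, wrapped in <think>...</think> tags.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(response: str) -> tuple[str, str]:
    """Return (thinking, final_output) from a raw model response.

    Hypothetical helper for illustration only.
    """
    # Collect the text inside every <think> block.
    thinking = "\n".join(m.strip() for m in THINK_RE.findall(response))
    # Whatever remains after removing those blocks is the final output.
    final_output = THINK_RE.sub("", response).strip()
    return thinking, final_output


raw = "<think>Plan: rename the helper.</think>\nRename `foo` to `bar` in utils.py."
thinking, final_output = split_thinking(raw)
print(final_output)
```

In the architect/editor setup described above, only something like `final_output` would be passed to the editor model. The experiment under discussion passed the `thinking` part instead.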