Paul Gauthier 2025-01-24 08:47:25 -08:00
parent ddb02adbb4
commit ee5d72301a
3 changed files with 40 additions and 6 deletions

@ -841,7 +841,7 @@ MODEL_SETTINGS = [
use_repo_map=True,
streaming=False,
use_temperature=False,
extra_params=dict(extra_body=dict(reasoning_effort="high")),
# extra_params=dict(extra_body=dict(reasoning_effort="high")),
),
ModelSettings(
"openrouter/qwen/qwen-2.5-coder-32b-instruct",

@ -23,7 +23,7 @@
exhausted_context_windows: 0
test_timeouts: 5
total_tests: 225
command: aider --model deepseek/deepseek-reasoner
command: aider --architect --model r1 --editor-model sonnet
date: 2025-01-23
versions: 0.72.3.dev
seconds_per_case: 251.6
@ -49,7 +49,7 @@
exhausted_context_windows: 1
test_timeouts: 5
total_tests: 225
command: aider --model deepseek/deepseek-reasoner
command: aider --model r1
date: 2025-01-20
versions: 0.71.2.dev
seconds_per_case: 113.7
@ -76,7 +76,7 @@
exhausted_context_windows: 0
test_timeouts: 2
total_tests: 225
command: aider --model openrouter/openai/o1
command: aider --model o1
date: 2024-12-21
versions: 0.69.2.dev
seconds_per_case: 133.2
@ -103,7 +103,7 @@
exhausted_context_windows: 0
test_timeouts: 8
total_tests: 225
command: aider --model deepseek/deepseek-chat
command: aider --model deepseek
date: 2024-12-25
versions: 0.69.2.dev
seconds_per_case: 34.8
@ -131,7 +131,7 @@
exhausted_context_windows: 1
test_timeouts: 8
total_tests: 225
command: aider --model claude-3-5-sonnet-20241022
command: aider --model sonnet
date: 2025-01-17
versions: 0.71.2.dev
seconds_per_case: 21.4

@ -29,6 +29,40 @@ This is in contrast to the first wave of thinking models like o1-preview and o1-
which improved when paired with many different editor models.
## Try it
Once you [install aider](https://aider.chat/docs/install.html),
you can use aider, R1 and Sonnet like this:
```bash
export DEEPSEEK_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>
aider --architect --model r1 --editor-model sonnet
```
Or if you have an [OpenRouter](https://openrouter.ai) account:
```bash
export OPENROUTER_API_KEY=<your-key>
aider --architect --model openrouter/deepseek/deepseek-r1 --editor-model openrouter/anthropic/claude-3.5-sonnet
```
## Thinking output
There has been
[some recent discussion](https://github.com/Aider-AI/aider/pull/2973)
about extracting the `<think>` tokens from R1's responses
and feeding them to Sonnet.
That was an interesting experiment, for sure.
To be clear, the results above do *not* use R1's thinking tokens, only its normal
final output.
R1 is configured in aider's standard architect role with Sonnet as editor.
The benchmark results that used the thinking tokens appear to be worse than
the architect/editor results shared here.
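To make the distinction concrete, here is a minimal sketch of separating a `<think>` block from the final output of a response.
It assumes the response arrives as a single string with the thinking wrapped inline in `<think>...</think>` tags, which may not match how every provider returns it.
The `split_thinking` helper is hypothetical and is not part of aider.
```python
import re

# Hypothetical helper, not part of aider.
# Assumes thinking is wrapped inline in <think>...</think> tags.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(response: str) -> tuple[str, str]:
    """Return (thinking, final_output) from a raw response string."""
    # Collect the text of every <think> block, in order of appearance.
    thinking = "\n".join(m.strip() for m in THINK_RE.findall(response))
    # Remove the <think> blocks to leave only the final output.
    final_output = THINK_RE.sub("", response).strip()
    return thinking, final_output


raw = "<think>Plan: rename the function.</think>\nRename `foo` to `bar` in utils.py."
thinking, final_output = split_thinking(raw)
print(final_output)
```
In the results reported here, only something like `final_output` would reach the editor model; the experiment discussed above would also pass along `thinking`.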
## Results
<table style="width: 100%; max-width: 800px; margin: auto; border-collapse: collapse; box-shadow: 0 2px 4px rgba(0,0,0,0.1); font-size: 14px;">