Paul Gauthier 2024-12-22 08:59:38 -05:00
parent c895e99306
commit 0e05b64ebc
5 changed files with 36 additions and 2 deletions


@@ -2,6 +2,7 @@
### main branch
- Support for o1 models.
- Support for running without git installed.
- Show hints about AI! and AI? when user makes AI comments.
- Ask 10% of users to opt-in to analytics.


@@ -665,6 +665,13 @@ MODEL_SETTINGS = [
examples_as_sys_msg=True,
reminder="sys",
),
ModelSettings(
"openrouter/deepseek/deepseek-chat",
"diff",
use_repo_map=True,
examples_as_sys_msg=True,
reminder="sys",
),
ModelSettings(
"openrouter/openai/gpt-4o",
"diff",
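A minimal sketch of how an entry like the new `openrouter/deepseek/deepseek-chat` one could be looked up by model name. The `ModelSettings` class here is a simplified stand-in that carries only the fields visible in this hunk; its default values and the `find_settings` helper are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelSettings:
    # Simplified stand-in: only the fields that appear in the hunk above.
    # The default values are assumptions for this sketch.
    name: str
    edit_format: str = "whole"
    use_repo_map: bool = False
    examples_as_sys_msg: bool = False
    reminder: str = "user"


# One-entry table mirroring the settings added in this commit.
MODEL_SETTINGS = [
    ModelSettings(
        "openrouter/deepseek/deepseek-chat",
        "diff",
        use_repo_map=True,
        examples_as_sys_msg=True,
        reminder="sys",
    ),
]


def find_settings(model_name: str) -> Optional[ModelSettings]:
    """Hypothetical helper: return the entry whose name matches exactly."""
    for settings in MODEL_SETTINGS:
        if settings.name == model_name:
            return settings
    return None
```

With this table, `find_settings("openrouter/deepseek/deepseek-chat")` returns the entry that uses the `diff` edit format, and an unknown name returns `None`.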


@@ -152,4 +152,30 @@
date: 2024-12-21
versions: 0.69.2.dev
seconds_per_case: 31.8
-total_cost: 6.0583
+total_cost: 6.0583
- dirname: 2024-12-22-13-22-32--polyglot-qwen-diff
test_cases: 225
model: Qwen2.5-Coder-32B-Instruct
edit_format: diff
commit_hash: 6d7e8be-dirty
pass_rate_1: 4.4
pass_rate_2: 8.0
pass_num_1: 10
pass_num_2: 18
percent_cases_well_formed: 71.6
error_outputs: 158
num_malformed_responses: 148
num_with_malformed_responses: 64
user_asks: 132
lazy_comments: 0
syntax_errors: 0
indentation_errors: 0
exhausted_context_windows: 1
test_timeouts: 2
total_tests: 225
command: "aider --model openai/Qwen/Qwen2.5-Coder-32B-Instruct # via hyperbolic"
date: 2024-12-22
versions: 0.69.2.dev
seconds_per_case: 84.4
total_cost: 0.0000
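The percentage fields in the new Qwen2.5-Coder-32B-Instruct entry appear consistent with its raw counts. The sketch below recomputes them, assuming each rate is a count divided by `test_cases` and rounded to one decimal, and that `percent_cases_well_formed` counts the cases without any malformed response. Both assumptions are inferred from the numbers above, not taken from the benchmark's code, and `pass_rate` is a hypothetical helper.

```python
def pass_rate(count: int, total: int) -> float:
    """Percentage of `count` out of `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


test_cases = 225
pass_num_1 = 10
pass_num_2 = 18
num_with_malformed_responses = 64

print(pass_rate(pass_num_1, test_cases))  # pass_rate_1
print(pass_rate(pass_num_2, test_cases))  # pass_rate_2
# Assumed: well-formed cases = all cases minus those with a malformed response.
print(pass_rate(test_cases - num_with_malformed_responses, test_cases))
```

Under these assumptions the three printed values match `pass_rate_1: 4.4`, `pass_rate_2: 8.0`, and `percent_cases_well_formed: 71.6`.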


@@ -152,7 +152,7 @@ how long it will take for this new benchmark to saturate.
## Benchmark problems
The 225 coding problems are available in the
-[aider polyglot benchmark repo]()
+[aider polyglot benchmark repo](https://github.com/Aider-AI/polyglot-benchmark)
on GitHub.

Binary file not shown. Size after this commit: 157 KiB