mirror of
https://github.com/Aider-AI/aider.git
synced 2025-06-01 18:25:00 +00:00
cleanup
parent c1dc473ed8
commit 8e84b5c0b1
2 changed files with 4 additions and 30 deletions
@@ -1225,29 +1225,3 @@
   seconds_per_case: 50.1
   total_cost: 1.8451
 
-- dirname: 2025-05-06-21-34-36--gemini0506-diff-fenced
-  test_cases: 225
-  model: gemini/gemini-2.5-pro-preview-05-06
-  edit_format: diff-fenced
-  commit_hash: 8159cbf-dirty
-  pass_rate_1: 37.8
-  pass_rate_2: 75.6
-  pass_num_1: 85
-  pass_num_2: 170
-  percent_cases_well_formed: 95.1
-  error_outputs: 11
-  num_malformed_responses: 11
-  num_with_malformed_responses: 11
-  user_asks: 139
-  lazy_comments: 0
-  syntax_errors: 0
-  indentation_errors: 0
-  exhausted_context_windows: 0
-  test_timeouts: 5
-  total_tests: 225
-  command: aider --model gemini/gemini-2.5-pro-preview-05-06
-  date: 2025-05-06
-  versions: 0.82.4.dev
-  seconds_per_case: 158.8
-  total_cost: 41.1744
-
@@ -8,7 +8,7 @@ nav_exclude: true
 <p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
 {% endif %}
 
-# Gemini 2.5 Pro Preview 03-25 benchmark pricing
+# Gemini 2.5 Pro Preview 03-25 benchmark cost
 
 The $6.32 cost reported to run the aider polyglot benchmark on
 Gemini 2.5 Pro Preview 03-25 was incorrect.
@@ -21,7 +21,7 @@ aider uses to connect to LLM APIs:
 
 - The litellm model database had an incorrect price-per-token for Gemini 2.5 Pro Preview 03-25 in their costs database.
 This does not appear to be a contributing factor to the incorrect benchmark cost.
-- The litellm package was incorrectly excluding reasoning tokens from the token counts it reported to aider. This appears to be the cause of the incorrect benchmark cost.
+- The litellm package was excluding reasoning tokens from the token counts it reported to aider. This appears to be the cause of the incorrect benchmark cost.
 
 The incorrect litellm database entry does not appear to have affected the aider benchmark costs.
 Aider maintains and uses its own database of costs for some models, and it contained
@@ -44,7 +44,7 @@ in commit [9351f37](https://github.com/Aider-AI/aider/commit/9351f37).
 That dependency change shipped on May 5, 2025 in aider v0.82.3.
 
 The incorrect cost has been removed from the leaderboard.
-Unfortunately, the 03-25 version of Gemini 2.5 Pro Preview is no longer available,
+Unfortunately the 03-25 version of Gemini 2.5 Pro Preview is no longer available,
 so it is not possible to re-run the benchmark to obtain an accurate cost.
 
 As a possibly relevant comparison, the newer 05-06 version of Gemini 2.5 Pro Preview
@@ -67,7 +67,7 @@ model cost database appears not to have been a factor:
 - Updating aider's local model database with an absurdly high token cost resulted in an appropriately high benchmark cost report, demonstrating that the local database costs were in effect.
 
 This specific build of aider was then updated with various versions of litellm using `git biset`
-to identify the first litellm commit where correct tokens counts were returned.
+to identify the first litellm commit where reasoning tokens counts were reported.
 
 
 