531 commits

Author SHA1 Message Date
Paul Gauthier (aider)
c189a52e5e style: Organize imports and apply linter formatting 2024-11-21 14:00:24 -08:00
Paul Gauthier (aider)
6d6d763dd3 refactor: Restructure benchmark plotting script for improved maintainability 2024-11-21 14:00:20 -08:00
Paul Gauthier
1f0d26e8c7 better over time plot 2024-11-20 20:19:44 -08:00
Paul Gauthier
8302e9d0dd improved over time plot 2024-11-20 20:16:35 -08:00
Paul Gauthier (aider)
c797af020a refactor: Update fontsize to use LABEL_FONT_SIZE constant in over_time.py 2024-11-20 20:13:46 -08:00
Paul Gauthier (aider)
1c85afa320 feat: Add LABEL_FONT_SIZE constant for dot label font size 2024-11-20 20:13:33 -08:00
Paul Gauthier
eb5317f8e5 fix: Adjust annotation vertical offset for brown color in over_time plot 2024-11-20 20:13:30 -08:00
Paul Gauthier (aider)
8b860615b8 style: Increase font size for scatter plot dot labels 2024-11-20 20:10:40 -08:00
Paul Gauthier (aider)
c15ac341e2 refactor: Remove Opus and Llama model variants from legend labels 2024-11-20 20:07:52 -08:00
Paul Gauthier (aider)
c2c7ee1047 feat: Change Opus label to "Opus" in legend 2024-11-20 20:06:48 -08:00
Paul Gauthier (aider)
72c46ccec6 feat: Add labels for Claude 3 Opus, Sonnet, and O1 Preview models 2024-11-20 20:06:04 -08:00
Paul Gauthier (aider)
dd3bfaee01 style: Format code with consistent indentation and line breaks 2024-11-20 20:05:24 -08:00
Paul Gauthier (aider)
03206ad90e feat: Add line labels directly on first points instead of using legend 2024-11-20 20:05:18 -08:00
Paul Gauthier (aider)
2e00307190 feat: Add color and legend label for o1-preview models 2024-11-20 20:03:49 -08:00
Paul Gauthier (aider)
b3e29ab20e style: Apply linter formatting to benchmark code 2024-11-20 20:02:52 -08:00
Paul Gauthier (aider)
5504ac535b feat: Add simplified model names for legend labels 2024-11-20 20:02:48 -08:00
Paul Gauthier (aider)
4b3dd7f4ea style: Apply linter formatting to over_time.py 2024-11-20 19:59:43 -08:00
Paul Gauthier (aider)
8edf9540d5 feat: Add legend to plot and remove point labels 2024-11-20 19:59:38 -08:00
Paul Gauthier
1c62ecd1b5 style: Adjust x-axis label rotation angle for better readability 2024-11-20 19:59:36 -08:00
Paul Gauthier
7cf3d9f3ce style: Increase annotation font size in benchmark plot 2024-11-20 19:45:42 -08:00
Paul Gauthier
9b5a703307 updated models-over-time 2024-11-20 19:40:59 -08:00
Paul Gauthier (aider)
370993cbed style: Rotate point labels by 45 degrees in benchmark plot 2024-11-20 18:47:30 -08:00
Paul Gauthier
ddc538cdfa refactor: Adjust plot figure size and y-axis limits for better visualization 2024-11-20 18:47:28 -08:00
Paul Gauthier (aider)
062dc43c87 style: Make graph aspect ratio square 2024-11-20 18:43:18 -08:00
Paul Gauthier (aider)
7d9b986c04 feat: Add cyan color and line for Mistral models in visualization 2024-11-20 18:38:06 -08:00
Paul Gauthier
bd2b9a12ed style: Change Qwen model color from purple to darkblue 2024-11-20 18:38:04 -08:00
Paul Gauthier (aider)
2b55707738 feat: Add purple color and line for Qwen models in visualization 2024-11-20 18:35:25 -08:00
Paul Gauthier (aider)
093540507e feat: Add pink color and line for Haiku models in benchmark visualization 2024-11-20 18:33:54 -08:00
Paul Gauthier (aider)
8f1dcfda07 feat: Add brown color for DeepSeek models in benchmark visualization 2024-11-20 18:31:46 -08:00
Paul Gauthier
16b319174b refactor: Simplify model color detection logic for Sonnet models 2024-11-20 18:31:44 -08:00
Paul Gauthier (aider)
35115f5707 feat: Add orange color for Claude 3 Sonnet models in benchmark visualization 2024-11-20 18:30:09 -08:00
Paul Gauthier
4ef7022343 copy 2024-10-04 13:00:01 -07:00
Paul Gauthier
1c67ddcbff copy 2024-10-04 12:58:48 -07:00
fry69
667a58052e feat: change edit format from "senior" to "architect" 2024-09-27 09:03:42 +02:00
fry69
e3e0d57512 chore: update parameter names in args and benchmark 2024-09-27 08:57:22 +02:00
Paul Gauthier
eb21cf2830 architect/editor 2024-09-26 16:10:19 -07:00
Paul Gauthier (aider)
5a78e7d1b8 chore: Run the linter 2024-09-26 11:35:13 -07:00
Paul Gauthier (aider)
1c05192b69 fix: Only record junior_model and junior_edit_format in the results array if edit_format is "senior" 2024-09-26 11:35:09 -07:00
Paul Gauthier
e682eb8669 fix: Add junior model and junior edit format to benchmark results 2024-09-25 16:31:40 -07:00
Paul Gauthier (aider)
ed7503dbbe feat: optimize find_latest_benchmark_dir to check only .md files and limit to one file per subtree 2024-09-25 12:20:45 -07:00
Paul Gauthier (aider)
e21cdafb15 style: run linter and fix code formatting issues 2024-09-25 12:18:43 -07:00
Paul Gauthier (aider)
8d90df1ebc feat: implement automatic selection of the most recently updated benchmark directory when using --stats without dirnames 2024-09-25 12:18:39 -07:00
Paul Gauthier (aider)
24c959af2d feat: Add --junior-model and --junior-edit-format flags to the benchmark 2024-09-25 11:44:34 -07:00
Paul Gauthier
15cc709322 feat: Improve senior coder's edit format handling 2024-09-25 11:42:09 -07:00
Paul Gauthier
65e57df7ea feat: Implement changes to handle files content in Coder and prompts 2024-09-25 09:54:16 -07:00
Paul Gauthier
075bc828f6 stand alone junior message 2024-09-25 08:41:49 -07:00
Paul Gauthier
c912982747 senior-junior 2024-09-25 08:25:11 -07:00
Paul Gauthier
a9e9f9cdbe Merge branch 'main' into ask-plan-simple 2024-09-25 07:46:15 -07:00
Paul Gauthier
412b8e7c3c copy 2024-09-21 10:09:26 -07:00
Paul Gauthier
2753ac6b62 feat: Add new benchmark test case for qwen-2.5-72b-instruct-diff model 2024-09-20 13:27:58 -07:00