Paul Gauthier
8cb83afcc4
ask transient whole, o1-preview deep
2024-09-12 17:21:35 -07:00
Paul Gauthier
83662b7470
Merge branch 'main' into ask-plan-simple
2024-09-12 17:19:14 -07:00
Paul Gauthier
1fbb5079d5
unhack o1 mini
2024-09-12 15:38:28 -07:00
Paul Gauthier
291b456a45
hack for o1-mini: no system prompt, no temperature
2024-09-12 13:05:25 -07:00
Paul Gauthier
5408dcb185
wip
2024-09-11 09:32:14 -07:00
Paul Gauthier
39ae106bb3
wip
2024-09-10 15:21:54 -07:00
Paul Gauthier
abd484bfa7
wip
2024-09-06 12:01:51 -07:00
Paul Gauthier
cc15909629
clean diff edit format
2024-09-06 11:25:20 -07:00
Paul Gauthier
5b584db90c
sonnet-sonnet gets 60.2/84.2
2024-09-06 09:49:01 -07:00
Paul Gauthier
1c73e7d43a
turn off suggest shell commands during benchmarks
2024-09-05 14:35:34 -07:00
Paul Gauthier
05dcbeecac
noop
2024-09-05 14:25:09 -07:00
Paul Gauthier
ff3a75413b
sonnet+deep got 60.9/82.0
2024-09-05 13:30:25 -07:00
Paul Gauthier
1a3d8c4015
wip
2024-08-20 17:45:40 -07:00
Paul Gauthier
821eae16ae
copy
2024-08-19 20:54:10 -07:00
Paul Gauthier
e0a9044118
copy
2024-08-19 20:53:42 -07:00
Paul Gauthier
730d6e0e94
copy
2024-08-19 20:51:03 -07:00
Paul Gauthier
86a7a17d47
copy
2024-08-19 20:47:52 -07:00
Paul Gauthier
2944445340
copy
2024-08-19 20:44:48 -07:00
Paul Gauthier
b61b5f4b74
cleanup before merge
2024-08-16 11:35:30 -07:00
Paul Gauthier
3a2ac02024
Merge branch 'main' into json-coders
2024-08-15 12:15:07 -07:00
Paul Gauthier
822a8ab671
remove gpt-4o-mini from the gpt-4 trendline
2024-08-15 09:52:21 -07:00
Paul Gauthier (aider)
5ccdebf2c0
refactor: Extract color assignment logic into a separate function
2024-08-15 09:50:50 -07:00
Paul Gauthier
bac04a2a3d
no lint
2024-08-15 06:10:46 -07:00
Paul Gauthier (aider)
0a3c6bfbe7
feat: Change blue color to light blue in plot_over_time function
2024-08-14 06:29:48 -07:00
Paul Gauthier (aider)
d2b4846b95
feat: Replace orange color with purple for "-4o" models
2024-08-14 06:29:13 -07:00
Paul Gauthier (aider)
fb0b348bec
fix: Remove unused blue_points variable
2024-08-14 06:28:28 -07:00
Paul Gauthier (aider)
a7290be843
style: Apply linter formatting changes
2024-08-14 06:27:51 -07:00
Paul Gauthier (aider)
1cdbc76974
feat: Connect model family lines in over_time plot
2024-08-14 06:27:48 -07:00
Paul Gauthier
714fd45f4d
fix: Update color logic and font size in over_time.py
2024-08-14 06:27:47 -07:00
Paul Gauthier (aider)
1f6cadcc66
style: Refactor conditional logic in color assignment
2024-08-14 06:22:51 -07:00
Paul Gauthier (aider)
c4f70d81b7
feat: add new color for all "-4o-" models except "gpt-4o-mini"
2024-08-14 06:22:48 -07:00
Paul Gauthier (aider)
1f59687e9d
style: Format code with linter
2024-08-14 06:21:48 -07:00
Paul Gauthier (aider)
d8c8c51156
The commit message for these changes would be:
...
feat: Improve graph visualization and add debugging
The changes made in this commit include:
1. Adjusting the y-axis limit to 100 to accommodate the higher pass rate values.
2. Rotating the x-axis labels for better readability.
3. Adding debug print statements to track the progress of figure generation and display.
4. Increasing the figure size for better visibility.
5. Adding additional debugging to ensure the data is being plotted correctly.
These improvements should help with the visualization and debugging of the graph generation process.
2024-08-14 06:21:45 -07:00
Paul Gauthier (aider)
d94d5aa3fa
style: format code according to linter rules
2024-08-14 06:20:36 -07:00
Paul Gauthier (aider)
d2479f30f7
fix: Add debug prints and check for empty data in over_time.py
2024-08-14 06:20:32 -07:00
Paul Gauthier
56975d02a1
fix: Update path to edit_leaderboard.yml file
2024-08-14 06:20:31 -07:00
Paul Gauthier
060c8ff89a
override dotenv
2024-08-13 18:06:00 -07:00
Paul Gauthier
139f7992cb
do not pass pretty to coder
2024-08-13 17:43:41 -07:00
Your Name
e799e89ff4
install aider with -e in benchmark docker
2024-07-28 17:23:25 -03:00
Paul Gauthier
e9bde684ae
updated benchmark docker to use [dev]
2024-07-14 17:24:51 +01:00
Paul Gauthier
a8d4be441e
Update benchmark dockerfile
2024-07-11 08:54:47 +01:00
Paul Gauthier
9fa70b4b05
Install the cpu-only version of torch
2024-07-09 09:37:10 +01:00
Paul Gauthier
3ff0c7ce35
copy
2024-06-02 09:28:56 -07:00
Paul Gauthier
2cb9a8ddc8
copy
2024-06-01 16:10:55 -07:00
Paul Gauthier
47a3cb8adf
copy
2024-06-01 15:05:29 -07:00
Paul Gauthier
2febc663f3
copy
2024-06-01 14:48:12 -07:00
Paul Gauthier
bc4d39ddf2
copy
2024-06-01 11:46:53 -07:00
Paul Gauthier
26edbcc8f1
copy
2024-06-01 11:26:16 -07:00
Paul Gauthier
fcc62ebffc
copy
2024-06-01 07:34:30 -07:00
Paul Gauthier
07d36b22c0
copy
2024-06-01 07:23:25 -07:00