Commit graph

232 commits

Author SHA1 Message Date
Paul Gauthier
645e0de971 blame 2025-03-25 11:26:54 -10:00
Paul Gauthier
821087bcce copy 2025-03-25 10:55:17 -10:00
Paul Gauthier
f1955577bc feat: Add Gemini Pro 2.5 test results to leaderboard 2025-03-25 10:43:05 -10:00
Paul Gauthier
059cd23543 feat: add DeepSeek V3 (0324) to polyglot leaderboard 2025-03-24 12:44:25 -10:00
Paul Gauthier
d4e44d7555 copy 2025-03-21 12:09:59 -07:00
Paul Gauthier
cf93869d3f gemma3b 2025-03-17 16:39:45 -07:00
Paul Gauthier
fd21f5195d chore: Update polyglot leaderboard entry with new test results 2025-03-14 18:19:20 -07:00
Paul Gauthier
de5da0e7ce copy 2025-03-13 15:06:00 -07:00
Paul Gauthier
e8a8681a75 chore: Update polyglot leaderboard YAML files with new test case data 2025-03-13 14:16:20 -07:00
Paul Gauthier
d0c8b38ffc copy 2025-03-10 08:58:42 -07:00
Paul Gauthier
f111ab48fb chore: Update polyglot leaderboard data with new model test results 2025-03-07 09:26:32 -08:00
Paul Gauthier
996177ceaf add qwq32b 2025-03-06 11:46:32 -08:00
Paul Gauthier
9a9c34aa18 add gpt-4.5 to leaderboard 2025-02-27 13:07:07 -08:00
Paul Gauthier
2f1384840c feat: Add metadata and settings for GPT-4.5-preview and GPT-4o models 2025-02-27 13:01:52 -08:00
Filip Trplan
b6344951fe add gemini-2.0-pro-exp-02-05 polyglot benchmark 2025-02-26 15:04:48 +01:00
Paul Gauthier
b6e46d6101 copy 2025-02-24 20:20:55 -08:00
Paul Gauthier
347f75f804 copy 2025-02-24 17:23:14 -08:00
Paul Gauthier
c4fac2d179 added sonnet 37 w/32k think 2025-02-24 15:15:24 -08:00
Paul Gauthier
93edbda984 copy 2025-02-24 13:29:22 -08:00
Paul Gauthier
75bd94d757 updated blame 2025-02-24 12:53:46 -08:00
Paul Gauthier
eed9be5a9e added sonnet 37 to leaderboard 2025-02-24 12:16:14 -08:00
Paul Gauthier
6ffbec969a copy 2025-02-15 12:01:40 -08:00
Paul Gauthier
69fcc3acd7 fix: Change file reading error handling from "ignore" to "replace" 2025-02-15 12:00:39 -08:00
Paul Gauthier
53586d95d0 updated blame 2025-02-07 11:06:30 -08:00
Paul Gauthier
21e96df85a copy 2025-02-06 14:56:58 -08:00
Paul Gauthier
1a6a16e061 chore: Update polyglot leaderboard with new test run data 2025-01-31 15:13:34 -08:00
Paul Gauthier
9dfe85eca3 copy 2025-01-31 14:00:22 -08:00
Paul Gauthier
7f82a33bf5 copy 2025-01-31 13:36:04 -08:00
Paul Gauthier
8d22c0ba90 add o3mini high 2025-01-31 13:32:30 -08:00
Paul Gauthier
c78de41ccf copy 2025-01-31 12:51:33 -08:00
Paul Gauthier
778e54ef32 copy 2025-01-30 08:44:32 -08:00
Paul Gauthier
01d0e13884 chore: Update polyglot leaderboard with Qwen Max test results 2025-01-30 08:38:28 -08:00
Paul Gauthier
a7828809e9 copy 2025-01-28 19:05:05 -08:00
Paul Gauthier
298f713e9b copy 2025-01-28 16:33:25 -08:00
Paul Gauthier
65c8504141 copy 2025-01-28 13:00:35 -08:00
Paul Gauthier
77d2bc58fd copy 2025-01-28 11:30:22 -08:00
Paul Gauthier
bfc57459e1 copy 2025-01-28 11:24:32 -08:00
Paul Gauthier
82d819a6c7 copy 2025-01-28 11:04:01 -08:00
Paul Gauthier
0c13734f7a copy 2025-01-24 15:50:04 -08:00
Paul Gauthier
ee5d72301a copy 2025-01-24 08:47:25 -08:00
Paul Gauthier
d7bb80468b copy 2025-01-24 08:22:13 -08:00
Paul Gauthier
c5e2d80fc0 blame 2025-01-20 14:20:56 -08:00
Paul Gauthier
32d025bcf2 r1 leaderboard 2025-01-20 11:37:09 -08:00
Paul Gauthier
a777f336e1 chore: Update polyglot leaderboard test results and metadata 2025-01-17 13:37:02 -08:00
Paul Gauthier
2ec576e110 use examples_as_sys_msg=True for 4o models 2025-01-13 15:46:36 -08:00
Paul Gauthier
21f20417d6 copy 2025-01-13 11:39:22 -08:00
Paul Gauthier
e1c914d9bb chore: Update polyglot leaderboard with new test results for Codestral 25.01 2025-01-13 11:21:34 -08:00
paul-gauthier
42f6c20ada Merge branch 'main' into polyglot-qwen2.5-coder-32b-instruct-whole-results 2025-01-03 09:35:14 -04:00
Paul Gauthier
d734dee589 copy 2024-12-28 10:24:17 -04:00
Paul Gauthier
f1f66a9b9d copy 2024-12-26 19:28:14 -04:00