| Paul Gauthier (aider) | e334cbb5d4 | fix: Correct indentation in load_results function | 2024-12-20 16:03:40 -08:00 |
| Paul Gauthier (aider) | e3ac8ab19d | feat: Add --stats-languages option to filter results | 2024-12-20 16:03:19 -08:00 |
| Paul Gauthier | bddf6e9017 | fix: Handle missing attributes in show_stats and empty models | 2024-12-20 16:03:19 -08:00 |
| Paul Gauthier | 521841b447 | fix: Skip redoing tests if results exist | 2024-12-19 16:25:54 -08:00 |
| Paul Gauthier (aider) | c53cd336f9 | style: Fix linting issues | 2024-12-19 15:59:03 -08:00 |
| Paul Gauthier (aider) | a8226989c8 | feat: Remove @Disabled annotations from Java test files | 2024-12-19 15:58:59 -08:00 |
| Paul Gauthier | 114b156d74 | fix: Use relative paths for ignored files, remove redundant try | 2024-12-19 15:56:16 -08:00 |
| Paul Gauthier (aider) | 370b45bb35 | feat: Ignore files in .meta and .docs directories | 2024-12-19 07:23:28 -08:00 |
| Paul Gauthier | 616c4a9a53 | chore: Add comment about ignoring meta and docs files | 2024-12-19 07:23:27 -08:00 |
| Paul Gauthier | 821f7d6694 | fix: Use extra_body for reasoning_effort, fix test counts | 2024-12-19 07:10:20 -08:00 |
| Paul Gauthier | c36c06ab99 | fix: Retry tests on parse or timeout, add gpt-4o params | 2024-12-18 15:56:38 -08:00 |
| Paul Gauthier | a915c60999 | feat: Add pass_num to benchmark results, fix hard set percent | 2024-12-18 13:36:37 -08:00 |
| Paul Gauthier | 2aa4615c78 | feat: Add openrouter/openai/o1 model and update prompts | 2024-12-18 06:59:14 -08:00 |
| Paul Gauthier (aider) | 7dd1346878 | fix: Remove stray ] causing syntax error | 2024-12-17 20:34:33 -08:00 |
| Paul Gauthier (aider) | 31f8c7d9cb | fix: Handle JSON decode errors when loading results | 2024-12-17 20:34:21 -08:00 |
| Paul Gauthier | 914ce0b94d | feat: Add total_tests to summary, handle JSON decode errors | 2024-12-17 20:34:20 -08:00 |
| Paul Gauthier (aider) | 664f09111e | feat: Pass original_dname to tests, copy test files | 2024-12-17 20:11:58 -08:00 |
| Paul Gauthier (aider) | 6141f414fd | chore: Remove comment from run_unit_tests | 2024-12-17 20:11:29 -08:00 |
| Paul Gauthier | 8911f0f217 | fix: Correctly find benchmark markdown files | 2024-12-17 20:11:29 -08:00 |
| Paul Gauthier (aider) | 5af108ccee | style: Format benchmark code with black | 2024-12-17 20:01:45 -08:00 |
| Paul Gauthier (aider) | 94e4169445 | fix: Update stats to handle nested exercise directories | 2024-12-17 20:01:40 -08:00 |
| Paul Gauthier | 479b5b7064 | fix: Use shell=True for npm test and fix path | 2024-12-17 20:01:39 -08:00 |
| Paul Gauthier | 12491c4983 | wip | 2024-12-17 17:47:17 -08:00 |
| Paul Gauthier (aider) | 77d379c021 | refactor: Use full path for test names in benchmark | 2024-12-17 17:43:52 -08:00 |
| Paul Gauthier (aider) | 1a12a59e91 | chore: Remove comment about test_dnames | 2024-12-17 17:41:29 -08:00 |
| Paul Gauthier | 0b970dd9c7 | fix: Ensure test_dnames include full path | 2024-12-17 17:41:27 -08:00 |
| Paul Gauthier (aider) | 93ac2bd53e | feat: Copy only practice subdirs with exercises | 2024-12-17 17:36:03 -08:00 |
| Paul Gauthier (aider) | f9646ac47a | chore: Remove comment about practice subdirs | 2024-12-17 17:35:17 -08:00 |
| Paul Gauthier | e8ed3b9e23 | chore: Add comment about copying practice subdirs | 2024-12-17 17:35:16 -08:00 |
| Paul Gauthier (aider) | 6238a07c8f | style: Run linter on benchmark.py | 2024-12-17 17:33:28 -08:00 |
| Paul Gauthier (aider) | 1fb33f0c47 | feat: Add language filter and multi-lang support | 2024-12-17 17:33:23 -08:00 |
| Paul Gauthier (aider) | a842f41627 | style: Fix linting issues in benchmark.py | 2024-12-17 16:49:50 -08:00 |
| Paul Gauthier (aider) | c4c135e678 | refactor: Use dict for test commands based on file extensions | 2024-12-17 16:49:46 -08:00 |
| Paul Gauthier (aider) | f36f2fdea2 | style: Fix typo in test file extension check | 2024-12-17 16:48:37 -08:00 |
| Paul Gauthier (aider) | e3f0a67584 | feat: Choose test command based on file extensions | 2024-12-17 16:48:32 -08:00 |
| Paul Gauthier | f6f05fa0c6 | fix: Use cargo test for rust tests | 2024-12-17 16:48:31 -08:00 |
| Paul Gauthier (aider) | cf5b38d4f5 | style: Fix linting issues in benchmark.py | 2024-12-17 16:35:20 -08:00 |
| Paul Gauthier (aider) | b23669400f | fix: Correct syntax error in cleanup_test_output | 2024-12-17 16:35:16 -08:00 |
| Paul Gauthier | aaacd00ecf | refactor: Use pytest instead of unittest for running tests | 2024-12-17 16:35:08 -08:00 |
| Paul Gauthier (aider) | 03aa22ba84 | feat: Read config.json, copy solution/test files, no fallback | 2024-12-17 16:18:10 -08:00 |
| Paul Gauthier | 1493b8703f | fix: Skip unparseable results files in real test | 2024-12-17 16:18:09 -08:00 |
| Paul Gauthier | 4dc3b9072e | feat: increase retry timeout for benchmarking | 2024-12-11 14:26:28 -08:00 |
| Paul Gauthier (aider) | fcb2bacd1e | style: format benchmark.py with black | 2024-12-11 13:09:52 -08:00 |
| Paul Gauthier (aider) | a9401e921e | feat: add sleep option between tests in single-threaded mode | 2024-12-11 13:09:45 -08:00 |
| Paul Gauthier (aider) | 6af71951af | style: fix whitespace in benchmark.py | 2024-11-28 14:01:50 -08:00 |
| Paul Gauthier (aider) | 3eed45dc3e | fix: improve benchmark directory selection based on latest .md file timestamp | 2024-11-28 14:01:45 -08:00 |
| Paul Gauthier (aider) | 320b059bc7 | perf: optimize benchmark dir search by filtering on timestamp first | 2024-11-28 14:00:12 -08:00 |
| Paul Gauthier | a89ce06377 | fix: correct glob pattern for finding latest benchmark directory | 2024-11-28 14:00:10 -08:00 |
| Paul Gauthier (aider) | 2ff3a23606 | fix: add num_ctx parameter to run_test_real function | 2024-11-25 19:21:08 -08:00 |
| Paul Gauthier (aider) | c5ce57ea7f | style: fix linting issues in benchmark.py | 2024-11-25 19:20:49 -08:00 |