Paul Gauthier 2024-08-19 20:53:42 -07:00
parent 730d6e0e94
commit e0a9044118

@@ -138,13 +138,13 @@ This way the `model`, `edit_format` and `commit_hash`
 should be enough to reliably reproduce any benchmark run.
 You can see examples of the benchmark report yaml in the
-[aider leaderboard data files](https://github.com/paul-gauthier/aider/blob/main/website/_data/).
+[aider leaderboard data files](https://github.com/paul-gauthier/aider/blob/main/aider/website/_data/).
 ## Limitations, notes
 - Benchmarking all 133 exercises against Claude 3.5 Sonnet will cost about $4.
 - Contributions of benchmark results are welcome! Submit results by opening a PR with edits to the
-[benchmark results data files](https://github.com/paul-gauthier/aider/blob/main/website/_data/).
+[aider leaderboard data files](https://github.com/paul-gauthier/aider/blob/main/aider/website/_data/).
 - These scripts are not intended for use by typical aider end users.
 - Some of these tools are written as `bash` scripts, so it will be hard to use them on Windows.