2025-06-20 11:35:00 +00:00 · 2024-08-19 20:51:03 -07:00 · 2024-08-19 20:51:03 -07:00 · 730d6e0e94
commit 730d6e0e94
parent 86a7a17d47
1 changed file with 5 additions and 4 deletions
--- a/benchmark/README.md
+++ b/benchmark/README.md
@@ -110,7 +110,6 @@ The benchmark report is a yaml record with statistics about the run:
  pass_rate_2: 77.4
  percent_cases_well_formed: 99.2
  error_outputs: 23
  released: 2024-06-20
  num_malformed_responses: 4
  num_with_malformed_responses: 1
  user_asks: 2
@@ -131,9 +130,11 @@ percent of the tasks which had all tests passing.
 There will be multiple of these pass rate stats,
 depending on the value of the `--tries` parameter.
-The yaml also includes all the settings which were in effect for the benchmark run and
-the git hash of the repo used to run it.
-The `model`, `edit_format` and `commit_hash`
+The yaml also includes all the settings which were in effect for the benchmark run.
+It also reports the git hash of the repo at the time that the benchmark was
+run, with `(dirty)` if there were uncommitted changes.
+It's good practice to commit the repo before starting a benchmark run.
+This way the `model`, `edit_format` and `commit_hash`
 should be enough to reliably reproduce any benchmark run.
 You can see examples of the benchmark report yaml in the