Commit graph

605 commits

Author SHA1 Message Date
Paul Gauthier
deb13c060c copy 2024-05-19 15:14:22 -07:00
Paul Gauthier
2eced6ce12 copy 2024-05-17 13:05:29 -07:00
Paul Gauthier
74c0df8df8 svg 2024-05-15 11:54:00 -07:00
Paul Gauthier
590cbbddfd Added models over time to leaderboard page 2024-05-15 11:51:28 -07:00
Paul Gauthier
b2fbafe41b docs and --4-turbo updates 2024-05-13 11:24:52 -07:00
Paul Gauthier
b4e53da9ea copy 2024-05-13 11:13:42 -07:00
Paul Gauthier
30cb06a629 copy 2024-05-13 11:13:08 -07:00
Paul Gauthier
6d75dc62d7 copy 2024-05-13 11:06:02 -07:00
Paul Gauthier
eea86ef4f6 copy 2024-05-13 11:05:05 -07:00
Paul Gauthier
45ea2d5871 copy 2024-05-13 11:03:12 -07:00
Paul Gauthier
5b3e9d7b05 copy 2024-05-13 10:59:14 -07:00
Paul Gauthier
bfba56c5f1 added 4o as default 2024-05-13 10:57:26 -07:00
Paul Gauthier
cd10e0bf03 make all highlight_images jpg 2024-05-09 09:50:05 -07:00
Paul Gauthier
ccb8be7adf use jpg as highlight 2024-05-09 09:07:01 -07:00
Paul Gauthier
b0a512770b updated docs to use deepseek/ prefix 2024-05-08 08:40:28 -07:00
Paul Gauthier
44024c001d added aider.litellm 2024-05-08 08:09:23 -07:00
Paul Gauthier
9adbb0c722 layout 2024-05-08 07:01:58 -07:00
Paul Gauthier
ca0faf7cc7 Merge remote-tracking branch 'origin/main' 2024-05-07 06:27:20 -07:00
Paul Gauthier
ecca737803 added deepseek-chat v2 2024-05-07 06:26:39 -07:00
paul-gauthier
18761b70be Update index.md 2024-05-06 17:16:29 -07:00
paul-gauthier
6bc18b9591 Update index.md 2024-05-06 17:15:26 -07:00
Paul Gauthier
64938eafee Use highlight as og:image if present 2024-05-06 16:21:22 -07:00
Paul Gauthier
a58bc98f51 copy 2024-05-06 12:42:14 -07:00
Paul Gauthier
92f53d4d7b copy 2024-05-06 12:41:02 -07:00
Paul Gauthier
673be4647c copy 2024-05-06 12:23:06 -07:00
Paul Gauthier
c1e2829d21 copy 2024-05-06 12:21:14 -07:00
Paul Gauthier
201bb7fb54 copy 2024-05-06 12:14:36 -07:00
Paul Gauthier
b5f2dcaeae copy 2024-05-06 12:06:24 -07:00
Paul Gauthier
ac46a7ee7d copy 2024-05-06 12:02:46 -07:00
Paul Gauthier
7a5e88c2e6 finishing row.percent_cases_well_formed 2024-05-06 11:58:16 -07:00
Paul Gauthier
89db1d8673 aider: Added a second bar for row.percent_cases_well_formed to each graph.
# Aider chat conversation:

USER: in each graph, give each model a second bar showing row.percent_cases_well_formed

ASSISTANT: Here are the changes to add a second bar for `row.percent_cases_well_formed` to each graph:

docs/leaderboards/index.md
```markdown
<<<<<<< SEARCH
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code editing tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      }]
    };

    {% for row in edit_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_2 }});
    {% endfor %}
=======
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code editing tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      },
      {
        label: 'Percent cases well formed',
        data: [],
        backgroundColor: 'rgba(255, 99, 132, 0.2)',
        borderColor: 'rgba(255, 99, 132, 1)',
        borderWidth: 1
      }]
    };

    {% for row in edit_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_2 }});
      leaderboardData.datasets[1].data.push({{ row.percent_cases_well_formed }});
    {% endfor %}
>>>>>>> REPLACE
```

docs/leaderboards/index.md
```markdown
<<<<<<< SEARCH
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code refactoring tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      }]
    };

    {% for row in refac_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_1 }});
    {% endfor %}
=======
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code refactoring tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      },
      {
        label: 'Percent cases well formed',
        data: [],
        backgroundColor: 'rgba(255, 99, 132, 0.2)',
        borderColor: 'rgba(255, 99, 132, 1)',
        borderWidth: 1
      }]
    };

    {% for row in refac_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_1 }});
      leaderboardData.datasets[1].data.push({{ row.percent_cases_well_formed }});
    {% endfor %}
>>>>>>> REPLACE
```

This adds a second dataset to each chart's `leaderboardData`, with a label of "Percent cases well formed". It sets different colors for the bars and borders.

Then in the loop that populates the data, it pushes the `row.percent_cases_well_formed` value into the second dataset.

With these changes, each model will have two bars - one for the pass rate and one for the percent of cases that were well formed.
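
As a rough illustration of the same two-dataset pattern outside the Liquid template, here is a minimal sketch in plain JavaScript. The `buildLeaderboardData` helper, its `passRateKey` parameter, and the sample rows are hypothetical and not part of this commit; only the field names `model`, `pass_rate_2`, and `percent_cases_well_formed` come from the changes above.

```javascript
// Hypothetical helper (not in the repository): builds a Chart.js-style
// data object with one dataset for the pass rate and one for the
// percentage of well-formed cases, mirroring the template loop above.
function buildLeaderboardData(rows, passRateKey) {
  const data = {
    labels: [],
    datasets: [
      {
        label: 'Percent correct',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1,
      },
      {
        label: 'Percent cases well formed',
        data: [],
        backgroundColor: 'rgba(255, 99, 132, 0.2)',
        borderColor: 'rgba(255, 99, 132, 1)',
        borderWidth: 1,
      },
    ],
  };

  // Each row contributes one label and one value per dataset, so the
  // arrays stay index-aligned and every model gets two bars.
  for (const row of rows) {
    data.labels.push(row.model);
    data.datasets[0].data.push(row[passRateKey]);
    data.datasets[1].data.push(row.percent_cases_well_formed);
  }
  return data;
}

// Made-up sample rows, for illustration only.
const sampleRows = [
  { model: 'model-a', pass_rate_2: 70.0, percent_cases_well_formed: 95.0 },
  { model: 'model-b', pass_rate_2: 55.5, percent_cases_well_formed: 100.0 },
];

const leaderboardData = buildLeaderboardData(sampleRows, 'pass_rate_2');
```

Passing `'pass_rate_1'` as the key instead would give the refactoring chart's variant, assuming the rows carry that field.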
2024-05-06 11:50:10 -07:00
Paul Gauthier
17b5dbe804 moved edit results to yaml 2024-05-06 11:44:29 -07:00
Paul Gauthier
e58ce69154 move refac data to yml 2024-05-06 11:21:38 -07:00
Paul Gauthier
3bb237bdc1 handle tasks with exceptions in the stats output 2024-05-05 08:24:45 -07:00
Paul Gauthier
6b6548bd37 Merge remote-tracking branch 'origin/main' 2024-05-05 08:00:53 -07:00
paul-gauthier
fab8b8ae40 Update index.md 2024-05-04 20:07:32 -07:00
Paul Gauthier
1b35ca25c4 copy 2024-05-04 17:45:08 -07:00
Paul Gauthier
2d91ee8dbb copy 2024-05-04 17:33:25 -07:00
Paul Gauthier
425cb2941b Merge remote-tracking branch 'origin/main' 2024-05-04 16:26:36 -07:00
Paul Gauthier
366743493c renamed leaderboards 2024-05-04 16:25:46 -07:00
Paul Gauthier
812a620711 copy 2024-05-04 16:25:22 -07:00
paul-gauthier
df94f70a57 Update leaderboard.md 2024-05-04 11:32:43 -07:00
Paul Gauthier
b74edcf350 copy 2024-05-04 11:19:32 -07:00
Paul Gauthier
fbb3749270 copy 2024-05-04 11:14:15 -07:00
Paul Gauthier
ec07b6e556 updated refac 2024-05-04 11:11:34 -07:00
Paul Gauthier
e524dd9203 added refac leaderboard 2024-05-04 11:05:32 -07:00
Paul Gauthier
b02320b08f layout 2024-05-04 10:35:49 -07:00
Paul Gauthier
d96dc9ef63 Revert "aider: Change the graph to a horizontal bar graph."
This reverts commit 753ea441ea.
2024-05-04 10:35:26 -07:00
Paul Gauthier
26856c729e horizontal 2024-05-04 10:23:18 -07:00
Paul Gauthier
753ea441ea aider: Change the graph to a horizontal bar graph.
# Aider chat conversation:

USER: make the graph a horizontal bar graph

ASSISTANT:

```diff
--- docs/leaderboard.md
+++ docs/leaderboard.md
@@ ... @@
 <script>
   document.addEventListener('DOMContentLoaded', function () {
     var ctx = document.getElementById('leaderboardChart').getContext('2d');
     var leaderboardData = {
       labels: [],
       datasets: [{
         label: 'Percent correct',
         data: [],
         backgroundColor: 'rgba(54, 162, 235, 0.2)',
         borderColor: 'rgba(54, 162, 235, 1)',
         borderWidth: 1
       }]
     };

     {% for row in sorted %}
       leaderboardData.labels.push('{{ row.model }}');
       leaderboardData.datasets[0].data.push({{ row.second }});
     {% endfor %}

-    var leaderboardChart = new Chart(ctx, {
-      type: 'bar',
+    var leaderboardChart = new Chart(ctx, {
+      type: 'horizontalBar',
       data: leaderboardData,
       options: {
         scales: {
-          yAxes: [{
+          xAxes: [{
             scaleLabel: {
               display: true,
             },
             ticks: {
               beginAtZero: true
             }
           }]
         }
       }
     });
   });
 </script>
```
2024-05-04 10:22:54 -07:00