Commit graph

593 commits

Author SHA1 Message Date
Paul Gauthier
cd10e0bf03 make all highlight_images jpg 2024-05-09 09:50:05 -07:00
Paul Gauthier
ccb8be7adf use jpg as highlight 2024-05-09 09:07:01 -07:00
Paul Gauthier
b0a512770b updated docs to use deepseek/ prefix 2024-05-08 08:40:28 -07:00
Paul Gauthier
44024c001d added aider.litellm 2024-05-08 08:09:23 -07:00
Paul Gauthier
9adbb0c722 layout 2024-05-08 07:01:58 -07:00
Paul Gauthier
ca0faf7cc7 Merge remote-tracking branch 'origin/main' 2024-05-07 06:27:20 -07:00
Paul Gauthier
ecca737803 added deepseek-chat v2 2024-05-07 06:26:39 -07:00
paul-gauthier
18761b70be
Update index.md 2024-05-06 17:16:29 -07:00
paul-gauthier
6bc18b9591
Update index.md 2024-05-06 17:15:26 -07:00
Paul Gauthier
64938eafee Use highlight as og:image if present 2024-05-06 16:21:22 -07:00
Paul Gauthier
a58bc98f51 copy 2024-05-06 12:42:14 -07:00
Paul Gauthier
92f53d4d7b copy 2024-05-06 12:41:02 -07:00
Paul Gauthier
673be4647c copy 2024-05-06 12:23:06 -07:00
Paul Gauthier
c1e2829d21 copy 2024-05-06 12:21:14 -07:00
Paul Gauthier
201bb7fb54 copy 2024-05-06 12:14:36 -07:00
Paul Gauthier
b5f2dcaeae copy 2024-05-06 12:06:24 -07:00
Paul Gauthier
ac46a7ee7d copy 2024-05-06 12:02:46 -07:00
Paul Gauthier
7a5e88c2e6 finishing row.percent_cases_well_formed 2024-05-06 11:58:16 -07:00
Paul Gauthier
89db1d8673 aider: Added a second bar for row.percent_cases_well_formed to each graph.
# Aider chat conversation:

USER: in each graph, give each model a second bar showing row.percent_cases_well_formed

ASSISTANT: Here are the changes to add a second bar for `row.percent_cases_well_formed` to each graph:

docs/leaderboards/index.md
```markdown
<<<<<<< SEARCH
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code editing tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      }]
    };

    {% for row in edit_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_2 }});
    {% endfor %}
=======
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code editing tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      },
      {
        label: 'Percent cases well formed',
        data: [],
        backgroundColor: 'rgba(255, 99, 132, 0.2)',
        borderColor: 'rgba(255, 99, 132, 1)',
        borderWidth: 1
      }]
    };

    {% for row in edit_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_2 }});
      leaderboardData.datasets[1].data.push({{ row.percent_cases_well_formed }});
    {% endfor %}
>>>>>>> REPLACE
```

docs/leaderboards/index.md
```markdown
<<<<<<< SEARCH
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code refactoring tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      }]
    };

    {% for row in refac_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_1 }});
    {% endfor %}
=======
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code refactoring tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      },
      {
        label: 'Percent cases well formed',
        data: [],
        backgroundColor: 'rgba(255, 99, 132, 0.2)',
        borderColor: 'rgba(255, 99, 132, 1)',
        borderWidth: 1
      }]
    };

    {% for row in refac_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_1 }});
      leaderboardData.datasets[1].data.push({{ row.percent_cases_well_formed }});
    {% endfor %}
>>>>>>> REPLACE
```

This adds a second dataset to each chart's `leaderboardData`, with a label of "Percent cases well formed". It sets different colors for the bars and borders.

Then in the loop that populates the data, it pushes the `row.percent_cases_well_formed` value into the second dataset.

With these changes, each model will have two bars - one for the pass rate and one for the percent of cases that were well formed.
2024-05-06 11:50:10 -07:00
Paul Gauthier
17b5dbe804 moved edit results to yaml 2024-05-06 11:44:29 -07:00
Paul Gauthier
e58ce69154 move refac data to yml 2024-05-06 11:21:38 -07:00
Paul Gauthier
3bb237bdc1 handle tasks with exceptions in the stats output 2024-05-05 08:24:45 -07:00
Paul Gauthier
6b6548bd37 Merge remote-tracking branch 'origin/main' 2024-05-05 08:00:53 -07:00
paul-gauthier
fab8b8ae40
Update index.md 2024-05-04 20:07:32 -07:00
Paul Gauthier
1b35ca25c4 copy 2024-05-04 17:45:08 -07:00
Paul Gauthier
2d91ee8dbb copy 2024-05-04 17:33:25 -07:00
Paul Gauthier
425cb2941b Merge remote-tracking branch 'origin/main' 2024-05-04 16:26:36 -07:00
Paul Gauthier
366743493c renamed leaderboards 2024-05-04 16:25:46 -07:00
Paul Gauthier
812a620711 copy 2024-05-04 16:25:22 -07:00
paul-gauthier
df94f70a57
Update leaderboard.md 2024-05-04 11:32:43 -07:00
Paul Gauthier
b74edcf350 copy 2024-05-04 11:19:32 -07:00
Paul Gauthier
fbb3749270 copy 2024-05-04 11:14:15 -07:00
Paul Gauthier
ec07b6e556 updated refac 2024-05-04 11:11:34 -07:00
Paul Gauthier
e524dd9203 added refac leaderboard 2024-05-04 11:05:32 -07:00
Paul Gauthier
b02320b08f layout 2024-05-04 10:35:49 -07:00
Paul Gauthier
d96dc9ef63 Revert "aider: Change the graph to a horizontal bar graph."
This reverts commit 753ea441ea.
2024-05-04 10:35:26 -07:00
Paul Gauthier
26856c729e horizontal 2024-05-04 10:23:18 -07:00
Paul Gauthier
753ea441ea aider: Change the graph to a horizontal bar graph.
# Aider chat conversation:

USER: make the graph a horizontal bar graph

ASSISTANT: ```diff
--- docs/leaderboard.md
+++ docs/leaderboard.md
@@ ... @@
 <script>
   document.addEventListener('DOMContentLoaded', function () {
     var ctx = document.getElementById('leaderboardChart').getContext('2d');
     var leaderboardData = {
       labels: [],
       datasets: [{
         label: 'Percent correct',
         data: [],
         backgroundColor: 'rgba(54, 162, 235, 0.2)',
         borderColor: 'rgba(54, 162, 235, 1)',
         borderWidth: 1
       }]
     };

     {% for row in sorted %}
       leaderboardData.labels.push('{{ row.model }}');
       leaderboardData.datasets[0].data.push({{ row.second }});
     {% endfor %}

-    var leaderboardChart = new Chart(ctx, {
-      type: 'bar',
+    var leaderboardChart = new Chart(ctx, {
+      type: 'horizontalBar',
       data: leaderboardData,
       options: {
         scales: {
-          yAxes: [{
+          xAxes: [{
             scaleLabel: {
               display: true,
             },
             ticks: {
               beginAtZero: true
             }
           }]
         }
       }
     });
   });
 </script>
```
2024-05-04 10:22:54 -07:00
Paul Gauthier
e9228e0ed8 Update leaderboard chart with new label and style. 2024-05-04 10:22:53 -07:00
Paul Gauthier
9b4e643436 graph 2024-05-04 10:20:56 -07:00
Paul Gauthier
bc582723e3 aider: Add y-axis label "Percent correct" to the leaderboard chart.
# Aider chat conversation:

USER: add y-axis label "Percent correct"

ASSISTANT: ```diff
--- docs/leaderboard.md
+++ docs/leaderboard.md
@@ ... @@
 var leaderboardChart = new Chart(ctx, {
   type: 'bar',
   data: leaderboardData,
   options: {
     scales: {
       yAxes: [{
+        scaleLabel: {
+          display: true,
+          labelString: 'Percent correct'
+        },
         ticks: {
           beginAtZero: true
         }
       }]
     }
   }
 });
```
2024-05-04 10:20:42 -07:00
Paul Gauthier
e8ae216bbd graph 2024-05-04 10:19:46 -07:00
Paul Gauthier
98c8395d48 aider: Added Chart.js library to the docs/leaderboard.md file.
# Aider chat conversation:

USER: ReferenceError: Can't find variable: Chart

ASSISTANT: The error message "ReferenceError: Can't find variable: Chart" suggests that the Chart.js library is not being loaded before the script that is trying to use it. To fix this, we need to ensure that the Chart.js library is included in the HTML document before the script that initializes the chart.

Here is the diff to add the Chart.js library to the `docs/leaderboard.md` file:

```diff
--- docs/leaderboard.md
+++ docs/leaderboard.md
@@ ... @@
 <canvas id="leaderboardChart" width="800" height="450"></canvas>
+<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
 <script>
   document.addEventListener('DOMContentLoaded', function () {
     var ctx = document.getElementById('leaderboardChart').getContext('2d');
     var leaderboardData = {
       labels: [],
       datasets: [{
         label: 'Percent correct',
         data: [],
         backgroundColor: 'rgba(54, 162, 235, 0.2)',
         borderColor: 'rgba(54, 162, 235, 1)',
         borderWidth: 1
       }]
     };
@@ ... @@
```

This diff adds the Chart.js library from a CDN before the script tag that contains the chart initialization code. This should resolve the `ReferenceError` by ensuring that the `Chart` variable is defined when the script runs.
2024-05-04 10:16:08 -07:00
Paul Gauthier
9e8e47f1ce aider: Add script to draw a bar graph using Chart.js with model names as labels and percentages as bar heights in the docs/leaderboard.md file.
# Aider chat conversation:

USER: use chart.js to fill draw a bar graph in the canvas. is liquid templating to pull the "model" as bar label and "second" as bar height

ASSISTANT: ```diff
--- docs/leaderboard.md
+++ docs/leaderboard.md
@@ ... @@
-<canvas id="leaderboardChart" width="800" height="450"></canvas>
+<canvas id="leaderboardChart" width="800" height="450"></canvas>
+<script>
+  document.addEventListener('DOMContentLoaded', function () {
+    var ctx = document.getElementById('leaderboardChart').getContext('2d');
+    var leaderboardData = {
+      labels: [],
+      datasets: [{
+        label: 'Percent correct',
+        data: [],
+        backgroundColor: 'rgba(54, 162, 235, 0.2)',
+        borderColor: 'rgba(54, 162, 235, 1)',
+        borderWidth: 1
+      }]
+    };
+
+    {% for row in site.data.leaderboard | sort: 'second' | reverse %}
+      leaderboardData.labels.push('{{ row.model }}');
+      leaderboardData.datasets[0].data.push({{ row.second }});
+    {% endfor %}
+
+    var leaderboardChart = new Chart(ctx, {
+      type: 'bar',
+      data: leaderboardData,
+      options: {
+        scales: {
+          yAxes: [{
+            ticks: {
+              beginAtZero: true
+            }
+          }]
+        }
+      }
+    });
+  });
+</script>
```

This diff adds a script tag to the `docs/leaderboard.md` file that uses Chart.js to draw a bar graph on the canvas element with the id `leaderboardChart`. The script waits for the DOM to be fully loaded before executing. It then initializes the chart with the model names as labels and the corresponding percentages as the bar heights, using Liquid templating to pull the data from the site's data files.
2024-05-04 10:14:12 -07:00
Paul Gauthier
61a4c726a5 Updated leaderboard chart to dynamically fetch and populate data from a CSV file using Chart.js. 2024-05-04 10:14:11 -07:00
Paul Gauthier
c28d22f9b1 fixup 2024-05-04 10:12:06 -07:00
Paul Gauthier
6d9043cf24 aider: Added a bar chart using Chart.js to the leaderboard.md file.
# Aider chat conversation:

USER: add a bar chart using chart.js

ASSISTANT: To add a bar chart using Chart.js to the `leaderboard.md` file, you will need to include the Chart.js library, prepare a canvas element where the chart will be rendered, and write the JavaScript code to create the bar chart. Here's the diff that adds these elements to the `leaderboard.md` file:

```diff
--- docs/leaderboard.md
+++ docs/leaderboard.md
@@ ... @@
 <table style="width: 90%; max-width: 800px; margin: auto; border-collapse: collapse; box-shadow: 0 2px 4px rgba(0,0,0,0.1); font-size: 14px;">
   <!-- Table content -->
 </table>
+
+<canvas id="leaderboardChart" width="800" height="450"></canvas>
+
+<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
+<script>
+document.addEventListener('DOMContentLoaded', function () {
+  var ctx = document.getElementById('leaderboardChart').getContext('2d');
+  var leaderboardData = {
+    labels: [], // Populate with model names
+    datasets: [{
+      label: 'Percent correct',
+      data: [], // Populate with percent correct values
+      backgroundColor: 'rgba(54, 162, 235, 0.2)',
+      borderColor: 'rgba(54, 162, 235, 1)',
+      borderWidth: 1
+    }]
+  };
+
+  // Fetch leaderboard data and populate the chart
+  fetch('/_data/leaderboard.csv')
+    .then(function (response) {
+      return response.text();
+    })
+    .then(function (text) {
+      var rows = text.split('\n');
+      rows.forEach(function (row) {
+        var columns = row.split(',');
+        if (columns.length === 4) {
+          leaderboardData.labels.push(columns[0]);
+          leaderboardData.datasets[0].data.push(parseFloat(columns[1]));
+        }
+      });
+      new Chart(ctx, {
+        type: 'bar',
+        data: leaderboardData,
+        options: {
+          scales: {
+            y: {
+              beginAtZero: true
+            }
+          }
+        }
+      });
+    });
+});
+</script>
```

This diff adds a canvas element for the chart, includes the Chart.js library from a CDN, and provides a script that initializes the bar chart with data from the leaderboard CSV file. The script fetches the CSV data, parses it, and populates the chart with the model names and their corresponding "Percent correct" values.
2024-05-04 10:11:25 -07:00
Paul Gauthier
c9dbca9d0e gpt-4-0314 with examples as sys 2024-05-04 07:52:22 -07:00
Paul Gauthier
f6580fff76 updated all openai models 2024-05-04 07:38:50 -07:00
paul-gauthier
a496abbfa4
Update leaderboard.md 2024-05-03 18:46:35 -07:00