aider/docs/leaderboard.md
Paul Gauthier a2d16208ab aider: Replaced <pre> tag with <code> tag in leaderboard.md to denote literal code without boxing it.
# Aider chat conversation:

USER: what do the styles on the pre tag do?

ASSISTANT: The `<pre>` tag in the Markdown table is styled with the following CSS:

```css
pre {
  overflow-x: auto;
  white-space: pre-wrap;
}
```

Here's what those styles do:

- `overflow-x: auto;` adds a horizontal scrollbar to the `<pre>` block if the content is wider than the available space. This lets the user scroll to see content that cannot wrap, such as a long unbroken token, instead of it overflowing the table cell.

- `white-space: pre-wrap;` allows the content inside the `<pre>` to wrap to new lines if needed. Normally `<pre>` elements only honor line breaks that are explicitly in the content. With this style, the browser will wrap lines as needed to fit the available width, while still preserving the original line breaks and other whitespace.

So in summary, these styles allow the command examples in the table to be scrolled horizontally if long, while wrapping to multiple lines if needed to fit the cell. This provides a good balance of preserving the original formatting while keeping the content readable and not breaking the table layout.

USER: the pre is putting the text in a pale blue box. what's another way to denote literal code without boxing it?

ASSISTANT: Here are a couple options to denote literal code without using a `<pre>` tag that puts it in a box:

1. Use inline code formatting with backticks, like `this`. This works well for short code snippets within a sentence.

2. Use a `<code>` tag instead of `<pre>`. This will format the text in a monospace font to denote code, but won't put it in a box. You can add CSS to style it further if desired.

Here's a *SEARCH/REPLACE* block to make that change:

docs/leaderboard.md
```markdown
<<<<<<< SEARCH
        <td style="padding: 4px;"><pre style="overflow-x: auto; white-space: pre-wrap;">{{ row.command }}</pre></td>
=======
        <td style="padding: 4px;"><code>{{ row.command }}</code></td>
>>>>>>> REPLACE
```

This replaces the `<pre>` tag with a `<code>` tag. The code will now render inline without the pale blue box, but will still be visibly denoted as code.

You can remove the `overflow-x` and `white-space` styles since they are no longer needed on the `<code>` tag. The browser will wrap the code naturally within the table cell.

Let me know if you would like me to make this change or if you have any other questions!
2024-05-03 17:28:35 -07:00


# Aider's LLM leaderboard

Aider works best with LLMs that are good at *editing* code, not just good at writing
code.
Aider works with the LLM to make changes to the existing code in your git repo,
so the LLM needs to be capable of reliably specifying how to edit code.

Aider uses a
[code editing benchmark](https://aider.chat/docs/benchmarks.html#the-benchmark)
to measure an LLM's code editing ability.
This table reports the results from a number of popular LLMs,
to help users select which models to use with aider.

While [aider can connect to almost any LLM](https://aider.chat/docs/llms.html),
it will work best with models that score well on the code editing benchmark.

## Code editing leaderboard
<table style="width: 90%; max-width: 800px; margin: auto; border-collapse: collapse; box-shadow: 0 2px 4px rgba(0,0,0,0.1); font-size: 14px;">
<thead style="background-color: #f2f2f2;">
<tr>
<th style="padding: 4px; text-align: left;">Model</th>
<th style="padding: 4px; text-align: left;">Percent correct</th>
<th style="padding: 4px; text-align: left;">Command</th>
<th style="padding: 4px; text-align: left;">Edit format</th>
</tr>
</thead>
<tbody>
{% for row in site.data.leaderboard %}
<tr style="border-bottom: 1px solid #ddd;">
<td style="padding: 4px;">{{ row.model }}</td>
<td style="padding: 4px;">{{ row.second }}%</td>
<td style="padding: 4px;"><code>{{ row.command }}</code></td>
<td style="padding: 4px;">{{ row.format }}</td>
</tr>
{% endfor %}
</tbody>
</table>
## Edit format

Aider uses different "edit formats" to collect code edits from different LLMs:

- `whole` is a "whole file" editing format, where the model edits a file by returning a full new copy of the file with any changes included.
- `diff` is a more efficient diff-style format, where the model specifies blocks of code to search and replace in order to make changes to files.
- `diff-fenced` is similar to `diff`, but fences the entire diff block including the filename.
- `udiff` is the most efficient editing format, where the model returns unified diffs to apply changes to the file.

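As a rough illustration of the search-and-replace idea behind the `diff` format, the sketch below applies one search/replace pair to a file's text. This is not aider's actual implementation: the function name `apply_search_replace` and the rule that the search text must match exactly once are assumptions made for this example.

```python
def apply_search_replace(text: str, search: str, replace: str) -> str:
    """Apply one search/replace edit to `text`.

    Hypothetical helper for illustration only; the exactly-one-match rule
    is an assumption, not a description of how aider behaves.
    """
    count = text.count(search)
    if count != 1:
        raise ValueError(
            f"expected exactly one match for the search block, found {count}"
        )
    return text.replace(search, replace, 1)


# A made-up source file and a single edit to it.
original = "def greet():\n    print('hello')\n"
edited = apply_search_replace(original, "print('hello')", "print('hello, world')")
print(edited)
```

Rejecting ambiguous or missing matches is one possible design choice: it makes a bad edit fail loudly instead of silently changing the wrong place in the file.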
Different models work best with different editing formats.
Aider is configured to use the best edit format for the popular OpenAI and Anthropic models
and the [other models recommended on the LLM page](https://aider.chat/docs/llms.html).

For lesser-known models, aider will default to using the "whole" editing format
since it is the easiest format for an LLM to use.
But it is also the least efficient, requiring the model to output the
entire source file for every set of changes.
This can raise costs and cause errors with LLMs that have smaller
context window sizes.
All of the other, diff-like edit formats are much more efficient in their use of tokens.

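To make the efficiency difference concrete, here is a small sketch that compares how much text a model would have to emit for a one-line change under a whole-file edit versus a search/replace edit. The 200-line file is made up, and character counts are used only as a crude stand-in for tokens; real token counts depend on the model's tokenizer and on the exact edit format markup, which is ignored here.

```python
# Hypothetical 200-line source file; the content is invented for illustration.
lines = [f"value_{i} = {i}" for i in range(200)]
original = "\n".join(lines) + "\n"

# A one-line change expressed as a search/replace pair.
search = "value_42 = 42"
replace = "value_42 = 43"
updated = original.replace(search, replace, 1)

# "whole" style: the model returns the entire updated file.
whole_output_chars = len(updated)

# "diff" style: the model returns only the search text and the replacement.
diff_output_chars = len(search) + len(replace)

print(f"whole: {whole_output_chars} characters")
print(f"diff:  {diff_output_chars} characters")
```

The gap grows with file size, since the whole-file output scales with the length of the file while the search/replace output scales with the size of the change.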
## Contributing benchmark results
Contributions of benchmark results are welcome!
See the
[benchmark README]()
for information on running aider's code editing benchmark.
Submit results by opening a PR with edits to the
[benchmark results CSV data file](https://github.com/paul-gauthier/aider/blob/main/_data/leaderboard.csv).
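
Before opening a PR, it may help to sanity-check new rows locally. The sketch below assumes the CSV has at least the columns the leaderboard table reads (`model`, `second`, `command`, `format`) and that `second` is a percentage between 0 and 100; the real file may contain more columns or different rules. The function name `check_rows` and the sample rows are hypothetical.

```python
import csv
import io

# Columns used by the leaderboard table template; assumed to be required.
REQUIRED_COLUMNS = ("model", "second", "command", "format")


def check_rows(csv_text: str) -> list[str]:
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        try:
            percent = float(row["second"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: 'second' is not a number")
            continue
        if not 0 <= percent <= 100:
            problems.append(f"line {line_no}: 'second' is outside 0-100")
    return problems


# Made-up sample data, not real benchmark results.
sample = (
    "model,second,command,format\n"
    "example-model,50.0,aider --model example-model,whole\n"
    "broken-model,abc,aider --model broken-model,diff\n"
)
for problem in check_rows(sample):
    print(problem)
```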