Commit graph

4038 commits

Author SHA1 Message Date
Paul Gauthier
44024c001d added aider.litellm 2024-05-08 08:09:23 -07:00
Paul Gauthier
9ff6770a04 refactored litellm to avoid duplicating workarounds 2024-05-08 08:05:15 -07:00
Paul Gauthier
b249177119 added settings for deepseek/deepseek-chat 2024-05-08 07:13:13 -07:00
Paul Gauthier
bec9d49d64 bumped deps 2024-05-08 07:07:55 -07:00
Paul Gauthier
9adbb0c722 layout 2024-05-08 07:01:58 -07:00
Paul Gauthier
ecf6e4df3a set version to 0.32.1-dev 2024-05-07 16:36:51 -07:00
Paul Gauthier
6c17d94101 version bump to 0.32.0 2024-05-07 16:36:15 -07:00
Paul Gauthier
6fc6d5056d copy 2024-05-07 16:28:05 -07:00
Paul Gauthier
85864e01bc added WizardLM-2 8x22B to leaderboard 2024-05-07 14:37:41 -07:00
Paul Gauthier
bbe8639160 renamed qwen 2024-05-07 13:57:45 -07:00
Paul Gauthier
8e272bf7c6 Updated HISTORY 2024-05-07 13:55:22 -07:00
Paul Gauthier
c29d4860ab Added qwen1.5-110b-chat to leaderboard 2024-05-07 13:55:12 -07:00
Paul Gauthier
70b1c0c20c load .env in benchmark.py 2024-05-07 13:32:19 -07:00
Paul Gauthier
dc7e61f3c9 added deepseek-chat v2 (diff) to leaderboard 2024-05-07 09:38:46 -07:00
Paul Gauthier
e9da4016c8 deepseek-chat with sys examples and sys reminder 2024-05-07 08:48:49 -07:00
Paul Gauthier
086ccd64ee updated history 2024-05-07 08:48:28 -07:00
Paul Gauthier
04b224962e send the reminder if unknown context window size 2024-05-07 08:42:33 -07:00
Paul Gauthier
ca0faf7cc7 Merge remote-tracking branch 'origin/main' 2024-05-07 06:27:20 -07:00
Paul Gauthier
ecca737803 added deepseek-chat v2 2024-05-07 06:26:39 -07:00
Paul Gauthier
b1cae73a85 cleaned up csv output 2024-05-07 05:59:31 -07:00
paul-gauthier
6fb4983d88 Update edit_leaderboard.yml 2024-05-06 20:42:23 -07:00
paul-gauthier
18761b70be Update index.md 2024-05-06 17:16:29 -07:00
paul-gauthier
6bc18b9591 Update index.md 2024-05-06 17:15:26 -07:00
Paul Gauthier
db994fb203 aider: Updated the reference to highlight_image in default.html to use page.highlight_image.
# Aider chat conversation:

USER: in default.html, how should i reference the `highlight_image` declared in the markdown front matter?

ASSISTANT: To reference the `highlight_image` front matter variable in `default.html`, you can use the Liquid syntax `{{ page.highlight_image }}`.

Here's the change in a *SEARCH/REPLACE block*:

_layouts/default.html
```html
<<<<<<< SEARCH
    {% if highlight_image %}
    <meta property="og:image" content="{{ site.url }}{{ highlight_image }}">
=======
    {% if page.highlight_image %}
    <meta property="og:image" content="{{ site.url }}{{ page.highlight_image }}">
>>>>>>> REPLACE
```

This checks if `page.highlight_image` is defined in the front matter of the current page. If so, it sets the `og:image` meta tag to the full URL of the highlight image by prepending `site.url` to `page.highlight_image`.
2024-05-06 16:26:56 -07:00
Paul Gauthier
0c839b835c layout 2024-05-06 16:24:39 -07:00
Paul Gauthier
64938eafee Use highlight as og:image if present 2024-05-06 16:21:22 -07:00
Paul Gauthier
eab864c518 copy 2024-05-06 14:38:02 -07:00
Paul Gauthier
c67a94ee99 copy 2024-05-06 14:37:03 -07:00
Paul Gauthier
a58bc98f51 copy 2024-05-06 12:42:14 -07:00
Paul Gauthier
92f53d4d7b copy 2024-05-06 12:41:02 -07:00
Paul Gauthier
673be4647c copy 2024-05-06 12:23:06 -07:00
Paul Gauthier
c1e2829d21 copy 2024-05-06 12:21:14 -07:00
Paul Gauthier
201bb7fb54 copy 2024-05-06 12:14:36 -07:00
Paul Gauthier
b5f2dcaeae copy 2024-05-06 12:06:24 -07:00
Paul Gauthier
ac46a7ee7d copy 2024-05-06 12:02:46 -07:00
Paul Gauthier
7a5e88c2e6 finishing row.percent_cases_well_formed 2024-05-06 11:58:16 -07:00
Paul Gauthier
4b903a3bb8 cleanup 2024-05-06 11:50:49 -07:00
Paul Gauthier
89db1d8673 aider: Added a second bar for row.percent_cases_well_formed to each graph.
# Aider chat conversation:

USER: in each graph, give each model a second bar showing row.percent_cases_well_formed

ASSISTANT: Here are the changes to add a second bar for `row.percent_cases_well_formed` to each graph:

docs/leaderboards/index.md
```markdown
<<<<<<< SEARCH
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code editing tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      }]
    };

    {% for row in edit_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_2 }});
    {% endfor %}
=======
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code editing tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      },
      {
        label: 'Percent cases well formed',
        data: [],
        backgroundColor: 'rgba(255, 99, 132, 0.2)',
        borderColor: 'rgba(255, 99, 132, 1)',
        borderWidth: 1
      }]
    };

    {% for row in edit_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_2 }});
      leaderboardData.datasets[1].data.push({{ row.percent_cases_well_formed }});
    {% endfor %}
>>>>>>> REPLACE
```

docs/leaderboards/index.md
```markdown
<<<<<<< SEARCH
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code refactoring tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      }]
    };

    {% for row in refac_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_1 }});
    {% endfor %}
=======
    var leaderboardData = {
      labels: [],
      datasets: [{
        label: 'Percent correct on code refactoring tasks',
        data: [],
        backgroundColor: 'rgba(54, 162, 235, 0.2)',
        borderColor: 'rgba(54, 162, 235, 1)',
        borderWidth: 1
      },
      {
        label: 'Percent cases well formed',
        data: [],
        backgroundColor: 'rgba(255, 99, 132, 0.2)',
        borderColor: 'rgba(255, 99, 132, 1)',
        borderWidth: 1
      }]
    };

    {% for row in refac_sorted %}
      leaderboardData.labels.push('{{ row.model }}');
      leaderboardData.datasets[0].data.push({{ row.pass_rate_1 }});
      leaderboardData.datasets[1].data.push({{ row.percent_cases_well_formed }});
    {% endfor %}
>>>>>>> REPLACE
```

This adds a second dataset to each chart's `leaderboardData`, with a label of "Percent cases well formed". It sets different colors for the bars and borders.

Then in the loop that populates the data, it pushes the `row.percent_cases_well_formed` value into the second dataset.

With these changes, each model will have two bars - one for the pass rate and one for the percent of cases that were well formed.
2024-05-06 11:50:10 -07:00
Paul Gauthier
92ea428b82 cleaned up edit data 2024-05-06 11:48:10 -07:00
Paul Gauthier
17b5dbe804 moved edit results to yaml 2024-05-06 11:44:29 -07:00
Paul Gauthier
fc3a43ef41 completed moving refac to yml 2024-05-06 11:25:14 -07:00
Paul Gauthier
e58ce69154 move refac data to yml 2024-05-06 11:21:38 -07:00
Paul Gauthier
a7b08c7354 format output as yaml 2024-05-06 11:15:19 -07:00
Paul Gauthier
3162d42262 cleanup 2024-05-06 10:46:09 -07:00
Paul Gauthier
5fb7a323ec refactored plots 2024-05-06 10:44:34 -07:00
Paul Gauthier
e61857ef09 summarize as many messages as will fit into the summarizer context 2024-05-06 09:47:14 -07:00
Paul Gauthier
e51e0219ee aider: Add tqdm progress bar to summarize_chat_history_markdown loop.
# Aider chat conversation:

USER: add a tqdm to `summarize_chat_history_markdown` loop

ASSISTANT:
```diff
--- aider/history.py
+++ aider/history.py
@@ ... @@
 from aider.sendchat import simple_send_with_retries
+from tqdm import tqdm

 class ChatSummary:
     def __init__(self, model=None, max_tokens=1024):
@@ ... @@
     def summarize_chat_history_markdown(self, text):
         messages = []
         assistant = []
+        lines = text.splitlines(keepends=True)
+        for line in tqdm(lines, desc="Summarizing chat history"):
-        for line in text.splitlines(keepends=True):
             if line.startswith("# "):
                 continue
             if line.startswith(">"):
                 continue
             if line.startswith("#### /"):
                 continue

             if line.startswith("#### "):
                 if assistant:
                     assistant = "".join(assistant)
                     if assistant.strip():
                         messages.append(dict(role="assistant", content=assistant))
                     assistant = []

                 content = line[5:]
                 if content.strip() and content.strip() != "<blank>":
                     messages.append(dict(role="user", content=line[5:]))
                 continue

             assistant.append(line)

         summary = self.summarize(messages[-40:])
         return summary
```

This diff adds the `tqdm` progress bar to the `summarize_chat_history_markdown` method in the `aider/history.py` file. It wraps the loop that processes the chat history lines with `tqdm`, providing a description for the progress bar.
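
As a minimal sketch of the same pattern outside aider, the loop below wraps the list of lines in `tqdm` and parses a small markdown transcript into role-tagged messages. The function name `parse_chat_markdown` and the sample text are hypothetical. The fallback to a no-op wrapper when `tqdm` is not installed is an assumption added so the snippet runs anywhere.

```python
try:
    from tqdm import tqdm
except ImportError:  # assumption: degrade to a plain loop if tqdm is missing
    def tqdm(iterable, **kwargs):
        return iterable


def parse_chat_markdown(text):
    """Split an aider-style markdown transcript into user/assistant messages."""
    messages = []
    assistant = []
    # Materialize the lines first so tqdm knows the total and can show progress.
    lines = text.splitlines(keepends=True)
    for line in tqdm(lines, desc="Parsing chat history"):
        # Skip headings, quoted tool output, and slash commands.
        if line.startswith("# ") or line.startswith(">") or line.startswith("#### /"):
            continue
        if line.startswith("#### "):
            # A user line ends any assistant reply collected so far.
            reply = "".join(assistant)
            if reply.strip():
                messages.append(dict(role="assistant", content=reply))
            assistant = []
            content = line[5:]
            if content.strip() and content.strip() != "<blank>":
                messages.append(dict(role="user", content=content))
            continue
        assistant.append(line)
    return messages


sample = "# session\n#### add a progress bar\nSure, done.\n#### /help\n#### thanks\n"
parsed = parse_chat_markdown(sample)
```

Calling `splitlines` before the loop, rather than iterating over it directly, gives `tqdm` a list with a known length, which is what lets it render a percentage instead of a bare counter.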
2024-05-06 09:31:32 -07:00
Paul Gauthier
64470955d4 pass max_chat_history_tokens into ChatSummary 2024-05-06 09:29:21 -07:00
Paul Gauthier
1747525418 aider: Added max_chat_history_tokens as an explicit argument to the Coder.__init__ method and passed args.max_chat_history_tokens when creating the Coder instance in main().
# Aider chat conversation:

USER: don't use kwargs, add it as an explicit arg

ASSISTANT: Here is the *SEARCH/REPLACE block* to add `max_chat_history_tokens` as an explicit argument instead of using `**kwargs`:

aider/coders/base_coder.py
```python
<<<<<<< SEARCH
    def __init__(
        self,
        main_model,
        io,
        fnames=None,
        git_dname=None,
        pretty=True,
        show_diffs=False,
        auto_commits=True,
        dirty_commits=True,
        dry_run=False,
        map_tokens=1024,
        verbose=False,
        assistant_output_color="blue",
        code_theme="default",
        stream=True,
        use_git=True,
        voice_language=None,
        aider_ignore_file=None,
        cur_messages=None,
        done_messages=None,
    ):
=======
    def __init__(
        self,
        main_model,
        io,
        fnames=None,
        git_dname=None,
        pretty=True,
        show_diffs=False,
        auto_commits=True,
        dirty_commits=True,
        dry_run=False,
        map_tokens=1024,
        verbose=False,
        assistant_output_color="blue",
        code_theme="default",
        stream=True,
        use_git=True,
        voice_language=None,
        aider_ignore_file=None,
        cur_messages=None,
        done_messages=None,
        max_chat_history_tokens=None,
    ):
>>>>>>> REPLACE
```

aider/main.py
```python
<<<<<<< SEARCH
        coder = Coder.create(
            main_model=main_model,
            edit_format=args.edit_format,
            io=io,
            ##
            fnames=fnames,
            git_dname=git_dname,
            pretty=args.pretty,
            show_diffs=args.show_diffs,
            auto_commits=args.auto_commits,
            dirty_commits=args.dirty_commits,
            dry_run=args.dry_run,
            map_tokens=args.map_tokens,
            verbose=args.verbose,
            assistant_output_color=args.assistant_output_color,
            code_theme=args.code_theme,
            stream=args.stream,
            use_git=args.git,
            voice_language=args.voice_language,
            aider_ignore_file=args.aiderignore,
        )
=======
        coder = Coder.create(
            main_model=main_model,
            edit_format=args.edit_format,
            io=io,
            ##
            fnames=fnames,
            git_dname=git_dname,
            pretty=args.pretty,
            show_diffs=args.show_diffs,
            auto_commits=args.auto_commits,
            dirty_commits=args.dirty_commits,
            dry_run=args.dry_run,
            map_tokens=args.map_tokens,
            verbose=args.verbose,
            assistant_output_color=args.assistant_output_color,
            code_theme=args.code_theme,
            stream=args.stream,
            use_git=args.git,
            voice_language=args.voice_language,
            aider_ignore_file=args.aiderignore,
            max_chat_history_tokens=args.max_chat_history_tokens,
        )
>>>>>>> REPLACE
```

This adds `max_chat_history_tokens` as an explicit argument to the `Coder.__init__` method, and passes `args.max_chat_history_tokens` when creating the `Coder` instance in `main()`.
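
The effect of an explicit argument can be sketched as follows. This is a simplified stand-in, not aider's actual `Coder` class: `FakeModel`, `SketchCoder`, and the values 1024 and 2048 are hypothetical, chosen only for illustration.

```python
class FakeModel:
    """Hypothetical stand-in for a model object with a default history budget."""

    max_chat_history_tokens = 1024


class SketchCoder:
    def __init__(self, main_model, max_chat_history_tokens=None):
        # An explicit keyword argument is visible in the signature, and a
        # misspelled name raises TypeError instead of being silently ignored
        # the way an unknown key in **kwargs would be.
        if max_chat_history_tokens is None:
            max_chat_history_tokens = main_model.max_chat_history_tokens
        self.max_chat_history_tokens = max_chat_history_tokens


default_coder = SketchCoder(FakeModel())
custom_coder = SketchCoder(FakeModel(), max_chat_history_tokens=2048)
```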
2024-05-06 09:25:04 -07:00
Paul Gauthier
3a79b6230b aider: Added a --max-chat-history-tokens switch and passed it to ChatSummary in Coder.
# Aider chat conversation:

USER: add a --max-chat-history-tokens switch, pass it into Coder to `ChatSummary`

ASSISTANT: Here are the changes to add a --max-chat-history-tokens switch and pass it to ChatSummary in Coder:

aider/args.py
```python
<<<<<<< SEARCH
    parser.add_argument(
        "--map-tokens",
        type=int,
        default=1024,
        help="Number of tokens to use for the repo map. Set to 0 to disable.",
    )
=======
    parser.add_argument(
        "--map-tokens",
        type=int,
        default=1024,
        help="Number of tokens to use for the repo map. Set to 0 to disable.",
    )
    parser.add_argument(
        "--max-chat-history-tokens",
        type=int,
        default=None,
        help="Maximum number of tokens to use for chat history. If not specified, uses the model's max_chat_history_tokens.",
    )
>>>>>>> REPLACE
```

aider/coders/base_coder.py
```python
<<<<<<< SEARCH
        self.summarizer = ChatSummary(
            self.main_model.weak_model,
            self.main_model.max_chat_history_tokens,
        )
=======
        max_chat_history_tokens = self.main_model.max_chat_history_tokens
        if kwargs.get("max_chat_history_tokens") is not None:
            max_chat_history_tokens = kwargs["max_chat_history_tokens"]

        self.summarizer = ChatSummary(
            self.main_model.weak_model,
            max_chat_history_tokens,
        )
>>>>>>> REPLACE
```

aider/main.py
```python
<<<<<<< SEARCH
            map_tokens=args.map_tokens,
=======
            map_tokens=args.map_tokens,
            max_chat_history_tokens=args.max_chat_history_tokens,
>>>>>>> REPLACE
```

This adds the new command line argument `--max-chat-history-tokens` which defaults to None. If not specified, it will use the model's default `max_chat_history_tokens`.

It passes this argument from `main()` into the `Coder` constructor.

In the `Coder` constructor, it checks if `max_chat_history_tokens` was provided in the `kwargs`. If so, it uses that value, otherwise it falls back to the model's default.

Let me know if you have any other questions!
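
A minimal, self-contained sketch of how such a switch behaves with `argparse` follows. The parser here is built from scratch for illustration and is not aider's real argument parser. The helper `resolve_limit` and the fallback value 1024 are hypothetical.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--max-chat-history-tokens",
        type=int,
        default=None,  # None means "not given on the command line"
        help="Maximum number of tokens to use for chat history.",
    )
    return parser


def resolve_limit(args, model_default):
    # argparse converts the dashes in the flag name to underscores.
    if args.max_chat_history_tokens is not None:
        return args.max_chat_history_tokens
    return model_default


parser = build_parser()
unset = resolve_limit(parser.parse_args([]), model_default=1024)
overridden = resolve_limit(
    parser.parse_args(["--max-chat-history-tokens", "4096"]),
    model_default=1024,
)
```

Using `None` as the default, rather than a number, keeps "flag omitted" distinguishable from any value a user could pass, including 0.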

USER: InvalidEditBlock: edit failed!

aider/args.py does not contain the *exact chunk* of SEARCH lines you specified.
Try again.
DO NOT skip blank lines, comments, docstrings, etc!
The SEARCH block needs to be EXACTLY the same as the lines in aider/args.py with nothing missing!

aider/args.py does not contain these 6 exact lines in a row:
```
    parser.add_argument(
        "--map-tokens",
        type=int,
        default=1024,
        help="Number of tokens to use for the repo map. Set to 0 to disable.",
    )
```

ASSISTANT: I apologize for the mistake. Here is the corrected change for aider/args.py:

aider/args.py
```python
<<<<<<< SEARCH
    group.add_argument(
        "--map-tokens",
        type=int,
        default=1024,
        help="Max number of tokens to use for repo map, use 0 to disable (default: 1024)",
    )
=======
    group.add_argument(
        "--map-tokens",
        type=int,
        default=1024,
        help="Max number of tokens to use for repo map, use 0 to disable (default: 1024)",
    )
    group.add_argument(
        "--max-chat-history-tokens",
        type=int,
        default=None,
        help="Maximum number of tokens to use for chat history. If not specified, uses the model's max_chat_history_tokens.",
    )
>>>>>>> REPLACE
```
2024-05-06 09:23:37 -07:00