Commit graph

4318 commits

Author SHA1 Message Date
Paul Gauthier
a57dd90a49 fix unicode err 2024-05-23 15:02:31 -07:00
Paul Gauthier
cb4384edc7 aider: Set the environment variable PYTHONIOENCODING to utf-8 in the test to prevent UnicodeEncodeError.
# Aider chat conversation:

USER: I am getting this testing error on Windows: ======================================================================
ERROR: test_get_input_is_a_directory_error (test_io.TestInputOutput.test_get_input_is_a_directory_error)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\unittest\mock.py", line 1390, in patched
    return func(*newargs, **newkeywargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\aider\aider\tests\test_io.py", line 60, in test_get_input_is_a_directory_error
    result = io.get_input(root, rel_fnames, addable_rel_fnames, commands)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\aider\aider\aider\io.py", line 186, in get_input
    self.console.rule(**style)
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\site-packages\rich\console.py", line 1580, in rule
    self.print(rule)
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\site-packages\rich\console.py", line 1673, in print
    with self:
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\site-packages\rich\console.py", line 865, in __exit__
    self._exit_buffer()
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\site-packages\rich\console.py", line 823, in _exit_buffer
    self._check_buffer()
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\site-packages\rich\console.py", line 2027, in _check_buffer
    legacy_windows_render(buffer, LegacyWindowsTerm(self.file))
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\site-packages\rich\_windows_renderer.py", line 17, in legacy_windows_render
    term.write_styled(text, style)
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\site-packages\rich\_win32_console.py", line 442, in write_styled
    self.write_text(text)
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\site-packages\rich\_win32_console.py", line 403, in write_text
    self.write(text)
  File "C:\hostedtoolcache\windows\Python\3.12.3\x64\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-78: character maps to <undefined>

ASSISTANT: The error you're encountering is due to the `charmap` codec on Windows not being able to encode certain Unicode characters. To fix this, we can set the environment variable `PYTHONIOENCODING` to `utf-8` in the test to ensure that the console output uses UTF-8 encoding.

Let's update the test to set this environment variable:

tests/test_io.py
```python
<<<<<<< SEARCH
        # Mock the PromptSession to simulate user input
=======
        # Set the environment variable to ensure UTF-8 encoding
        os.environ["PYTHONIOENCODING"] = "utf-8"

        # Mock the PromptSession to simulate user input
>>>>>>> REPLACE
```

This change will ensure that the console output uses UTF-8 encoding, which should prevent the `UnicodeEncodeError` from occurring.
2024-05-23 14:21:13 -07:00
Paul Gauthier
52ac35fcd2 restore --show-diffs functionality #607 2024-05-23 14:07:20 -07:00
Paul Gauthier
2ed5e45677 fixed #603 2024-05-23 14:01:45 -07:00
Paul Gauthier
9d20b5f43a use swe-bench-lite.txt 2024-05-23 13:56:17 -07:00
Paul Gauthier
a395d9bb5c fix double lines for confirm_ask 2024-05-23 13:55:53 -07:00
Paul Gauthier
670f78a874 aider: Added a test case for the get_input method in the InputOutput class.
# Aider chat conversation:

USER: add a test case for this

ASSISTANT: To add a test case for the `get_input` method in the `InputOutput` class, we need to create a new test file. Let's create a new file named `test_io.py` in the `tests` directory and add the necessary test case.

Here is the *SEARCH/REPLACE* block to create the new test file:

tests/test_io.py
```python
<<<<<<< SEARCH
=======
import unittest
from unittest.mock import patch, MagicMock
from aider.io import InputOutput

class TestInputOutput(unittest.TestCase):
    @patch('aider.io.PromptSession')
    def test_get_input_is_a_directory_error(self, MockPromptSession):
        # Mock the PromptSession to simulate user input
        mock_session = MockPromptSession.return_value
        mock_session.prompt.return_value = "test input"

        io = InputOutput()
        root = "/"
        rel_fnames = ["existing_file.txt"]
        addable_rel_fnames = ["new_file.txt"]
        commands = MagicMock()

        # Simulate IsADirectoryError
        with patch('aider.io.open', side_effect=IsADirectoryError):
            result = io.get_input(root, rel_fnames, addable_rel_fnames, commands)
            self.assertEqual(result, "test input")

if __name__ == '__main__':
    unittest.main()
>>>>>>> REPLACE
```

This test case uses the `unittest` framework and mocks the `PromptSession` to simulate user input. It also patches the `open` function to raise an `IsADirectoryError` and verifies that the `get_input` method handles it correctly.

USER: I added these files to the chat: tests/test_io.py.

If you need to propose edits to other existing files not already added to the chat, you *MUST* tell the me their full path names and ask me to *add the files to the chat*. End your reply and wait for my approval. You can keep asking if you then decide you need to edit more files.

ASSISTANT: Let's add the new test case to the existing `tests/test_io.py` file to ensure that the `get_input` method in the `InputOutput` class handles the `IsADirectoryError` exception appropriately.

Here is the *SEARCH/REPLACE* block to add the new test case:

tests/test_io.py
```python
<<<<<<< SEARCH
import os
import unittest
from pathlib import Path
from unittest.mock import patch

from aider.io import AutoCompleter, InputOutput
from aider.utils import ChdirTemporaryDirectory

class TestInputOutput(unittest.TestCase):
    def test_no_color_environment_variable(self):
        with patch.dict(os.environ, {"NO_COLOR": "1"}):
            io = InputOutput()
            self.assertFalse(io.pretty)

    def test_autocompleter_with_non_existent_file(self):
        root = ""
        rel_fnames = ["non_existent_file.txt"]
        addable_rel_fnames = []
        commands = None
        autocompleter = AutoCompleter(root, rel_fnames, addable_rel_fnames, commands, "utf-8")
        self.assertEqual(autocompleter.words, set(rel_fnames))

    def test_autocompleter_with_unicode_file(self):
        with ChdirTemporaryDirectory():
            root = ""
            fname = "file.py"
            rel_fnames = [fname]
            addable_rel_fnames = []
            commands = None
            autocompleter = AutoCompleter(root, rel_fnames, addable_rel_fnames, commands, "utf-8")
            self.assertEqual(autocompleter.words, set(rel_fnames))

            Path(fname).write_text("def hello(): pass\n")
            autocompleter = AutoCompleter(root, rel_fnames, addable_rel_fnames, commands, "utf-8")
            self.assertEqual(autocompleter.words, set(rel_fnames + ["hello"]))

            encoding = "utf-16"
            some_content_which_will_error_if_read_with_encoding_utf8 = "ÅÍÎÏ".encode(encoding)
            with open(fname, "wb") as f:
                f.write(some_content_which_will_error_if_read_with_encoding_utf8)

            autocompleter = AutoCompleter(root, rel_fnames, addable_rel_fnames, commands, "utf-8")
            self.assertEqual(autocompleter.words, set(rel_fnames))

=======
import os
import unittest
from pathlib import Path
from unittest.mock import patch, MagicMock

from aider.io import AutoCompleter, InputOutput
from aider.utils import ChdirTemporaryDirectory

class TestInputOutput(unittest.TestCase):
    def test_no_color_environment_variable(self):
        with patch.dict(os.environ, {"NO_COLOR": "1"}):
            io = InputOutput()
            self.assertFalse(io.pretty)

    def test_autocompleter_with_non_existent_file(self):
        root = ""
        rel_fnames = ["non_existent_file.txt"]
        addable_rel_fnames = []
        commands = None
        autocompleter = AutoCompleter(root, rel_fnames, addable_rel_fnames, commands, "utf-8")
        self.assertEqual(autocompleter.words, set(rel_fnames))

    def test_autocompleter_with_unicode_file(self):
        with ChdirTemporaryDirectory():
            root = ""
            fname = "file.py"
            rel_fnames = [fname]
            addable_rel_fnames = []
            commands = None
            autocompleter = AutoCompleter(root, rel_fnames, addable_rel_fnames, commands, "utf-8")
            self.assertEqual(autocompleter.words, set(rel_fnames))

            Path(fname).write_text("def hello(): pass\n")
            autocompleter = AutoCompleter(root, rel_fnames, addable_rel_fnames, commands, "utf-8")
            self.assertEqual(autocompleter.words, set(rel_fnames + ["hello"]))

            encoding = "utf-16"
            some_content_which_will_error_if_read_with_encoding_utf8 = "ÅÍÎÏ".encode(encoding)
            with open(fname, "wb") as f:
                f.write(some_content_which_will_error_if_read_with_encoding_utf8)

            autocompleter = AutoCompleter(root, rel_fnames, addable_rel_fnames, commands, "utf-8")
            self.assertEqual(autocompleter.words, set(rel_fnames))

    @patch('aider.io.PromptSession')
    def test_get_input_is_a_directory_error(self, MockPromptSession):
        # Mock the PromptSession to simulate user input
        mock_session = MockPromptSession.return_value
        mock_session.prompt.return_value = "test input"

        io = InputOutput()
        root = "/"
        rel_fnames = ["existing_file.txt"]
        addable_rel_fnames = ["new_file.txt"]
        commands = MagicMock()

        # Simulate IsADirectoryError
        with patch('aider.io.open', side_effect=IsADirectoryError):
            result = io.get_input(root, rel_fnames, addable_rel_fnames, commands)
            self.assertEqual(result, "test input")
>>>>>>> REPLACE
```

This will add the new test case to the existing `tests/test_io.py` file.
2024-05-23 13:49:37 -07:00
Paul Gauthier
cda9565cae aider: Handle IsADirectoryError when trying to create a file where a directory exists.
# Aider chat conversation:

USER: https://github.com/paul-gauthier/aider/issues/606

https://github.com/paul-gauthier/aider/issues/606:

[Skip to content](#start-of-content)

## Navigation Menu

Toggle navigation

[](https://github.com/)

[Sign
in](/login?return_to=https%3A%2F%2Fgithub.com%2Fpaul-gauthier%2Faider%2Fissues%2F606)

-   Product

    -   [](https://github.com/features/actions)

        Actions

        Automate any workflow

    -   [](https://github.com/features/packages)

        Packages

        Host and manage packages

    -   [](https://github.com/features/security)

        Security

        Find and fix vulnerabilities

    -   [](https://github.com/features/codespaces)

        Codespaces

        Instant dev environments

    -   [](https://github.com/features/copilot)

        Copilot

        Write better code with AI

    -   [](https://github.com/features/code-review)

        Code review

        Manage code changes

    -   [](https://github.com/features/issues)

        Issues

        Plan and track work

    -   [](https://github.com/features/discussions)

        Discussions

        Collaborate outside of code

    Explore
    -   [All features](https://github.com/features)
    -   [Documentation](https://docs.github.com)
    -   [GitHub Skills](https://skills.github.com)
    -   [Blog](https://github.blog)

-   Solutions

    For
    -   [Enterprise](https://github.com/enterprise)
    -   [Teams](https://github.com/team)
    -   [Startups](https://github.com/enterprise/startups)
    -   [Education](https://education.github.com)

    By Solution
    -   [CI/CD & Automation](https://github.com/solutions/ci-cd)
    -   [DevOps](https://github.com/solutions/devops)
    -   [DevSecOps](https://resources.github.com/devops/fundamentals/devsecops)

    Resources
    -   [Learning Pathways](https://resources.github.com/learn/pathways)
    -   [White papers, Ebooks, Webinars](https://resources.github.com)
    -   [Customer Stories](https://github.com/customer-stories)
    -   [Partners](https://partner.github.com)

-   Open Source

    -   [](https://github.com/sponsors)

        GitHub Sponsors

        Fund open source developers

    -   [](https://github.com/readme)

        The ReadME Project

        GitHub community articles

    Repositories
    -   [Topics](https://github.com/topics)
    -   [Trending](https://github.com/trending)
    -   [Collections](https://github.com/collections)

-   [Pricing](https://github.com/pricing)

Search or jump to\...

# Search code, repositories, users, issues, pull requests\...

Search

Clear

[Search syntax
tips](https://docs.github.com/search-github/github-code-search/understanding-github-code-search-syntax)

# Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Cancel

Submit feedback

# Saved searches

## Use saved searches to filter your results more quickly

Name

Query

To see all available qualifiers, see our
[documentation](https://docs.github.com/search-github/github-code-search/understanding-github-code-search-syntax).

Cancel

Create saved search

[Sign
in](/login?return_to=https%3A%2F%2Fgithub.com%2Fpaul-gauthier%2Faider%2Fissues%2F606)

[Sign
up](/signup?ref_cta=Sign+up&ref_loc=header+logged+out&ref_page=%2F%3Cuser-name%3E%2F%3Crepo-name%3E%2Fvoltron%2Fissues_fragments%2Fissue_layout&source=header-repo&source_repo=paul-gauthier%2Faider)

You signed in with another tab or window. [Reload]() to refresh your
session. You signed out in another tab or window. [Reload]() to refresh
your session. You switched accounts on another tab or window. [Reload]()
to refresh your session.

Dismiss alert

{{ message }}

::: {role="main"}

[paul-gauthier](/paul-gauthier) / **[aider](/paul-gauthier/aider)**
Public

-   [Notifications](/login?return_to=%2Fpaul-gauthier%2Faider)

-   [Fork 1k](/login?return_to=%2Fpaul-gauthier%2Faider)

-

    [ Star 10.2k](/login?return_to=%2Fpaul-gauthier%2Faider)

-   [Code ](/paul-gauthier/aider)
-   [Issues 55](/paul-gauthier/aider/issues)
-   [Pull requests 8](/paul-gauthier/aider/pulls)
-   [Actions ](/paul-gauthier/aider/actions)
-   [Projects 0](/paul-gauthier/aider/projects)
-   [Security](/paul-gauthier/aider/security)
-   [Insights ](/paul-gauthier/aider/pulse)

Additional navigation options

-   [ Code ](/paul-gauthier/aider)
-   [ Issues ](/paul-gauthier/aider/issues)
-   [ Pull requests ](/paul-gauthier/aider/pulls)
-   [ Actions ](/paul-gauthier/aider/actions)
-   [ Projects ](/paul-gauthier/aider/projects)
-   [ Security ](/paul-gauthier/aider/security)
-   [ Insights ](/paul-gauthier/aider/pulse)

New issue

**Have a question about this project?** Sign up for a free GitHub
account to open an issue and contact its maintainers and the community.

[Sign up for
GitHub](/signup?return_to=%2Fpaul-gauthier%2Faider%2Fissues%2Fnew%2Fchoose)

By clicking "Sign up for GitHub", you agree to our [terms of
service](https://docs.github.com/terms) and [privacy
statement](https://docs.github.com/privacy). We'll occasionally send you
account related emails.

Already on GitHub? [Sign
in](/login?return_to=%2Fpaul-gauthier%2Faider%2Fissues%2Fnew%2Fchoose)
to your account

[Jump to bottom](#issue-comment-box)

# Trying to create a file where a directory exists causes IsADirectoryError and aider to exit #606

Open

[harleypig](/harleypig) opened this issue May 18, 2024 · 0 comments

Open

# [Trying to create a file where a directory exists causes IsADirectoryError and aider to exit](#top) #606

[harleypig](/harleypig) opened this issue May 18, 2024 · 0 comments

Labels

[bug](/paul-gauthier/aider/labels/bug) Something isn\'t working

## Comments

[](/harleypig)

Copy link

###

**[harleypig](/harleypig)** commented [May 18, 2024](#issue-2304113184)
•

edited

+-----------------------------------------------------------------------+
| ### Issue                                                             |
|                                                                       |
| I wasn\'t thinking and asked aider to create a file with the same     |
| name as a directory and got unexpectedly dumped back to the command   |
| line.                                                                 |
|                                                                       |
|                                                                       |
|                                                                       |
|     $ mkdir -p testoops/oops                                          |
|     $ cd testoops                                                     |
|     $ aider                                                           |
|     Aider v0.35.0                                                     |
|                                                                       |
|     Models: openai/gpt-4o with diff edit format, weak model gpt-      |
| 3.5-turbo                                                             |
|     Git repo: .git with 0 files                                       |
|                                                                       |
|     Repo-map: using 2048 tokens                                       |
|                                                                       |
|     Use /help to see in-chat commands, run with --help to see cm      |
| d line args                                                           |
|     ────────────────────────────────────────────────────────────      |
| ───────────────────────────────────────────────────────────────────── |
|     > Create a text file named './oops' that has 'this wont work      |
| '                                                                     |
|                                                                       |
|     Here is the SEARCH/REPLACE block to create the file ./oops w      |
| ith the content this wont work:                                       |
|                                                                       |
|     ./oops                                                            |
|                                                                       |
|                                                                       |
|                                                                       |
|      <<<<<<< SEARCH                                                   |
|                                                                       |
|      =======                                                          |
|                                                                       |
|      this wont work                                                   |
|                                                                       |
|      >>>>>>> REPLACE                                                  |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|     Allow edits to ./oops which was not previously added to chat? y   |
|     Committing ./oops before applying edits.                          |
|     Cmd('git') failed due to: exit code(1)                            |
|       cmdline: git commit -m Add                                      |
| ed ./oops file. --no-verify -- /home/harleypig/projects/testoops/oops |
|       stderr: 'error: pathspec '/home/harl                            |
| eypig/projects/testoops/oops' did not match any file(s) known to git' |
|     ────────────────────────────────────────────────────────────      |
| ───────────────────────────────────────────────────────────────────── |
|     Traceback (most recent call last):                                |
|       File "/usr/local/bin/aider", line 8, in <module>                |
|         sys.exit(main())                                              |
|       File "/home/harleypig/.local/share/                             |
| python/lib/python3.10/site-packages/aider/main.py", line 408, in main |
|         coder.run()                                                   |
|       File "/home/harleypig/.local/share/python/lib/p                 |
| ython3.10/site-packages/aider/coders/base_coder.py", line 496, in run |
|         new_user_message = self.run_loop()                            |
|       File "/home/harleypig/.local/share/python/lib/python            |
| 3.10/site-packages/aider/coders/base_coder.py", line 512, in run_loop |
|         inp = self.io.get_input(                                      |
|       File "/home/harleypig/.local/share/pyt                          |
| hon/lib/python3.10/site-packages/aider/io.py", line 210, in get_input |
|         completer_instance = AutoCompleter(                           |
|       File "/home/harleypig/.local/share/p                            |
| ython/lib/python3.10/site-packages/aider/io.py", line 47, in __init__ |
|         with open(fname, "r", encoding=self.encoding) as f:           |
|     IsADirectoryError                                                 |
| : [Errno 21] Is a directory: '/home/harleypig/projects/testoops/oops' |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
| ### Version and model info                                            |
|                                                                       |
| In example above.                                                     |
+-----------------------------------------------------------------------+
|                                                                       |
|                                                                       |
| The text was updated successfully, but these errors were encountered: |
|                                                                       |
|                                                                       |
+-----------------------------------------------------------------------+

All reactions

[paul-gauthier](/paul-gauthier) added the
[bug](/paul-gauthier/aider/labels/bug) Something isn\'t working label
[May 20, 2024](#event-12868798902)

[Sign up for free](/join?source=comment-repo) **to join this
conversation on GitHub**. Already have an account? [Sign in to
comment](/login?return_to=https%3A%2F%2Fgithub.com%2Fpaul-gauthier%2Faider%2Fissues%2F606)

Assignees

No one assigned

Labels

[bug](/paul-gauthier/aider/labels/bug) Something isn\'t working

Projects

None yet

Milestone

No milestone

Development

No branches or pull requests

2 participants

:::

## Footer

[](https://github.com) © 2024 GitHub, Inc.

### Footer navigation

-   [Terms](https://docs.github.com/site-policy/github-terms/github-terms-of-service)
-   [Privacy](https://docs.github.com/site-policy/privacy-policies/github-privacy-statement)
-   [Security](https://github.com/security)
-   [Status](https://www.githubstatus.com/)
-   [Docs](https://docs.github.com/)
-   [Contact](https://support.github.com?tags=dotcom-footer)
-   Manage cookies
-   Do not share my personal information

You can't perform that action at this time.

ASSISTANT: To address the issue of `IsADirectoryError` when trying to create a file where a directory exists, we need to handle this exception in the relevant part of the code. The most likely file to need changes is `aider/io.py`.

Please add `aider/io.py` to the chat so I can propose the necessary changes.

USER: I added these files to the chat: aider/io.py.

If you need to propose edits to other existing files not already added to the chat, you *MUST* tell the me their full path names and ask me to *add the files to the chat*. End your reply and wait for my approval. You can keep asking if you then decide you need to edit more files.

ASSISTANT: To handle the `IsADirectoryError` when trying to create a file where a directory exists, we need to catch this exception and handle it appropriately in the `get_input` method of the `InputOutput` class in `aider/io.py`.

Here are the changes:

aider/io.py
```python
<<<<<<< SEARCH
            try:
                with open(fname, "r", encoding=self.encoding) as f:
                    content = f.read()
            except (FileNotFoundError, UnicodeDecodeError):
                continue
=======
            try:
                with open(fname, "r", encoding=self.encoding) as f:
                    content = f.read()
            except (FileNotFoundError, UnicodeDecodeError, IsADirectoryError):
                continue
>>>>>>> REPLACE
```

This change will ensure that if the code tries to open a directory as if it were a file, it will catch the `IsADirectoryError` and continue without crashing.
2024-05-23 13:48:44 -07:00
Paul Gauthier
a4a6066858 handle missing max_map_tokens/max_context_window 2024-05-23 13:46:08 -07:00
Paul Gauthier
6345cd3734 cleanup 2024-05-23 13:43:35 -07:00
Paul Gauthier
fb76895eb1 Merge branch 'main' into swe-bench 2024-05-23 13:36:23 -07:00
Paul Gauthier
17381bed58 copy 2024-05-23 13:13:34 -07:00
Paul Gauthier
b8feaed5f9 aider: Reversed the order of the bars from short to tall.
# Aider chat conversation:

USER: reverse the order of the bars, so they go short->tall

ASSISTANT: To reverse the order of the bars from short to tall, we can reverse the order of the `models` and `pass_rates` lists before plotting:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    models = []
    pass_rates = []

    for line in lines:
        if line.strip():
            pass_rate, model = line.split("%")
            model = model.strip()
            model = model.replace("|", "\n")
            models.append(model.strip())
            pass_rates.append(float(pass_rate.strip()))
=======
    models = []
    pass_rates = []

    for line in lines:
        if line.strip():
            pass_rate, model = line.split("%")
            model = model.strip()
            model = model.replace("|", "\n")
            models.insert(0, model.strip())
            pass_rates.insert(0, float(pass_rate.strip()))
>>>>>>> REPLACE
```

This change uses `insert(0, ...)` instead of `append()` to add each model and pass rate to the beginning of their respective lists. This will effectively reverse the order, so the bars will be plotted from shortest to tallest.
2024-05-23 13:12:57 -07:00
Paul Gauthier
bfe91ae191 copy 2024-05-23 12:11:12 -07:00
Paul Gauthier
bb2d5b2fbf copy 2024-05-23 11:34:42 -07:00
Paul Gauthier
b56aa129dc copy 2024-05-23 10:27:49 -07:00
Paul Gauthier
c0f5cbb9b5 copy 2024-05-23 10:03:43 -07:00
Paul Gauthier
5ff4deb12e aider: Increased font weight for "Aider" labels to make them more visually distinct.
# Aider chat conversation:

USER: there's no visible difference bwteen the bold and light fonts?

ASSISTANT: To make the bold font more visible, we can increase the font weight. Here are the changes:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
        fontweight = "bold" if "Aider" in model else "normal"
=======
        fontweight = 800 if "Aider" in model else "normal"
>>>>>>> REPLACE
```

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
            label.set_fontweight("bold")
=======
            label.set_fontweight(800)
>>>>>>> REPLACE
```

This sets the font weight to 800 for the "Aider" labels, which should make them bolder and more visibly distinct from the other labels.
2024-05-23 10:01:57 -07:00
Paul Gauthier
a03e36f7f7 Increased the font size of "Aider" labels in the SWE benchmark plot. 2024-05-23 10:01:55 -07:00
Paul Gauthier
ed43cd278a aider: Ensure bold and light fonts are visibly different by using "normal" instead of "light" for the non-Aider bars.
# Aider chat conversation:

USER: there's no visible difference bwteen the bold and light fonts?

ASSISTANT: To ensure that the bold and light fonts are visibly different, we can use "normal" instead of "light" for the non-Aider bars. This should provide a more noticeable contrast.

Here are the *SEARCH/REPLACE* blocks:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
        fontweight = "bold" if "Aider" in model else "light"
=======
        fontweight = "bold" if "Aider" in model else "normal"
>>>>>>> REPLACE
```

Let's make this change and see if it improves the visibility of the font differences.
2024-05-23 10:01:14 -07:00
Paul Gauthier
6d75a10c8b Changed font weight to "light" for model "Aider" in plot_swe_bench_lite. 2024-05-23 10:01:12 -07:00
Paul Gauthier
d37bace5d7 aider: Make the percentages on top of Aider's bars and the Aider x labels bold.
# Aider chat conversation:

USER: make the percentages on top of aiders bars bold. make the aider x labels bold too

ASSISTANT: To make the percentages on top of Aider's bars bold and the Aider x labels bold, we need to modify the text properties in the plotting section.

Here are the *SEARCH/REPLACE* blocks:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
        if "Aider" in model:
            y = yval + 0.75
            va = "bottom"
        else:
            y = yval - 1.25
            va = "top"

        ax.text(
            bar.get_x() + bar.get_width() / 2,
            y,
            f"{yval}%",
            ha="center",
            va=va,
            fontsize=14,
            # alpha=0.75,
        )
=======
        y = yval + 0.75 if "Aider" in model else yval - 1.25
        va = "bottom" if "Aider" in model else "top"
        fontweight = "bold" if "Aider" in model else "normal"

        ax.text(
            bar.get_x() + bar.get_width() / 2,
            y,
            f"{yval}%",
            ha="center",
            va=va,
            fontsize=14,
            fontweight=fontweight,
        )
>>>>>>> REPLACE
```

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    plt.xticks(
        # rotation=45,
        # ha="center",
        fontsize=16,
        color=font_color,
    )
=======
    xticks = plt.xticks(
        fontsize=16,
        color=font_color,
    )
    for label in xticks[1]:
        if "Aider" in label.get_text():
            label.set_fontweight("bold")
>>>>>>> REPLACE
```
2024-05-23 09:56:04 -07:00
Paul Gauthier
cf1140ba71 Updated colors and alpha values for Aider models in SWE benchmark plot. 2024-05-23 09:56:03 -07:00
Paul Gauthier
484f625b2c copy 2024-05-23 09:49:47 -07:00
Paul Gauthier
ee7dcd5071 aider: Adjust alpha values for bars based on model name containing "Aider".
# Aider chat conversation:

USER: can we use more alpha for the aider bars and less alpha for the other bars?

ASSISTANT: To adjust the alpha values for the bars, we need to modify the section where the bars are created and set their alpha values based on whether the model name contains "Aider".

Here is the *SEARCH/REPLACE* block:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    bars = ax.bar(models, pass_rates, color=colors, alpha=0.75, zorder=3)
=======
    bars = []
    for model, pass_rate, color in zip(models, pass_rates, colors):
        alpha = 0.9 if "Aider" in model else 0.5
        bar = ax.bar(model, pass_rate, color=color, alpha=alpha, zorder=3)
        bars.append(bar[0])
>>>>>>> REPLACE
```
2024-05-23 09:48:53 -07:00
Paul Gauthier
210aeb6133 copy 2024-05-23 09:47:23 -07:00
Paul Gauthier
15669b7ae8 copy 2024-05-23 08:28:42 -07:00
Paul Gauthier
0f92c2bd7e copy 2024-05-23 08:23:14 -07:00
Paul Gauthier
2f3baf7cdd copy 2024-05-23 08:04:37 -07:00
Paul Gauthier
2a881faac8 copy 2024-05-23 07:51:06 -07:00
Paul Gauthier
15c228097b copy 2024-05-23 07:45:54 -07:00
Paul Gauthier
d9594815b0 copy 2024-05-23 07:27:04 -07:00
Paul Gauthier
071e5e273b copy 2024-05-23 06:46:26 -07:00
Paul Gauthier
2852acb947 Updated HISTORY 2024-05-23 06:03:25 -07:00
Paul Gauthier
25d00b3aac copy 2024-05-22 21:31:44 -07:00
Paul Gauthier
a405defcff ask for confirmation 2024-05-22 21:27:59 -07:00
Paul Gauthier
5032adf44b Add the url content 2024-05-22 21:25:34 -07:00
Paul Gauthier
dc8ae596c3 aider: Implement regex to find all URLs in the input string.
# Aider chat conversation:

USER: do the todo

ASSISTANT: To address the TODO comment, we need to implement a regex to find all URLs in the input string. Here are the changes:

1. Import the `re` module for regular expressions.
2. Implement the `check_for_urls` method to find and process URLs in the input string.

Here are the *SEARCH/REPLACE* blocks:

aider/coders/base_coder.py
```python
<<<<<<< SEARCH
import hashlib
=======
import hashlib
import re
>>>>>>> REPLACE
```

aider/coders/base_coder.py
```python
<<<<<<< SEARCH
        # todo use a regex to find all urls in inp

        return inp
=======
        url_pattern = re.compile(
            r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
        )
        urls = url_pattern.findall(inp)
        if urls:
            self.io.tool_output(f"Found URLs: {', '.join(urls)}")
        return inp
>>>>>>> REPLACE
```

This will implement the functionality to find all URLs in the input string using a regular expression.
2024-05-22 21:19:44 -07:00
Paul Gauthier
aca24976a7 Add method to check for URLs in input. 2024-05-22 21:19:43 -07:00
Paul Gauthier
cdf4969c01 copy 2024-05-22 20:24:10 -07:00
Paul Gauthier
8b5451f4ab copy 2024-05-22 20:15:46 -07:00
Paul Gauthier
5e13399f46 copy 2024-05-22 18:26:04 -07:00
Paul Gauthier
18e3f55c4e copy 2024-05-22 17:18:59 -07:00
Paul Gauthier
d437be5bc1 copy 2024-05-22 16:14:22 -07:00
Paul Gauthier
71ddb2799f aider: Aider achieved a state-of-the-art result on the SWE Bench Lite benchmark, with improvements in code editing, linting, and testing processes.
# Aider chat conversation:

USER: proofread this and make corrections. don't change style, but fix problems.

ASSISTANT: _posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Aider scored 26.3%
on the
[SWE Bench Lite benchmark](https://www.swebench.com), achieving a state of the art result.
The current top leaderboard entry is 20.3%
from Amazon Q Developer Agent.
The best result reported elsewhere online seems to be
=======
Aider scored 26.3% on the [SWE Bench Lite benchmark](https://www.swebench.com), achieving a state-of-the-art result. The current top leaderboard entry is 20.3% from Amazon Q Developer Agent. The best result reported elsewhere online seems to be
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Aider achieved this result mainly through its focus on static code analysis,
reliable LLM code editing
and pragmatic workflows for interactive pair programming with AI.
Aider intentionally has quite limited and narrow "agentic behavior":
it doesn't require a highly detailed upfront "spec" from
^C KeyboardInterrupt

USER: don't fix formatting like that. linebreaks are fine. it will render properly.

ASSISTANT: Understood. I will only fix grammatical and spelling errors without changing the formatting.

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Aider scored 26.3%
on the
[SWE Bench Lite benchmark](https://www.swebench.com), achieving a state of the art result.
The current top leaderboard entry is 20.3%
from Amazon Q Developer Agent.
The best result reported elsewhere online seems to be
[22.3% from AutoCodeRover](https://github.com/nus-apr/auto-code-rover).
=======
Aider scored 26.3%
on the
[SWE Bench Lite benchmark](https://www.swebench.com), achieving a state-of-the-art result.
The current top leaderboard entry is 20.3%
from Amazon Q Developer Agent.
The best result reported elsewhere online seems to be
[22.3% from AutoCodeRover](https://github.com/nus-apr/auto-code-rover).
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Aider achieved this result mainly through its focus on static code analysis,
reliable LLM code editing
and pragmatic workflows for interactive pair programming with AI.
Aider intentionally has quite limited and narrow "agentic behavior":
it doesn't require a highly detailed upfront "spec" from the user,
use RAG or vector search, farm out sub-problems to an army of LLMs,
allow the LLM to use tools
or perform  web searches,
=======
Aider achieved this result mainly through its focus on static code analysis,
reliable LLM code editing,
and pragmatic workflows for interactive pair programming with AI.
Aider intentionally has quite limited and narrow "agentic behavior":
it doesn't require a highly detailed upfront "spec" from the user,
use RAG or vector search, farm out sub-problems to an army of LLMs,
allow the LLM to use tools,
or perform web searches,
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
When a user asks aider for a change, they see the edits performed in real-time
and aider may also then offer additional
help like fixing lint or test errors.
In normal use, the user is in full interactive control.
This lets them quickly steer misunderstandings back on course and
=======
When a user asks aider for a change, they see the edits performed in real-time,
and aider may also then offer additional
help like fixing lint or test errors.
In normal use, the user is in full interactive control.
This lets them quickly steer misunderstandings back on course and
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
For the benchmark,
aider was launched in each problem's git repository
with the problem statement
submitted as the opening chat message from "the user".
After that aider runs as normal, with the following modifications:
=======
For the benchmark,
aider was launched in each problem's git repository
with the problem statement
submitted as the opening chat message from "the user."
After that aider runs as normal, with the following modifications:
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
- A simple harness was used to retry the SWE Bench problem if aider produced code which wasn't *plausibly correct*.
=======
- A simple harness was used to retry the SWE Bench problem if aider produced code that wasn't *plausibly correct*.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
- If the solution isn't plausible, the harness launches aider to try again from scratch
=======
- If the solution isn't plausible, the harness launches aider to try again from scratch,
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
- If aider doesn't produce code that lints and tests clean, the user might decide to revert the changes and try again. Maybe with a different LLM this time.
=======
- If aider doesn't produce code that lints and tests clean, the user might decide to revert the changes and try again, maybe with a different LLM this time.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Of course, outside a benchmark setting it's probably
=======
Of course, outside a benchmark setting, it's probably
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Running the entire SWE Bench Lite benchmark using aider with just GPT-4o
achieved a score of 25%.
This was itself a state of the art result, before being surpassed by the main
result being reported here
that used aider with both GPT-4o & Opus.
=======
Running the entire SWE Bench Lite benchmark using aider with just GPT-4o
achieved a score of 25%.
This was itself a state-of-the-art result, before being surpassed by the main
result being reported here
that used aider with both GPT-4o & Opus.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
The table below breaks down the 79 solutions which were ultimately
=======
The table below breaks down the 79 solutions that were ultimately
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
- Aider with GPT-4o immediately found 77% of the valid solutions on the first attempt.
=======
- Aider with GPT-4o immediately found 77% of the valid solutions on the first attempt.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
If we breakdown correct solutions purely by model,
=======
If we break down correct solutions purely by model,
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Aider instead uses a
[repository map](https://aider.chat/2023/10/22/repomap.html)
to help the LLM understand the
layout, code structure and content of a git repo.
The repo map is created from the code's AST and call graph
to provide a compact and powerful summary of the entire code base.
The map is constantly
tailored to show
repo context that is relevant to the current state of the chat conversation.
This is done by performing a graph optimization on the code's call graph.
=======
Aider instead uses a
[repository map](https://aider.chat/2023/10/22/repomap.html)
to help the LLM understand the
layout, code structure, and content of a git repo.
The repo map is created from the code's AST and call graph
to provide a compact and powerful summary of the entire code base.
The map is constantly
tailored to show
repo context that is relevant to the current state of the chat conversation.
This is done by performing a graph optimization on the code's call graph.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
When the user asks for a change to their code, the LLM uses the repo map
to decide which files to edit.
The LLM simply returns a normal text response explaining which files
it needs to edit and why.
Aider notices when the LLM mentions filenames from the repo,
and asks the user if they should be added to the chat.
Adding a file to the chat allows the LLM to see the full contents
of the file and edit it.
=======
When the user asks for a change to their code, the LLM uses the repo map
to decide which files to edit.
The LLM simply returns a normal text response explaining which files
it needs to edit and why.
Aider notices when the LLM mentions filenames from the repo,
and asks the user if they should be added to the chat.
Adding a file to the chat allows the LLM to see the full contents
of the file and edit it.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Aider has  always had a deep focus on ensuring that LLMs can not just write code,
but reliably *edit* code.
Aider a collection of prompting strategies and code editing backends which have
=======
Aider has always had a deep focus on ensuring that LLMs can not just write code,
but reliably *edit* code.
Aider has a collection of prompting strategies and code editing backends which have
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
[Aider lints code]()
after every LLM edit, and offers to automatically fix
=======
[Aider lints code](https://aider.chat/docs/linting/)
after every LLM edit and offers to automatically fix
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Aider includes basic linters built with tree-sitter that support
=======
Aider includes basic linters built with tree-sitter that supports
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
This context increases the ability of the LLM to understand the problem and
=======
This context increases the ability of the LLM to understand the problem and
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
A user working on a python project might do that by launching
=======
A user working on a Python project might do that by launching
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
The repositories that are used in the SWE Bench problems are large open
=======
The repositories that are used in the SWE Bench problems are large, open
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
A repo's test suite can be run in three ways:
=======
A repo's test suite can be run in three ways:
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
2. Run tests after aider has modified the repo.
=======
2. Run tests after aider has modified the repo.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
3. Run the final "acceptance tests" to judge if the coding agent has
=======
3. Run the final "acceptance tests" to judge if the coding agent has
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
For the benchmark, aider is configured with a test command that will run the tests
=======
For the benchmark, aider is configured with a test command that will run the tests
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
When aider runs a test command, it checks for a non-zero exit status.
=======
When aider runs a test command, it checks for a non-zero exit status.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
To be clear, *aider can not run or even see the "acceptance tests"* from the `test_patch`
=======
To be clear, *aider cannot run or even see the "acceptance tests"* from the `test_patch`
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
As aider executes, it notes the outcome of the editing, linting and testing
steps.
When aider completes, it returns their final status as either:
succeeded with no errors remaining,
or ended without resolving all errors.
=======
As aider executes, it notes the outcome of the editing, linting, and testing
steps.
When aider completes, it returns its final status as either:
succeeded with no errors remaining,
or ended without resolving all errors.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
The benchmark harness uses these outcomes to determine if it has a plausible
solution to the current SWE Bench task.
A plausible solution is one where aider
returns saying that it
edited the repo with no outstanding
edit, lint or test errors.
=======
The benchmark harness uses these outcomes to determine if it has a plausible
solution to the current SWE Bench task.
A plausible solution is one where aider
returns saying that it
edited the repo with no outstanding
edit, lint, or test errors.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
If the solution is not plausible, another
instance of aider is launched again from scratch on the same problem.
=======
If the solution is not plausible, another
instance of aider is launched again from scratch on the same problem.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
It's worth noting that repositories may have lint or test errors present before aider even starts to edit them. Whether errors are caused by aider or were pre-existing, there will be instances where, after six tries, no plausible solution is obtained.
=======
It's worth noting that repositories may have lint or test errors present before aider even starts to edit them. Whether errors are caused by aider or were pre-existing, there will be instances where, after six tries, no plausible solution is obtained.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
If all six attempts fail to produce a plausible solution,
then the "best" solution available is selected as a the
=======
If all six attempts fail to produce a plausible solution,
then the "best" solution available is selected as the
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Which of the non-plausible solutions to use is determined
=======
Which of the non-plausible solutions to use is determined
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
The benchmark harness produces one "best" solution for each of the 300
SWE Bench Lite instances, and saves it as a `model_patch`.
A separate evaluation script uses the SWE Bench support code to
test each of these results with the acceptance tests.
=======
The benchmark harness produces one "best" solution for each of the 300
SWE Bench Lite instances and saves it as a `model_patch`.
A separate evaluation script uses the SWE Bench support code to
test each of these results with the acceptance tests.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
These `test_patch` acceptance tests are only ever run outside of aider
and the benchmark harness, and only to compute the number of
correctly resolved instances.
They are never run, used or even visible during the attempts to solve the problems.
=======
These `test_patch` acceptance tests are only ever run outside of aider
and the benchmark harness, and only to compute the number of
correctly resolved instances.
They are never run, used, or even visible during the attempts to solve the problems.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Aider correctly resolved 79 out of 300 SWE Bench Lite instances, or 26.3%.
=======
Aider correctly resolved 79 out of 300 SWE Bench Lite instances, or 26.3%.
>>>>>>> REPLACE
</source>

_posts/2024-05-22-swe-bench-lite.md
<source>markdown
<<<<<<< SEARCH
Much thanks to the team behind the
[SWE Bench](https://www.swebench.com)
family of AI coding benchmarks.
Also thanks to Albert Örwall who has
[dockerized the SWE Bench evaluation scripts](SWE-bench-docker)
making it faster, easier and more reliable to run the acceptance tests.
=======
Much thanks to the team behind the
[SWE Bench](https://www.swebench.com)
family of AI coding benchmarks.
Also thanks to Albert Örwall who has
[dockerized the SWE Bench evaluation scripts](SWE-bench-docker)
making it faster, easier, and more reliable to run the acceptance tests.
>>>>>>> REPLACE
</source>
2024-05-22 16:11:17 -07:00
Paul Gauthier
5b8bbdbe5f copy 2024-05-22 16:08:51 -07:00
Paul Gauthier
b67891e7f8 copy 2024-05-22 15:56:11 -07:00
Paul Gauthier
9f2554fed7 copy 2024-05-22 15:33:11 -07:00
Paul Gauthier
fcfc05e2d3 aider: Make the x and y axis labels also #555.
# Aider chat conversation:

USER: make the x and y axis labels also #555

ASSISTANT: Here are the changes to make the x and y axis labels also #555 in `benchmark/swe_bench_lite.py` and `benchmark/over_time.py`:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    ax.set_ylabel("Pass rate (%)", fontsize=18)
    ax.set_title("SWE Bench Lite", fontsize=20)
    ax.set_ylim(0, 30)
    plt.xticks(rotation=45, ha="right", fontsize=16)
    plt.tight_layout(pad=3.0)
    plt.savefig("swe_bench_lite.jpg")
    plt.savefig("swe_bench_lite.svg")
    imgcat(fig)
=======
    ax.set_ylabel("Pass rate (%)", fontsize=18, color="#555")
    ax.set_title("SWE Bench Lite", fontsize=20)
    ax.set_ylim(0, 30)
    plt.xticks(rotation=45, ha="right", fontsize=16)
    plt.tight_layout(pad=3.0)
    plt.savefig("swe_bench_lite.jpg")
    plt.savefig("swe_bench_lite.svg")
    imgcat(fig)
    ax.xaxis.label.set_color("#555")
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    ax.set_xlabel("Model release date", fontsize=18)
    ax.set_ylabel("Aider code editing benchmark,\npercent completed correctly", fontsize=18)
    ax.set_title("LLM code editing skill by model release date", fontsize=20)
    ax.set_ylim(0, 30)
    plt.xticks(fontsize=14)
    plt.tight_layout(pad=3.0)
    plt.savefig("tmp_over_time.png")
    plt.savefig("tmp_over_time.svg")
    imgcat(fig)
=======
    ax.set_xlabel("Model release date", fontsize=18, color="#555")
    ax.set_ylabel("Aider code editing benchmark,\npercent completed correctly", fontsize=18, color="#555")
    ax.set_title("LLM code editing skill by model release date", fontsize=20)
    ax.set_ylim(0, 30)
    plt.xticks(fontsize=14)
    plt.tight_layout(pad=3.0)
    plt.savefig("tmp_over_time.png")
    plt.savefig("tmp_over_time.svg")
    imgcat(fig)
>>>>>>> REPLACE
```
2024-05-22 15:29:58 -07:00
Paul Gauthier
c3b8caf480 Update text color to "#555" in swe_bench_lite.py. 2024-05-22 15:29:57 -07:00