Commit graph

2214 commits

Author SHA1 Message Date
Paul Gauthier
ea83df1ea3 remove long ==--- lines to save tokens 2023-06-28 15:44:23 -07:00
Paul Gauthier
51638d0989 aider: Added test for cleanup_test_output function in test_benchmark.py.
# Context:
USER: make a test for `cleanup_test_output`
ASSISTANT: To create a test for the `cleanup_test_output` function, we need to create a new test file in the `benchmark/tests` directory. Let's call it `test_benchmark_utils.py`.

In this new file, we will import the `cleanup_test_output` function from `benchmark.py` and write a test function using the `unittest` framework.

Here is the content for the new test file:

```python
benchmark/tests/test_benchmark_utils.py
<<<<<<< ORIGINAL

=======
import unittest
from benchmark.benchmark import cleanup_test_output

class TestCleanupTestOutput(unittest.TestCase):
    def test_cleanup_test_output(self):
        # Test case with timing info
        output = "Ran 5 tests in 0.003s\nOK"
        expected = "\nOK"
        self.assertEqual(cleanup_test_output(output), expected)

        # Test case without timing info
        output = "OK"
        expected = "OK"
        self.assertEqual
^C KeyboardInterrupt
USER: no in test_benchmark.py
ASSISTANT: To add a test for the `cleanup_test_output` function in the `test_benchmark.py` file, we need to import the `cleanup_test_output` function from `benchmark.py` and write a test function using the `unittest` framework.

Here is the content for the updated test file:

```python
benchmark/tests/test_benchmark.py
<<<<<<< ORIGINAL

=======
import unittest
from benchmark.benchmark import cleanup_test_output

class TestCleanupTestOutput(unittest.TestCase):
    def test_cleanup_test_output(self):
        # Test case with timing info
        output = "Ran 5 tests in 0.003s\nOK"
        expected = "\nOK"
        self.assertEqual(cleanup_test_output(output), expected)

        # Test case without timing info
        output = "OK"
        expected = "OK"
        self.assertEqual(cleanup_test_output(output), expected)
>>>>>>> UPDATED
```

Please note that you need to add the above code to the existing `test_benchmark.py` file. The `ORIGINAL` section is empty because it's a new addition to the file.
2023-06-28 15:37:41 -07:00
Paul Gauthier
ac72e4c426 Added new files to the git repo: benchmark/tests/test_benchmark.py 2023-06-28 15:36:47 -07:00
Paul Gauthier
a0844839ca wip 2023-06-28 15:36:34 -07:00
Paul Gauthier
dbdccc42a2 prompt 2023-06-28 15:29:59 -07:00
Paul Gauthier
6c87d5be1c better 2023-06-28 13:52:43 -07:00
Paul Gauthier
2691c80ac2 aider: Change original_lines and updated_lines properties in EditBlockFunctionCoder class to be an array of strings instead of a single string.
# Context:
USER: change the `functions` jsonschema so that `original_lines` is a list of strings and so is `update_files`
ASSISTANT: To make the requested changes, we need to modify the `functions` JSON schema in the `EditBlockFunctionCoder` class in the `aider/coders/editblock_func_coder.py` file.

Here is the change:

```python
aider/coders/editblock_func_coder.py
<<<<<<< ORIGINAL
                            properties=dict(
                                path=dict(
                                    type="string",
                                    description="Path of file to edit",
                                ),
                                original_lines=dict(
                                    type="string",
                                    description=(
                                        "Lines from the original file, including all"
                                        " whitespace, newlines, without skipping any lines"
                                    ),
                                ),
                                updated_lines=dict(
                                    type="string",
                                    description="New content to replace the `original_lines` with",
                                ),
                            ),
=======
                            properties=dict(
                                path=dict(
                                    type="string",
                                    description="Path of file to edit",
                                ),
                                original_lines=dict(
                                    type="array",
                                    items=dict(
                                        type="string",
                                    ),
                                    description=(
                                        "Lines from the original file, including all"
                                        " whitespace, newlines, without skipping any lines"
                                    ),
                                ),
                                updated_lines=dict(
                                    type="array",
                                    items=dict(
                                        type="string",
                                    ),
                                    description="New content to replace the `original_lines` with",
                                ),
                            ),
>>>>>>> UPDATED
```

This change modifies the `original_lines` and `updated_lines` properties to be an array of strings instead of a single string. This allows for multiple lines to be specified in the `original_lines` and `updated_lines` fields.
2023-06-28 11:03:59 -07:00
Paul Gauthier
67114d1a80 Fail if docker build fails 2023-06-28 11:02:47 -07:00
Paul Gauthier
67c27fd9ef Changed --stats-only -> --stats 2023-06-28 10:56:37 -07:00
Paul Gauthier
777f179c6d better 2023-06-28 10:54:58 -07:00
Paul Gauthier
1ac366aa62 works 2023-06-27 19:38:14 -07:00
Paul Gauthier
f22ccf6195 fixed json schema 2023-06-27 19:07:45 -07:00
Paul Gauthier
2358b638b8 wip 2023-06-27 18:59:15 -07:00
Paul Gauthier
582143d563 wip 2023-06-27 18:34:52 -07:00
Paul Gauthier
e24d03060d rename --retries to --tries 2023-06-27 17:45:57 -07:00
Paul Gauthier
11dddad4c1 zip the hashes 2023-06-27 17:42:52 -07:00
Paul Gauthier
10049d8695 remove timing info, to avoid randomizing the response to GPT 2023-06-27 17:15:05 -07:00
Paul Gauthier
9cfebc0b60 record response hashes too 2023-06-27 16:11:05 -07:00
Paul Gauthier
974d96973c move the empty list into init 2023-06-27 15:00:49 -07:00
Paul Gauthier
64e16c8ad1 aider: Added SHA1 hash of kwargs to chat_completion_call_hashes in send_with_retries method.
# Context:
USER: when calling `ChatCompletion`, take a sha1 hash of dict with all the args and append it to a list self.chat_completion_call_hashes
ASSISTANT: To implement this change, we need to modify the `send_with_retries` method in the `Coder` class in the `aider/coders/base_coder.py` file. We will import the `hashlib` module to generate the SHA1 hash. We will also add a new attribute `chat_completion_call_hashes` to the `Coder` class to store the hashes.

Here are the changes:

```python
aider/coders/base_coder.py
<<<<<<< ORIGINAL
import json
import os
import sys
import traceback
from json.decoder import JSONDecodeError
from pathlib import Path
=======
import hashlib
import json
import os
import sys
import traceback
from json.decoder import JSONDecodeError
from pathlib import Path
>>>>>>> UPDATED
```

```python
aider/coders/base_coder.py
<<<<<<< ORIGINAL
class Coder:
    abs_fnames = None
    repo = None
    last_aider_commit_hash = None
    last_asked_for_commit_time = 0
    repo_map = None
    functions = None
    total_cost = 0.0
=======
class Coder:
    abs_fnames = None
    repo = None
    last_aider_commit_hash = None
    last_asked_for_commit_time = 0
    repo_map = None
    functions = None
    total_cost = 0.0
    chat_completion_call_hashes = []
>>>>>>> UPDATED
```

```python
aider/coders/base_coder.py
<<<<<<< ORIGINAL
    def send_with_retries(self, model, messages, functions):
        kwargs = dict(
            model=model,
            messages=messages,
            temperature=0,
            stream=self.stream,
        )
        if functions is not None:
            kwargs["functions"] = self.functions

        res = openai.ChatCompletion.create(**kwargs)
        return res
=======
    def send_with_retries(self, model, messages, functions):
        kwargs = dict(
            model=model,
            messages=messages,
            temperature=0,
            stream=self.stream,
        )
        if functions is not None:
            kwargs["functions"] = self.functions

        # Generate SHA1 hash of kwargs and append it to chat_completion_call_hashes
        hash_object = hashlib.sha1(json.dumps(kwargs, sort_keys=True).encode())
        self.chat_completion_call_hashes.append(hash_object.hexdigest())

        res = openai.ChatCompletion.create(**kwargs)
        return res
>>>>>>> UPDATED
```
2023-06-27 14:59:58 -07:00
Paul Gauthier
8999c3fd97 move the progress bar to the bottom 2023-06-27 14:58:15 -07:00
Paul Gauthier
9556313819 Added functionality to use pre-existing benchmark results if they exist. 2023-06-27 14:27:07 -07:00
Paul Gauthier
70050ca004 improved logic for creating benchmark dirs 2023-06-27 08:02:37 -07:00
Paul Gauthier
29d60b9ddb check for dirname on --clean 2023-06-27 07:52:48 -07:00
Paul Gauthier
7736289309 test error prompt 2023-06-27 07:49:47 -07:00
Paul Gauthier
39434289f5 updated docker image name 2023-06-26 20:52:02 -07:00
Paul Gauthier
a109ab1890 aider: Added check for existing Docker image before building.
# Context:
USER: in `build_docker` check if the image is already built and return
ASSISTANT: To check if the Docker image is already built and return if it exists, we can use the `docker images` command and parse its output. Here's how to modify the `build_docker` function:

```python
benchmark/benchmark.py
<<<<<<< ORIGINAL
def build_docker():
    command = [
        "docker",
        "build",
        "--quiet",
        "-t",
        "benchmark",
        "-f",
        "benchmark/Dockerfile",
        "/dev/null",
    ]
    print(" ".join(command))

    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        res = result.stdout
        print(res)

    except subprocess.CalledProcessError as e:
        res = f"Failed to build Docker image: {e.output}"

    return res
=======
def build_docker():
    image_name = "benchmark"
    check_command = ["docker", "images", "-q", image_name]
    check_result = subprocess.run(check_command, stdout=subprocess.PIPE, text=True)
    if check_result.stdout.strip():
        print(f"Docker image '{image_name}' already exists, skipping build.")
        return

    command = [
        "docker",
        "build",
        "--quiet",
        "-t",
        image_name,
        "-f",
        "benchmark/Dockerfile",
        "/dev/null",
    ]
    print(" ".join(command))

    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        res = result.stdout
        print(res)

    except subprocess.CalledProcessError as e:
        res = f"Failed to build Docker image: {e.output}"

    return res
>>>>>>> UPDATED
```

This modification adds a check for the existence of the Docker image before attempting to build it. If the image already exists, the function prints a message and returns early.
2023-06-26 20:48:55 -07:00
Paul Gauthier
5e2681ec9d call build_docker 2023-06-26 20:47:49 -07:00
Paul Gauthier
898d98fdb3 aider: Added a build_docker() function that builds a docker image called benchmark from the benchmark/Dockerfile with no other fs context.
# Context:
USER: add a build_docker() func that builds a docker image called `benchmark` from the benchmark/Dockerfile with no other fs context
2023-06-26 20:41:38 -07:00
Paul Gauthier
095f47593d run unit tests inside docker for safety 2023-06-26 20:37:55 -07:00
Paul Gauthier
d13ff918de aider: Refactor unit tests to run using docker python3.8 and mount only the file being run.
# Context:
USER: run the unit tests using docker python3.8 mounting just the file being run
2023-06-26 20:18:03 -07:00
Paul Gauthier
4d3c77332f ssh typer 2023-06-26 20:15:42 -07:00
Paul Gauthier
dd30cb34ed refac 2023-06-26 20:11:41 -07:00
Paul Gauthier
5c8546c140 sssh typer 2023-06-26 16:48:15 -07:00
Paul Gauthier
acc6be9e95 switched back to ``` as default fence 2023-06-26 16:42:29 -07:00
Paul Gauthier
307f3a4da4 Just HH-MM 2023-06-26 16:36:50 -07:00
Paul Gauthier
23f96a28a0 use <source> fence by default 2023-06-26 16:34:46 -07:00
Paul Gauthier
9e91a392c4 aider: Switched the script to use typer and returned a file listing.
# Context:
USER: switch this script to typer
USER: return a file listing
2023-06-26 16:33:55 -07:00
Paul Gauthier
3c557a00af refac 2023-06-26 16:28:33 -07:00
Paul Gauthier
38f7b36bb1 improved error/ask reporting 2023-06-26 11:11:12 -07:00
Paul Gauthier
651b1293c3 tweaks 2023-06-26 10:51:06 -07:00
Paul Gauthier
2c7960ae26 prefix date/time on benchmark dirs 2023-06-26 10:46:30 -07:00
Paul Gauthier
1370da14fb track and report errors/asks during benchmarking 2023-06-26 10:33:16 -07:00
Paul Gauthier
cbfda391bb version bump to 0.7.2 2023-06-26 10:24:14 -07:00
Paul Gauthier
079682b29e Omit the "No changes are needed" redaction from commit messages with wholefile coders 2023-06-26 10:21:27 -07:00
Paul Gauthier
33cc72a0df ssh 2023-06-25 21:40:08 -07:00
Paul Gauthier
819b8b2672 tolerate missing final newline 2023-06-25 21:27:18 -07:00
Paul Gauthier
dc535a2aa0 aider: Updated summarize_results function to display remaining time in MM:SS format.
# Context:
USER: show MM:SS remaining
ASSISTANT: To show the remaining time in MM:SS format, you need to modify the `summarize_results` function in the `scripts/benchmark.py` file. Update the `remaining` variable calculation and its print statement.

```python
scripts/benchmark.py
<<<<<<< ORIGINAL
    avg_duration = duration / completed_tests
    remaining = (total_tests - completed_tests) * avg_duration

    console.print(f"duration: {avg_duration:.1f} sec/test-case, {remaining:.0f} sec remaining")
=======
    avg_duration = duration / completed_tests
    remaining_seconds = (total_tests - completed_tests) * avg_duration
    remaining_minutes, remaining_seconds = divmod(remaining_seconds, 60)

    console.print(f"duration: {avg_duration:.1f} sec/test-case, {remaining_minutes:.0f}:{remaining_seconds:02.0f} remaining")
>>>>>>> UPDATED
```

This change will display the remaining time in MM:SS format.
2023-06-25 21:22:55 -07:00
Paul Gauthier
f836a4b760 show remaining 2023-06-25 20:59:03 -07:00
Paul Gauthier
58690da8b4 fences are obfuscated so aider can modify itself 2023-06-25 20:55:10 -07:00