Commit graph

269 commits

Author SHA1 Message Date
Paul Gauthier
ea83df1ea3 remove long ==--- lines to save tokens 2023-06-28 15:44:23 -07:00
Paul Gauthier
51638d0989 aider: Added test for cleanup_test_output function in test_benchmark.py.
# Context:
USER: make a test for `cleanup_test_output`
ASSISTANT: To create a test for the `cleanup_test_output` function, we need to create a new test file in the `benchmark/tests` directory. Let's call it `test_benchmark_utils.py`.

In this new file, we will import the `cleanup_test_output` function from `benchmark.py` and write a test function using the `unittest` framework.

Here is the content for the new test file:

```python
benchmark/tests/test_benchmark_utils.py
<<<<<<< ORIGINAL

=======
import unittest
from benchmark.benchmark import cleanup_test_output

class TestCleanupTestOutput(unittest.TestCase):
    def test_cleanup_test_output(self):
        # Test case with timing info
        output = "Ran 5 tests in 0.003s\nOK"
        expected = "\nOK"
        self.assertEqual(cleanup_test_output(output), expected)

        # Test case without timing info
        output = "OK"
        expected = "OK"
        self.assertEqual
^C KeyboardInterrupt
USER: no in test_benchmark.py
ASSISTANT: To add a test for the `cleanup_test_output` function in the `test_benchmark.py` file, we need to import the `cleanup_test_output` function from `benchmark.py` and write a test function using the `unittest` framework.

Here is the content for the updated test file:

```python
benchmark/tests/test_benchmark.py
<<<<<<< ORIGINAL

=======
import unittest
from benchmark.benchmark import cleanup_test_output

class TestCleanupTestOutput(unittest.TestCase):
    def test_cleanup_test_output(self):
        # Test case with timing info
        output = "Ran 5 tests in 0.003s\nOK"
        expected = "\nOK"
        self.assertEqual(cleanup_test_output(output), expected)

        # Test case without timing info
        output = "OK"
        expected = "OK"
        self.assertEqual(cleanup_test_output(output), expected)
>>>>>>> UPDATED
```

Please note that you need to add the above code to the existing `test_benchmark.py` file. The `ORIGINAL` section is empty because it's a new addition to the file.
2023-06-28 15:37:41 -07:00
Paul Gauthier
ac72e4c426 Added new files to the git repo: benchmark/tests/test_benchmark.py 2023-06-28 15:36:47 -07:00
Paul Gauthier
a0844839ca wip 2023-06-28 15:36:34 -07:00
Paul Gauthier
67114d1a80 Fail if docker build fails 2023-06-28 11:02:47 -07:00
Paul Gauthier
67c27fd9ef Changed --stats-only -> --stats 2023-06-28 10:56:37 -07:00
Paul Gauthier
e24d03060d rename --retries to --tries 2023-06-27 17:45:57 -07:00
Paul Gauthier
11dddad4c1 zip the hashes 2023-06-27 17:42:52 -07:00
Paul Gauthier
10049d8695 remove timing info, to avoid randomizing the response to GPT 2023-06-27 17:15:05 -07:00
Paul Gauthier
9cfebc0b60 record response hashes too 2023-06-27 16:11:05 -07:00
Paul Gauthier
9556313819 Added functionality to use pre-existing benchmark results if they exist. 2023-06-27 14:27:07 -07:00
Paul Gauthier
70050ca004 improved logic for creating benchmark dirs 2023-06-27 08:02:37 -07:00
Paul Gauthier
29d60b9ddb check for dirname on --clean 2023-06-27 07:52:48 -07:00
Paul Gauthier
7736289309 test error prompt 2023-06-27 07:49:47 -07:00
Paul Gauthier
39434289f5 updated docker image name 2023-06-26 20:52:02 -07:00
Paul Gauthier
a109ab1890 aider: Added check for existing Docker image before building.
# Context:
USER: in `build_docker` check if the image is already built and return
ASSISTANT: To check if the Docker image is already built and return if it exists, we can use the `docker images` command and parse its output. Here's how to modify the `build_docker` function:

```python
benchmark/benchmark.py
<<<<<<< ORIGINAL
def build_docker():
    command = [
        "docker",
        "build",
        "--quiet",
        "-t",
        "benchmark",
        "-f",
        "benchmark/Dockerfile",
        "/dev/null",
    ]
    print(" ".join(command))

    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        res = result.stdout
        print(res)

    except subprocess.CalledProcessError as e:
        res = f"Failed to build Docker image: {e.output}"

    return res
=======
def build_docker():
    image_name = "benchmark"
    check_command = ["docker", "images", "-q", image_name]
    check_result = subprocess.run(check_command, stdout=subprocess.PIPE, text=True)
    if check_result.stdout.strip():
        print(f"Docker image '{image_name}' already exists, skipping build.")
        return

    command = [
        "docker",
        "build",
        "--quiet",
        "-t",
        image_name,
        "-f",
        "benchmark/Dockerfile",
        "/dev/null",
    ]
    print(" ".join(command))

    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        res = result.stdout
        print(res)

    except subprocess.CalledProcessError as e:
        res = f"Failed to build Docker image: {e.output}"

    return res
>>>>>>> UPDATED
```

This modification adds a check for the existence of the Docker image before attempting to build it. If the image already exists, the function prints a message and returns early.
2023-06-26 20:48:55 -07:00
Paul Gauthier
5e2681ec9d call build_docker 2023-06-26 20:47:49 -07:00
Paul Gauthier
898d98fdb3 aider: Added a build_docker() function that builds a docker image called benchmark from the benchmark/Dockerfile with no other fs context.
# Context:
USER: add a build_docker() func that builds a docker image called `benchmark` from the benchmark/Dockerfile with no other fs context
2023-06-26 20:41:38 -07:00
Paul Gauthier
095f47593d run unit tests inside docker for safety 2023-06-26 20:37:55 -07:00