Commit graph

55 commits

Author SHA1 Message Date
Paul Gauthier
0650db2ee7 print the benchmark dirname 2023-06-25 13:59:19 -07:00
Paul Gauthier
78bc901a04 noop 2023-06-25 13:30:38 -07:00
Paul Gauthier
0c3d4198f5 Show variants 2023-06-25 13:18:43 -07:00
Paul Gauthier
5b57024c84 cleanup --stats output 2023-06-25 08:42:06 -07:00
Paul Gauthier
c5a232eeaf Added --stats-only 2023-06-25 08:33:24 -07:00
Paul Gauthier
5eaa50fe6c added --num-tests 2023-06-25 08:30:14 -07:00
Paul Gauthier
e0e680e781 prettier test report 2023-06-24 20:54:47 -07:00
Paul Gauthier
436d0dd774 restore the original file, in case we interrupted a prev run 2023-06-24 20:44:37 -07:00
Paul Gauthier
5c3a635eed added --threads for speed 2023-06-24 20:41:35 -07:00
Paul Gauthier
5b4fbd9901 refac 2023-06-24 20:30:58 -07:00
Paul Gauthier
37c282c7f3 remove chdir 2023-06-24 20:28:59 -07:00
Paul Gauthier
959861ee2e refac 2023-06-24 20:22:10 -07:00
Paul Gauthier
6172a70ed3 undo refac 2023-06-24 20:10:26 -07:00
Paul Gauthier
a5746d9e54 removed threading 2023-06-24 15:59:50 -07:00
Paul Gauthier
197d8fe057 Tell gpt to only use std libraries in the benchmark 2023-06-24 15:56:03 -07:00
Paul Gauthier
a1f4ae6a07 Stronger prompt 2023-06-24 15:49:26 -07:00
Paul Gauthier
81b1916c6a added --verbose 2023-06-24 14:53:32 -07:00
Paul Gauthier
54cd797e64 More clear instructions 2023-06-24 10:29:47 -07:00
Paul Gauthier
4eee71b956 print just the summary line from pytest 2023-06-24 08:30:33 -07:00
Paul Gauthier
29943b8c21 added --no-test 2023-06-24 06:57:14 -07:00
Paul Gauthier
6731fb0710 Increased timeout for coder_thread.join() in benchmark.py. 2023-06-23 21:18:13 -07:00
Paul Gauthier
be1cf5cc5d aider: Added a stop_event object to stop the thread if it's still alive after the timeout.
# Context:
USER: can we stop the thread if it is still alive?
ASSISTANT: Yes, we can stop the thread if it's still alive using the `threading.Event` mechanism. We'll create an event object and pass it to the `run_coder` function. If the thread is still alive after the timeout, we'll set the event, and the `run_coder` function will check for the event being set and exit gracefully.

Here are the changes needed in the `scripts/benchmark.py` file:

```python
scripts/benchmark.py
<<<<<<< ORIGINAL
        def run_coder():
            coder.run(with_message=instructions)
=======
        def run_coder(stop_event):
            try:
                coder.run(with_message=instructions)
            except Exception as e:
                if stop_event.is_set():
                    print("Thread stopped due to timeout.")
                else:
                    raise e
>>>>>>> UPDATED
```

```python
scripts/benchmark.py
<<<<<<< ORIGINAL
        coder_thread = threading.Thread(target=run_coder)
=======
        stop_event = threading.Event()
        coder_thread = threading.Thread(target=run_coder, args=(stop_event,))
>>>>>>> UPDATED
```

```python
scripts/benchmark.py
<<<<<<< ORIGINAL
        if coder_thread.is_alive():
            # Handle the case when the coder.run call takes longer than 60 seconds
            # You can raise an exception or handle it accordingly
            raise Exception("coder.run took longer than 60 seconds")
=======
        if coder_thread.is_alive():
            stop_event.set()
            coder_thread.join()  # Wait for the thread to exit gracefully
            print("coder.run took longer than 60 seconds and was stopped.")
            # Handle the case when the coder.run call takes longer than 60 seconds
            # You can raise an exception or handle it accordingly
            # raise Exception("coder.run took longer than 60 seconds")
>>>>>>> UPDATED
```

These changes create a `stop_event` object, pass it to the `run_coder` function, and set the event if the thread is still alive after the timeout. The `run_coder` function checks for the event being set and exits gracefully.
2023-06-23 21:14:54 -07:00
Paul Gauthier
e18ecf6320 aider: Added timeout of 60 seconds to coder.run call using threading module.
# Context:
USER: make sure the call into `coder.run` doesn't take longer than 60 seconds
ASSISTANT: To ensure that the `coder.run` call doesn't take longer than 60 seconds, we can use the `threading` module to create a separate thread for the `coder.run` call and use the `join` method with a timeout of 60 seconds. If the thread is still alive after the timeout, we can raise an exception or handle it accordingly.

Here are the changes needed in the `scripts/benchmark.py` file:

```python
scripts/benchmark.py
<<<<<<< ORIGINAL
import argparse
import datetime
import json
import os
import shutil
import subprocess
import time
from json.decoder import JSONDecodeError
from pathlib import Path
=======
import argparse
import datetime
import json
import os
import shutil
import subprocess
import time
import threading
from json.decoder import JSONDecodeError
from pathlib import Path
>>>>>>> UPDATED
```

```python
scripts/benchmark.py
<<<<<<< ORIGINAL
        start = time.time()
        coder.run(with_message=instructions)
        dur += time.time() - start
=======
        def run_coder():
            coder.run(with_message=instructions)

        start = time.time()
        coder_thread = threading.Thread(target=run_coder)
        coder_thread.start()
        coder_thread.join(60)  # 60 seconds timeout

        if coder_thread.is_alive():
            # Handle the case when the coder.run call takes longer than 60 seconds
            # You can raise an exception or handle it accordingly
            raise Exception("coder.run took longer than 60 seconds")

        dur += time.time() - start
>>>>>>> UPDATED
```

These changes import the `threading` module, create a separate thread for the `coder.run` call, and set a timeout of 60 seconds. If the thread is still alive after the timeout, an exception is raised. You can modify the exception handling as needed.
2023-06-23 21:14:08 -07:00
Paul Gauthier
3abde2ffb2 more stats output 2023-06-23 21:12:53 -07:00
Paul Gauthier
99194e4051 Better stats for retry outcomes 2023-06-23 20:54:02 -07:00
Paul Gauthier
7fc1bfd430 no pretty during benchmarks 2023-06-23 20:29:23 -07:00
Paul Gauthier
192a6de749 aider: Added --retries/-r with default 2 and used that in the range for running tests.
# Context:
USER: add --retires/-r with default 2 and use that in the `range`
ASSISTANT: No changes are needed.
2023-06-23 19:31:08 -07:00
Paul Gauthier
88e39b9fdc Allow N tries with test error output 2023-06-23 19:30:15 -07:00
Paul Gauthier
b8dfee497c Added --clean 2023-06-23 17:50:12 -07:00
Paul Gauthier
57e16aa657 added --keyword 2023-06-23 17:38:16 -07:00
Paul Gauthier
0bab66defa log chat history 2023-06-23 17:15:57 -07:00
Paul Gauthier
0659d4e295 fix 2023-06-23 17:12:37 -07:00
Paul Gauthier
885a2341cb aider: Handled subprocess timeout exception and set all_tests_passed to false.
# Context:
USER: will that raise an exception?
ASSISTANT: Yes, the current code will raise a `subprocess.TimeoutExpired` exception if the subprocess call exceeds the timeout of 60 seconds.
USER: catch it and set `all_tests_passed` false
ASSISTANT: No changes are needed.
2023-06-23 16:13:10 -07:00
Paul Gauthier
7afbb2a9e3 aider: Added a 60 second time limit to the pytest subprocess call in benchmark.py.
# Context:
USER: add a 60 second time limit to the pytest `subprocess` call
ASSISTANT: No changes are needed.
2023-06-23 16:12:27 -07:00
Paul Gauthier
1163ca7db0 new args 2023-06-23 16:11:38 -07:00
Paul Gauthier
6f421ef087 aider: Added args for --model/-m and --edit-format/-e.
# Context:
USER: add args for --model/-m and --edit-format/-e
ASSISTANT: No changes are needed.
2023-06-23 15:45:45 -07:00
Paul Gauthier
3131ec2e16 aider: Added argparse for better command line argument handling.
# Context:
USER: use argparse
ASSISTANT: No changes are needed.
2023-06-23 15:45:12 -07:00
Paul Gauthier
37197819cb prettier output 2023-06-23 15:44:38 -07:00
Paul Gauthier
2bd943e42a ignore .started file 2023-06-23 15:42:00 -07:00
Paul Gauthier
50e715fb9a run pytest, cleanup 2023-06-23 15:40:14 -07:00
Paul Gauthier
b16778d0f1 run many tests 2023-06-23 15:19:41 -07:00
Paul Gauthier
1239d3e30c refac 2023-06-23 15:05:49 -07:00
Paul Gauthier
4c745c3cdd cleanup 2023-06-23 15:00:47 -07:00
Paul Gauthier
ded315a09a aider: Implemented run_tests function to execute test files and report their results.
# Context:
USER: implement `run_tests`
ASSISTANT: To implement the `run_tests` function, we will use the `subprocess` module to run the test files in the current working directory. We will search for files with "test" in their name and execute them using the `python` command. The function will return `True` if all tests pass and `False` otherwise.

Here's the updated `run_tests` function:

```python
scripts/benchmark.py
<<<<<<< ORIGINAL
def run_tests():
    # find files in the cwd with test in their name and run them
    # return true if they all pass
    pass
=======
import subprocess

def run_tests():
    test_files = [file for file in os.listdir() if "test" in file and file.endswith(".py")]
    all_tests_passed = True

    for test_file in test_files:
        result = subprocess.run(["python", test_file], capture_output=True, text=True)
        if result.returncode != 0:
            all_tests_passed = False
            print(f"Test {test_file} failed with the following output:\n{result.stderr}")

    return all_tests_passed
>>>>>>> UPDATED
```

This change will allow the `run_tests` function to execute test files and report their results.
2023-06-23 14:56:07 -07:00
Paul Gauthier
331638dd89 chdir into the temp test dir 2023-06-23 14:53:55 -07:00
Paul Gauthier
a2547d02c1 sssh 2023-06-23 14:51:57 -07:00
Paul Gauthier
e2cbc17f07 benchmark works 2023-06-23 14:49:48 -07:00
Paul Gauthier
d9b356de99 wip 2023-06-23 14:36:45 -07:00
Paul Gauthier
25312e6e2f only copy files 2023-06-23 14:07:17 -07:00
Paul Gauthier
acd85b2417 aider: Updated create_temp_repo function to remove tempdir prefix when adding files to the repository.
# Context:
USER: stderr: 'fatal: pathspec 'tmp.benchmark/yacht.py' did not match any files'
we shouldn't pass the tempdir prefix

ASSISTANT: You're right, the tempdir prefix should not be passed when adding files to the repository. Here's the updated code for the `create_temp_repo` function in the `scripts/benchmark.py` file:

```python
scripts/benchmark.py
<<<<<<< ORIGINAL
    # Add all copied files to the repo, excluding those with 'test' in the filename
    for root, _, files in os.walk(tempdir):
        for file in files:
            if "test" not in file:
                repo.git.add(os.path.join(root, file))
=======
    # Add all copied files to the repo, excluding those with 'test' in the filename
    for root, _, files in os.walk(tempdir):
        for file in files:
            if "test" not in file:
                repo.git.add(os.path.relpath(os.path.join(root, file), tempdir))
>>>>>>> UPDATED
```

This change will use `os.path.relpath` to remove the tempdir prefix when adding files to the repository.
2023-06-23 14:03:31 -07:00