aider

mirror of https://github.com/Aider-AI/aider.git synced 2025-05-22 05:14:59 +00:00

Author	SHA1	Message	Date
Paul Gauthier	c8d2649bb8	Update model names to remove parentheses and extra whitespace in benchmark plot.	2024-05-15 11:36:37 -07:00
Paul Gauthier	445c73267a	Updated plot dimensions and axis labels for better visualization in benchmark over time.	2024-05-15 11:29:44 -07:00
Paul Gauthier	77ba518a1a	aider: Made the border of the graph thinner and lighter by adjusting the `spines` properties of the axes. # Aider chat conversation: USER: make the border of the graph thinner and lighter ASSISTANT: To make the border of the graph thinner and lighter, we can adjust the `spines` properties of the axes. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models] ======= for spine in ax.spines.values(): spine.set_edgecolor('#DDDDDD') spine.set_linewidth(0.5) colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models] >>>>>>> REPLACE ```	2024-05-15 11:21:01 -07:00
Paul Gauthier	a018fdfe2c	aider: Make the dots bigger. # Aider chat conversation: USER: make the dots bigger ASSISTANT: To make the dots bigger, we can adjust the `s` parameter in the `scatter` function, which controls the size of the dots. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH ax.scatter(dates, pass_rates, c=colors, alpha=0.5) ======= ax.scatter(dates, pass_rates, c=colors, alpha=0.5, s=100) >>>>>>> REPLACE ```	2024-05-15 11:20:05 -07:00
Paul Gauthier	3f1b86a1cc	aider: Make the axes labels font larger. # Aider chat conversation: USER: I did `git reset --hard HEAD~1` to discard the last edits. Please wait for further instructions before attempting that change again. Feel free to ask relevant questions about why the changes were reverted. ASSISTANT: Understood. If you have any specific instructions or further changes you'd like to make, please let me know! USER: make the axes labels font larger ASSISTANT: To make the axes labels font larger, we can adjust the `fontsize` parameter in the `set_xlabel` and `set_ylabel` methods. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH ax.set_xlabel('Model release date') ax.set_ylabel('Aider code editing benchmark,\npercent completed correctly') ======= ax.set_xlabel('Model release date', fontsize=14) ax.set_ylabel('Aider code editing benchmark,\npercent completed correctly', fontsize=14) >>>>>>> REPLACE ```	2024-05-15 11:14:48 -07:00
Paul Gauthier	05e34783c2	Update plot labels and annotations for better readability and clarity.	2024-05-15 11:13:17 -07:00
Paul Gauthier	75fa061016	aider: Adjust the position of labels to prevent overlap with dots. # Aider chat conversation: USER: make the labels not overlap the dots ASSISTANT: To make the labels not overlap the dots, we can adjust the position of the labels slightly using the `xytext` parameter in the `annotate` function. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75) ======= ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75, xytext=(5, 5), textcoords='offset points') >>>>>>> REPLACE ```	2024-05-15 11:06:32 -07:00
Paul Gauthier	49eb72cc20	Updated image file format from SVG to PNG and added imgcat library for displaying the plot.	2024-05-15 11:06:31 -07:00
Paul Gauthier	8d670f187c	aider: Used the colorscheme and fonts from the graphs in `plots.py` in `over_time.py`. # Aider chat conversation: USER: use the colorscheme and fonts from the graphs in plots.py ASSISTANT: To use the colorscheme and fonts from the graphs in `plots.py`, we need to: 1. Import the necessary settings from `plots.py`. 2. Apply these settings to the graph in `over_time.py`. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH import yaml from datetime import datetime ======= import yaml from datetime import datetime from matplotlib import rc >>>>>>> REPLACE ``` benchmark/over_time.py ```python <<<<<<< SEARCH plt.figure(figsize=(10, 6)) ======= plt.rcParams["hatch.linewidth"] = 0.5 plt.rcParams["hatch.color"] = "#444444" rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10}) fig, ax = plt.subplots(figsize=(6, 4)) ax.grid(axis="y", zorder=0, lw=0.2) >>>>>>> REPLACE ``` benchmark/over_time.py ```python <<<<<<< SEARCH plt.scatter(dates, pass_rates, c=colors, alpha=0.5) ======= ax.scatter(dates, pass_rates, c=colors, alpha=0.5) >>>>>>> REPLACE ``` benchmark/over_time.py ```python <<<<<<< SEARCH plt.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75) plt.xlabel('Release Date') plt.ylabel('Pass Rate 2') plt.title('Model Performance Over Time') plt.grid(True) plt.tight_layout() plt.show() ======= ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75) ax.set_xlabel('Release Date') ax.set_ylabel('Pass Rate 2') ax.set_title('Model Performance Over Time') plt.tight_layout() plt.savefig("tmp_over_time.svg") imgcat(fig) >>>>>>> REPLACE ```	2024-05-15 11:04:56 -07:00
Paul Gauthier	85d9d8430f	aider: Color the "gpt-3.5" models green in the plotting code. # Aider chat conversation: USER: color the gpt-3.5 models a new color ASSISTANT: To color the "gpt-3.5" models a new color, we need to modify the plotting code in `benchmark/over_time.py` to check for "gpt-3.5" in the model name and assign a different color. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH colors = ['red' if 'gpt-4' in model else 'blue' for model in models] ======= colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models] >>>>>>> REPLACE ```	2024-05-15 11:04:14 -07:00
Paul Gauthier	0d11357980	aider: Color the "gpt-4" models differently. # Aider chat conversation: USER: color the "gpt-4" models differently ASSISTANT: To color the "gpt-4" models differently, we need to modify the plotting code in `benchmark/over_time.py` to check for "gpt-4" in the model name and assign a different color. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH plt.scatter(dates, pass_rates, c='blue', alpha=0.5) ======= colors = ['red' if 'gpt-4' in model else 'blue' for model in models] plt.scatter(dates, pass_rates, c=colors, alpha=0.5) >>>>>>> REPLACE ``` This change will color the points red if the model name contains "gpt-4", otherwise, it will color them blue.	2024-05-15 11:03:47 -07:00
Paul Gauthier	af2f4b73e9	aider: Use the `released` field directly in the plot instead of parsing it with `datetime.strptime`. # Aider chat conversation: USER: I ran this command: python benchmark/over_time.py And got this output: Traceback (most recent call last): File "/Users/gauthier/Projects/aider/benchmark/over_time.py", line 33, in <module> plot_over_time('_data/edit_leaderboard.yml') File "/Users/gauthier/Projects/aider/benchmark/over_time.py", line 15, in plot_over_time dates.append(datetime.strptime(entry['released'], '%Y-%m-%d')) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ TypeError: strptime() argument 1 must be str, not datetime.date ASSISTANT: The error indicates that the `released` field in the YAML file is already a `datetime.date` object, so we don't need to parse it with `datetime.strptime`. We can directly use it in the plot. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH dates.append(datetime.strptime(entry['released'], '%Y-%m-%d')) ======= dates.append(entry['released']) >>>>>>> REPLACE ```	2024-05-15 11:03:07 -07:00
Paul Gauthier	49ac9ca357	aider: Added scatter plot functionality for YAML entries with a `released` date and `pass_rate_2`. # Aider chat conversation: USER: in over_time.py, draw a graph for each of the yaml entries with a `released` date. plot date on x, `pass_rate_2` on y and label with the `model` name. do a scatter plot. use colors/styles similar to the other graphs. ASSISTANT: To draw a scatter plot for each of the YAML entries with a `released` date, we need to: 1. Read the YAML file. 2. Extract the `released` date and `pass_rate_2` for each entry. 3. Plot the data using matplotlib. Here are the changes: benchmark/over_time.py ```python <<<<<<< SEARCH ======= import matplotlib.pyplot as plt import yaml from datetime import datetime def plot_over_time(yaml_file): with open(yaml_file, 'r') as file: data = yaml.safe_load(file) dates = [] pass_rates = [] models = [] for entry in data: if 'released' in entry and 'pass_rate_2' in entry: dates.append(datetime.strptime(entry['released'], '%Y-%m-%d')) pass_rates.append(entry['pass_rate_2']) models.append(entry['model']) plt.figure(figsize=(10, 6)) plt.scatter(dates, pass_rates, c='blue', alpha=0.5) for i, model in enumerate(models): plt.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75) plt.xlabel('Release Date') plt.ylabel('Pass Rate 2') plt.title('Model Performance Over Time') plt.grid(True) plt.tight_layout() plt.show() # Example usage plot_over_time('_data/edit_leaderboard.yml') >>>>>>> REPLACE ```	2024-05-15 11:02:44 -07:00
Paul Gauthier	70b1c0c20c	load .env in benchmark.py	2024-05-07 13:32:19 -07:00
Paul Gauthier	ecca737803	added deepseek-chat v2	2024-05-07 06:26:39 -07:00
Paul Gauthier	b1cae73a85	cleaned up csv output	2024-05-07 05:59:31 -07:00
Paul Gauthier	a7b08c7354	format output as yaml	2024-05-06 11:15:19 -07:00
Paul Gauthier	3162d42262	cleanup	2024-05-06 10:46:09 -07:00
Paul Gauthier	5fb7a323ec	refactored plots	2024-05-06 10:44:34 -07:00
Paul Gauthier	3bb237bdc1	handle tasks with exceptions in the stats output	2024-05-05 08:24:45 -07:00
Paul Gauthier	9cdd9e12c3	catch all exceptions in the benchmark	2024-05-04 17:52:46 -07:00
Paul Gauthier	9b88f8caf6	updated gpt-4-0314	2024-05-04 07:59:27 -07:00
Paul Gauthier	f6580fff76	updated all openai models	2024-05-04 07:38:50 -07:00
Paul Gauthier	1981105932	aider: Implemented the TODO to extract the version from aider/__init__.py for each commit hash. # Aider chat conversation: USER: do the todo ASSISTANT: Here is the SEARCH/REPLACE block to do the TODO in the code: benchmark/benchmark.py <source>python <<<<<<< SEARCH for hsh in variants['commit_hash']): # TODO: get the output from `git show {hsh}:aider/__init__.py` ======= for hsh in variants['commit_hash']: try: version = subprocess.check_output( ['git', 'show', f'{hsh}:aider/__init__.py'], universal_newlines=True ) version = re.search(r'__version__ = "(.*)"', version).group(1) csv.append(version) except subprocess.CalledProcessError: csv.append('unknown') >>>>>>> REPLACE </source> This change: 1. Loops through each commit hash in `variants['commit_hash']` 2. Uses `subprocess.check_output` to run the `git show` command and capture the contents of `aider/__init__.py` at that commit 3. Searches the file contents for the `__version__ = "X.Y.Z"` line and extracts the version string 4. Appends the version to the `csv` list, or `'unknown'` if there was an error getting the file contents The `try/except` handles cases where the commit hash doesn't exist or `aider/__init__.py` is missing.	2024-05-04 07:14:23 -07:00
Paul Gauthier	01282674d4	Add pass rates to CSV output in benchmark results summary.	2024-05-04 07:13:40 -07:00
Paul Gauthier	f0b794ce9c	revert docker.sh	2024-05-03 14:46:52 -07:00
Paul Gauthier	26d841ccc7	aider: Add --add-host flag to docker run command to allow container to connect to host machine on macOS using host.docker.internal. # Aider chat conversation: USER: when i run `curl http://127.0.0.1:11434` inside the container, it "Couldn't connect to server". i want it to connect to the server running on the host computer on that port. the host is a mac. ASSISTANT: To allow the Docker container to connect to the host machine on macOS, you can use the special DNS name `host.docker.internal` which resolves to the internal IP address used by the host. This is a feature provided by Docker for Mac to facilitate such connections. Here's the diff to update the `docker.sh` script to include the `--add-host` flag, which adds an entry to the container's `/etc/hosts` file: ```diff --- benchmark/docker.sh +++ benchmark/docker.sh @@ ... @@ docker run \ -it --rm \ -v `pwd`:/aider \ -v `pwd`/tmp.benchmarks/.:/benchmarks \ -e OPENAI_API_KEY=$OPENAI_API_KEY \ -e HISTFILE=/aider/.bash_history \ -e AIDER_DOCKER=1 \ -e AIDER_BENCHMARK_DIR=/benchmarks \ + --add-host=host.docker.internal:host-gateway \ --net=host \ -p 11434:11434 \ aider-benchmark \ bash ``` With this change, when you run `curl http://host.docker.internal:11434` inside the container, it should connect to the server running on the host computer on that port.	2024-05-01 13:54:30 -07:00
Paul Gauthier	71b675d0ef	Add network host mode and change port mapping in docker run script.	2024-05-01 13:54:28 -07:00
Paul Gauthier	5e4ac44008	aider: Add port forwarding for the client inside the container to connect to the Ollama server on the host. # Aider chat conversation: USER: add a port fwd so a client inside the contain can connect to the ollama server running on the host ASSISTANT: To forward a port so that a client inside the container can connect to the Ollama server running on the host, you need to add a `--publish` or `-p` option to the `docker run` command in the `benchmark/docker.sh` script. This option takes the format `host_port:container_port`. Assuming the Ollama server is running on port 12345 on the host, and you want to use the same port inside the container, the diff would look like this: ```diff --- benchmark/docker.sh +++ benchmark/docker.sh @@ ... @@ docker run \ -it --rm \ -v `pwd`:/aider \ -v `pwd`/tmp.benchmarks/.:/benchmarks \ -e OPENAI_API_KEY=$OPENAI_API_KEY \ -e HISTFILE=/aider/.bash_history \ -e AIDER_DOCKER=1 \ -e AIDER_BENCHMARK_DIR=/benchmarks \ + -p 12345:12345 \ aider-benchmark \ bash ``` This change will forward port 12345 from the host to port 12345 in the container, allowing a client inside the container to connect to the Ollama server running on the host machine.	2024-05-01 13:46:45 -07:00
Paul Gauthier	4461c7c4b2	fixed benchmark	2024-04-23 09:44:04 -07:00
Paul Gauthier	fd5b9bbfcb	Added groq llama3	2024-04-22 07:12:01 -07:00
Paul Gauthier	434fa5f6a7	updated benchmark to new Coder & Model classes	2024-04-19 15:21:24 -07:00
Paul Gauthier	7875418183	fix column order	2024-04-09 18:11:08 -07:00
Paul Gauthier	00f1cdb561	Added gpt-4-turbo vision blog post	2024-04-09 16:55:35 -07:00
Paul Gauthier	ac39791fee	fixed mislabelled gpt-4 column	2024-03-09 08:20:27 -08:00
Paul Gauthier	f5887a5098	tweaking graph labels	2024-02-03 08:25:19 -08:00
Paul Gauthier	9033be74bf	Initial benchmark results for 0125	2024-01-25 13:00:16 -08:00
Joshua Vial	93f32d3855	make benchmark listen to openai_api_base env var	2023-12-21 09:38:54 +13:00
Joshua Vial	9e656945fe	Merge remote-tracking branch 'upstream/main' into gpt4-vision	2023-12-21 09:29:32 +13:00
Paul Gauthier	76c1deae6a	improved test for toplevel refactored func	2023-12-19 15:58:24 -08:00
Joshua Vial	d4e663f7bc	benchmark work with openrouter	2023-12-20 10:27:33 +13:00
Paul Gauthier	755b3858eb	copy	2023-12-19 11:11:58 -08:00
Paul Gauthier	ef2a1f3875	diff with move hint	2023-12-18 19:09:32 -08:00
Paul Gauthier	308007a8e9	laziness24-turbo-udiff-never2x	2023-12-18 18:43:15 -08:00
Paul Gauthier	e3c8fac604	copy	2023-12-18 10:20:40 -08:00
Paul Gauthier	b0c03820e9	copy	2023-12-18 10:19:38 -08:00
Paul Gauthier	16534e914b	better graph	2023-12-18 10:02:52 -08:00
Paul Gauthier	6ab2db192c	Added udiff graph	2023-12-18 09:53:28 -08:00
Paul Gauthier	0d4b6b6885	added missing files	2023-12-17 13:00:32 -08:00
Paul Gauthier	7113a30271	unified diffs	2023-12-17 12:54:34 -08:00

531 commits
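
Note on commit 1981105932: the version-extraction step described in that message (run `git show {hsh}:aider/__init__.py`, then search for the `__version__ = "X.Y.Z"` line) can be sketched as below. This is a minimal illustration, not the code in `benchmark/benchmark.py`. The helper names `parse_version` and `version_at_commit` are hypothetical, and the fallback to `'unknown'` when the regex finds no match is an added assumption (the commit message only mentions handling a failed `git show`).

```python
import re
import subprocess

# Same pattern quoted in the commit message.
VERSION_RE = re.compile(r'__version__ = "(.*)"')


def parse_version(init_py_text):
    """Return the version string found in the file text, or 'unknown'.

    Hypothetical helper: falling back when the pattern is absent is an
    assumption, not something the commit message describes.
    """
    match = VERSION_RE.search(init_py_text)
    return match.group(1) if match else "unknown"


def version_at_commit(commit_hash, path="aider/__init__.py"):
    """Read `path` as it existed at `commit_hash` and extract the version."""
    try:
        text = subprocess.check_output(
            ["git", "show", f"{commit_hash}:{path}"],
            universal_newlines=True,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Unknown commit, missing file, or git not installed.
        return "unknown"
    return parse_version(text)
```

Splitting the parsing from the `git show` call keeps the regex logic checkable without a repository, for example `parse_version('__version__ = "1.2.3"')` on a literal string.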