aider

mirror of https://github.com/Aider-AI/aider.git synced 2025-06-30 08:20:44 +00:00

Author	SHA1	Message	Date
Paul Gauthier	9351f438ac	aider: Add a cmd line switch --graphs and only draw them if it's provided. # Aider chat conversation: USER: add a cmd line switch --graphs and only draw them if its provided ASSISTANT: To add a command-line switch `--graphs` and conditionally generate graphs based on its presence, we need to modify the `benchmark.py` file. Specifically, we will add a new option to the `main` function to accept the `--graphs` switch and then use this option to control the calls to `plot_timing` and `plot_outcomes` functions. Here are the steps we will follow: 1. Add the `--graphs` option to the `typer` command-line interface in the `main` function. 2. Use the value of this option to conditionally call the `plot_timing` and `plot_outcomes` functions. Let's make these changes: <source>python benchmark/benchmark.py <<<<<<< SEARCH @app.command() def main( dirnames: List[str] = typer.Argument(..., help="Directory names"), ======= @app.command() def main( dirnames: List[str] = typer.Argument(..., help="Directory names"), graphs: bool = typer.Option(False, "--graphs", help="Generate graphs"), >>>>>>> REPLACE ... <<<<<<< SEARCH plot_timing(df) plot_outcomes(df, repeats, repeat_hi, repeat_lo, repeat_avg) ======= if graphs: plot_timing(df) plot_outcomes(df, repeats, repeat_hi, repeat_lo, repeat_avg) >>>>>>> REPLACE </source> With these changes, the `--graphs` switch will be available when running the `benchmark.py` script, and the graph generation will only occur if this switch is provided.	2023-11-08 14:13:59 -08:00
Paul Gauthier	ed0dc95a10	copy	2023-11-08 11:16:56 -08:00
Paul Gauthier	f2a3e489e1	labels	2023-11-08 10:26:22 -08:00
Paul Gauthier	02b604adc3	Add labels to the legend in the speed graph	2023-11-08 10:20:28 -08:00
Paul Gauthier	867a0cc33f	aider: Add labels to the bars in the `plot_timing` function to display them correctly in the legend. # Aider chat conversation: ASSISTANT: To address the issue with the empty legend for the edit format in the `plot_timing` function, we need to add labels to the bars so that the legend can display them correctly. We will modify the `benchmark/benchmark.py` file to include labels for each edit format. Here's the change that needs to be made: <source>python benchmark/benchmark.py <<<<<<< SEARCH for i, fmt in enumerate(formats): edge = dict(edgecolor="#ffffff", linewidth=1.5) color = "#b3e6a8" if "diff" in fmt else "#b3d1e6" hatch = "////" if "func" in fmt else "" rects = ax.bar( pos + i * width, grouped[fmt], width * 0.95, color=color, hatch=hatch, zorder=zorder + 1, **edge, ) ======= # Create a mapping for the labels and colors to ensure consistency format_labels = { 'diff': 'Diff', 'func': 'Function', 'whole': 'Whole', 'diff-func': 'Diff-Function', 'diff-whole': 'Diff-Whole', 'func-whole': 'Function-Whole', # Add other edit formats here if necessary } format_colors = { 'diff': '#b3e6a8', 'func': '#b3d1e6', 'whole': '#e6b3b3', 'diff-func': '#e6d1b3', 'diff-whole': '#b3e6d1', 'func-whole': '#d1b3e6', # Add other edit format colors here if necessary } format_hatches = { 'diff': '////', 'func': '....', 'whole': '', 'diff-func': '\\\\\\\\', 'diff-whole': 'xxxx', 'func-whole': '----', # Add other edit format hatches here if necessary } for i, fmt in enumerate(formats): label = format_labels.get(fmt, fmt) color = format_colors.get(fmt, '#b3d1e6') hatch = format_hatches.get(fmt, '') edge = dict(edgecolor="#ffffff", linewidth=1.5) rects = ax.bar( pos + i * width, grouped[fmt], width * 0.95, label=label, color=color, hatch=hatch, zorder=zorder + 1, **edge, ) >>>>>>> REPLACE </source> This change adds a mapping for labels, colors, and hatches for each edit format. 
It also ensures that each bar in the bar chart has a label associated with it, which will be used to generate the legend. The colors and hatches are placeholders and should be adjusted to match the desired styling.	2023-11-08 10:11:47 -08:00
Paul Gauthier	eed4864af1	Changed the label of the y-axis in the plot to "Average GPT response time per exercise (sec)".	2023-11-08 10:11:46 -08:00
Paul Gauthier	207bbbc4df	aider: Implemented the `plot_timing` function to create a bar chart showing the average duration of each (model, edit_format) combination, with consistent styling as `plot_outcomes`. # Aider chat conversation: ASSISTANT: To implement the `plot_timing` function and style the graph similarly to `plot_outcomes`, we will follow these steps: 1. Define the `plot_timing` function to create a bar chart that shows the average duration for each combination of model and edit format. 2. Use similar styling options as used in `plot_outcomes` for consistency. Let's start by implementing the `plot_timing` function in the `benchmark/benchmark.py` file. <source>python benchmark/benchmark.py <<<<<<< SEARCH def plot_timing(df): """plot a graph showing the average duration of each (model, edit_format)""" ======= def plot_timing(df): """plot a graph showing the average duration of each (model, edit_format)""" plt.rcParams["hatch.linewidth"] = 0.5 plt.rcParams["hatch.color"] = "#444444" from matplotlib import rc rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10}) fig, ax = plt.subplots(figsize=(6, 4)) ax.grid(axis="y", zorder=0, lw=0.2) zorder = 1 grouped = df.groupby(["model", "edit_format"])["avg_duration"].mean().unstack() num_models, num_formats = grouped.shape pos = np.array(range(num_models)) width = 0.8 / num_formats formats = grouped.columns models = grouped.index for i, fmt in enumerate(formats): edge = dict(edgecolor="#ffffff", linewidth=1.5) color = "#b3e6a8" if "diff" in fmt else "#b3d1e6" hatch = "////" if "func" in fmt else "" rects = ax.bar( pos + i * width, grouped[fmt], width * 0.95, color=color, hatch=hatch, zorder=zorder + 1, **edge, ) ax.bar_label(rects, padding=4, labels=[f"{v:.1f}s" for v in grouped[fmt]], size=6) ax.set_xticks([p + 0.5 * width for p in pos]) ax.set_xticklabels(models) ax.set_ylabel("Average duration (seconds)") ax.set_title("GPT Code Editing - Timing") ax.legend( title="Edit Format", loc="upper left", 
) ax.set_ylim(top=max(grouped.max()) * 1.1) # Set y-axis limit to 10% more than the max value plt.tight_layout() plt.savefig("tmp_timing.svg") imgcat(fig) >>>>>>> REPLACE </source> This code block adds the implementation of the `plot_timing` function, which creates a bar chart to visualize the average duration of tests for each model and edit format. The styling is kept consistent with the `plot_outcomes` function, using similar colors, hatches, and font settings. The resulting graph is saved as an SVG file and displayed in the terminal using `imgcat`.	2023-11-08 10:07:29 -08:00
Paul Gauthier	ab55b6ff0a	Copy ORIGINAL_DNAME to dirname.	2023-11-08 10:07:27 -08:00
Paul Gauthier	59fed25cd1	copy	2023-11-08 08:36:48 -08:00
Paul Gauthier	2c83287c46	refac	2023-11-08 06:21:16 -08:00
Paul Gauthier	5da64a6abc	copy	2023-11-07 07:22:26 -08:00
Paul Gauthier	dfb921c509	copy	2023-11-06 20:24:44 -08:00
Paul Gauthier	15a1c754a4	copy	2023-11-06 18:58:24 -08:00
Paul Gauthier	35754247b4	copy	2023-11-06 18:48:14 -08:00
Paul Gauthier	976fc7a836	update benchmarking script	2023-11-06 18:26:02 -08:00
Joshua Vial	91b7648f33	update model creation method	2023-09-09 12:06:34 +12:00
Paul Gauthier	2c53c153b8	Updated ORIGINAL prompting	2023-08-09 11:38:39 -03:00
Paul Gauthier	0aa2ff2cf9	roughed in cache; elide full dirnames from test results to make them deterministic	2023-08-09 11:30:13 -03:00
Paul Gauthier	2eeee67ee5	support --keywords	2023-08-09 11:12:27 -03:00
Paul Gauthier	c43d7e9a69	initial --diffs implementation	2023-08-08 17:40:32 -03:00
Paul Gauthier	568be3812b	Fix last_keyboard_interrupt in benchmark	2023-07-20 13:25:15 -03:00
Paul Gauthier	edbec5498e	update benchmark driver	2023-07-13 20:02:58 -07:00
Paul Gauthier	44311f4737	preparing for v0.8.0	2023-07-06 14:03:06 -07:00
Paul Gauthier	ba2c5d05da	new svg	2023-07-02 10:45:14 -07:00
Paul Gauthier	4ea70bfdc0	new svg	2023-07-02 10:35:11 -07:00
Paul Gauthier	5e4ad5bf00	new svg	2023-07-02 10:28:39 -07:00
Paul Gauthier	8f4a823855	new svg	2023-07-02 10:26:18 -07:00
Paul Gauthier	2915acb9a6	new svg	2023-07-02 10:23:06 -07:00
Paul Gauthier	86ced88f8e	new svg	2023-07-02 10:18:58 -07:00
Paul Gauthier	db18876db6	plot the repeat error bars	2023-07-02 07:10:54 -07:00
Paul Gauthier	a4ab262d96	new svg	2023-07-01 21:35:38 -07:00
Paul Gauthier	f548f88220	new svg	2023-07-01 21:31:54 -07:00
Paul Gauthier	b9f8ed47f4	copy	2023-07-01 17:50:48 -07:00
Paul Gauthier	8ef166478a	graph layout	2023-07-01 17:18:14 -07:00
Paul Gauthier	3f0c752368	show 1st & 2nd coding attempts	2023-07-01 16:40:25 -07:00
Paul Gauthier	f27a869649	new svg	2023-07-01 16:23:09 -07:00
Paul Gauthier	bddc5a36f0	cleanup	2023-07-01 13:29:52 -07:00
Paul Gauthier	34b4ab199d	show try 1 and 2	2023-07-01 13:28:36 -07:00
Paul Gauthier	6471b89033	return 0	2023-07-01 06:50:13 -07:00
Paul Gauthier	92e5f43abe	intro	2023-06-30 20:42:54 -07:00
Paul Gauthier	5942057a9c	introduction.md	2023-06-30 20:39:08 -07:00
Paul Gauthier	895d3d2d67	skip graph	2023-06-30 20:36:12 -07:00
Paul Gauthier	5310ed6faf	filename	2023-06-30 20:31:17 -07:00
Paul Gauthier	c745a3156f	instructions_append	2023-06-30 20:30:00 -07:00
Paul Gauthier	8a0909738d	stars	2023-06-30 19:27:57 -07:00
Paul Gauthier	674bfb52c6	updated graph with larger fonts	2023-06-30 19:19:23 -07:00
Paul Gauthier	ca306b6a4a	remove the 1 try results	2023-06-30 14:01:29 -07:00
Paul Gauthier	7c56363d86	copy	2023-06-30 13:07:24 -07:00
Paul Gauthier	f84e3d6ab7	show both tries; repeats	2023-06-30 10:20:06 -07:00
Paul Gauthier	a9cc3f81b2	nice graph	2023-06-30 06:44:37 -07:00

204 commits