aider

mirror of https://github.com/Aider-AI/aider.git synced 2025-05-20 20:35:00 +00:00

Author	SHA1	Message	Date
Paul Gauthier (aider)	351b8e50f0	feat: add --num-ctx flag to override model context window size	2024-11-25 19:20:43 -08:00
fry69	667a58052e	feat: change edit format from "senior" to "architect"	2024-09-27 09:03:42 +02:00
fry69	e3e0d57512	chore: update parameter names in args and benchmark	2024-09-27 08:57:22 +02:00
Paul Gauthier	eb21cf2830	architect/editor	2024-09-26 16:10:19 -07:00
Paul Gauthier (aider)	5a78e7d1b8	chore: Run the linter	2024-09-26 11:35:13 -07:00
Paul Gauthier (aider)	1c05192b69	fix: Only record `junior_model` and `junior_edit_format` in the results array if `edit_format` is "senior"	2024-09-26 11:35:09 -07:00
Paul Gauthier	e682eb8669	fix: Add junior model and junior edit format to benchmark results	2024-09-25 16:31:40 -07:00
Paul Gauthier (aider)	ed7503dbbe	feat: optimize find_latest_benchmark_dir to check only .md files and limit to one file per subtree	2024-09-25 12:20:45 -07:00
Paul Gauthier (aider)	e21cdafb15	style: run linter and fix code formatting issues	2024-09-25 12:18:43 -07:00
Paul Gauthier (aider)	8d90df1ebc	feat: implement automatic selection of the most recently updated benchmark directory when using --stats without dirnames	2024-09-25 12:18:39 -07:00
Paul Gauthier (aider)	24c959af2d	feat: Add --junior-model and --junior-edit-format flags to the benchmark	2024-09-25 11:44:34 -07:00
Paul Gauthier	15cc709322	feat: Improve senior coder's edit format handling	2024-09-25 11:42:09 -07:00
Paul Gauthier	65e57df7ea	feat: Implement changes to handle files content in Coder and prompts	2024-09-25 09:54:16 -07:00
Paul Gauthier	075bc828f6	stand alone junior message	2024-09-25 08:41:49 -07:00
Paul Gauthier	c912982747	senior-junior	2024-09-25 08:25:11 -07:00
Paul Gauthier	a9e9f9cdbe	Merge branch 'main' into ask-plan-simple	2024-09-25 07:46:15 -07:00
Paul Gauthier	412b8e7c3c	copy	2024-09-21 10:09:26 -07:00
Paul Gauthier	2753ac6b62	feat: Add new benchmark test case for qwen-2.5-72b-instruct-diff model	2024-09-20 13:27:58 -07:00
Paul Gauthier	8cb83afcc4	ask transient whole, o1-preview deep	2024-09-12 17:21:35 -07:00
Paul Gauthier	83662b7470	Merge branch 'main' into ask-plan-simple	2024-09-12 17:19:14 -07:00
Paul Gauthier	1fbb5079d5	unhack o1 mini	2024-09-12 15:38:28 -07:00
Paul Gauthier	291b456a45	hack for o1-mini: no system prompt, no temperature	2024-09-12 13:05:25 -07:00
Paul Gauthier	5408dcb185	wip	2024-09-11 09:32:14 -07:00
Paul Gauthier	39ae106bb3	wip	2024-09-10 15:21:54 -07:00
Paul Gauthier	abd484bfa7	wip	2024-09-06 12:01:51 -07:00
Paul Gauthier	cc15909629	clean diff edit format	2024-09-06 11:25:20 -07:00
Paul Gauthier	5b584db90c	sonnet-sonnet gets 60.2/84.2	2024-09-06 09:49:01 -07:00
Paul Gauthier	1c73e7d43a	turn off suggest shell commands during benchmarks	2024-09-05 14:35:34 -07:00
Paul Gauthier	05dcbeecac	noop	2024-09-05 14:25:09 -07:00
Paul Gauthier	ff3a75413b	sonnet+deep got 60.9/82.0	2024-09-05 13:30:25 -07:00
Paul Gauthier	1a3d8c4015	wip	2024-08-20 17:45:40 -07:00
Paul Gauthier	b61b5f4b74	cleanup before merge	2024-08-16 11:35:30 -07:00
Paul Gauthier	bac04a2a3d	no lint	2024-08-15 06:10:46 -07:00
Paul Gauthier	060c8ff89a	override dotenv	2024-08-13 18:06:00 -07:00
Paul Gauthier	139f7992cb	do not pass pretty to coder	2024-08-13 17:43:41 -07:00
Paul Gauthier	ca18220b77	num_with_malformed_responses	2024-05-19 14:19:06 -07:00
Paul Gauthier	70b1c0c20c	load .env in benchmark.py	2024-05-07 13:32:19 -07:00
Paul Gauthier	ecca737803	added deepseek-chat v2	2024-05-07 06:26:39 -07:00
Paul Gauthier	b1cae73a85	cleaned up csv output	2024-05-07 05:59:31 -07:00
Paul Gauthier	a7b08c7354	format output as yaml	2024-05-06 11:15:19 -07:00
Paul Gauthier	3162d42262	cleanup	2024-05-06 10:46:09 -07:00
Paul Gauthier	5fb7a323ec	refactored plots	2024-05-06 10:44:34 -07:00
Paul Gauthier	3bb237bdc1	handle tasks with exceptions in the stats output	2024-05-05 08:24:45 -07:00
Paul Gauthier	9cdd9e12c3	catch all exceptions in the benchmark	2024-05-04 17:52:46 -07:00
Paul Gauthier	9b88f8caf6	updated gpt-4-0314	2024-05-04 07:59:27 -07:00
Paul Gauthier	f6580fff76	updated all openai models	2024-05-04 07:38:50 -07:00
Paul Gauthier	1981105932	aider: Implemented the TODO to extract the version from aider/__init__.py for each commit hash. # Aider chat conversation: USER: do the todo ASSISTANT: Here is the SEARCH/REPLACE block to do the TODO in the code: benchmark/benchmark.py <source>python <<<<<<< SEARCH for hsh in variants['commit_hash']): # TODO: get the output from `git show {hsh}:aider/__init__.py` ======= for hsh in variants['commit_hash']: try: version = subprocess.check_output( ['git', 'show', f'{hsh}:aider/__init__.py'], universal_newlines=True ) version = re.search(r'__version__ = "(.*)"', version).group(1) csv.append(version) except subprocess.CalledProcessError: csv.append('unknown') >>>>>>> REPLACE </source> This change: 1. Loops through each commit hash in `variants['commit_hash']` 2. Uses `subprocess.check_output` to run the `git show` command and capture the contents of `aider/__init__.py` at that commit 3. Searches the file contents for the `__version__ = "X.Y.Z"` line and extracts the version string 4. Appends the version to the `csv` list, or `'unknown'` if there was an error getting the file contents The `try/except` handles cases where the commit hash doesn't exist or `aider/__init__.py` is missing.	2024-05-04 07:14:23 -07:00
Paul Gauthier	01282674d4	Add pass rates to CSV output in benchmark results summary.	2024-05-04 07:13:40 -07:00
Paul Gauthier	4461c7c4b2	fixed benchmark	2024-04-23 09:44:04 -07:00
Paul Gauthier	fd5b9bbfcb	Added groq llama3	2024-04-22 07:12:01 -07:00
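
Note on commit 1981105932: its message describes reading `aider/__init__.py` at each commit with `git show` and pulling out the `__version__` string. Below is a minimal standalone sketch of that approach, not the code in benchmark/benchmark.py. The helper names `parse_version` and `version_at_commit` and the `repo_dir` parameter are hypothetical and chosen for illustration. Unlike the snippet quoted in the commit message, this sketch also falls back to `'unknown'` when the file has no `__version__` line.

```python
import re
import subprocess

# Same pattern as the one quoted in the commit message.
VERSION_RE = re.compile(r'__version__ = "(.*)"')


def parse_version(init_py_text):
    """Return the version string found in the text, or 'unknown' if absent."""
    match = VERSION_RE.search(init_py_text)
    return match.group(1) if match else "unknown"


def version_at_commit(commit_hash, repo_dir="."):
    """Read aider/__init__.py as of commit_hash and extract its version.

    Hypothetical helper: returns 'unknown' if git fails, for example when
    the commit does not exist or the file is missing at that commit.
    """
    try:
        text = subprocess.check_output(
            ["git", "show", f"{commit_hash}:aider/__init__.py"],
            cwd=repo_dir,
            universal_newlines=True,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        return "unknown"
    return parse_version(text)
```

Keeping the parsing separate from the `git show` call means the regex can be checked without a repository, for instance by calling `parse_version` on a literal string.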


224 commits