moved website/ -> aider/website/
8
aider/website/Gemfile
Normal file
@@ -0,0 +1,8 @@
source 'https://rubygems.org'
gem 'jekyll'
gem "just-the-docs", "0.8.2"
gem 'jekyll-redirect-from'
gem 'jekyll-sitemap'
gem "webrick"
gem 'github-pages', group: :jekyll_plugins
gem "html-proofer"
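
With these gems in place, the site can be previewed locally in the standard Jekyll way. A minimal sketch, assuming Ruby and Bundler are already installed (`webrick` above is what lets `jekyll serve` run on modern Rubies, and `html-proofer` can check the built output):

```
cd aider/website
bundle install

# serve the site locally, rebuilding on edits
bundle exec jekyll serve

# optionally build and check the generated HTML for broken links
bundle exec jekyll build
bundle exec htmlproofer ./_site
```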
466
aider/website/HISTORY.md
Normal file
@@ -0,0 +1,466 @@
---
title: Release history
parent: More info
nav_order: 999
---

<!--[[[cog
# This page is a copy of HISTORY.md, adding the front matter above.
text = open("HISTORY.md").read()
cog.out(text)
]]]-->

# Release history

### Aider v0.42.0

- Performance release:
  - 5X faster launch!
  - Faster auto-complete in large git repos (users report ~100X speedup)!

### Aider v0.41.0

- [Allow Claude 3.5 Sonnet to stream back >4k tokens!](https://aider.chat/2024/07/01/sonnet-not-lazy.html)
  - It is the first model capable of writing such large coherent, useful code edits.
  - Do large refactors or generate multiple files of new code in one go.
- Aider now uses `claude-3-5-sonnet-20240620` by default if `ANTHROPIC_API_KEY` is set in the environment.
- [Enabled image support](https://aider.chat/docs/images-urls.html) for 3.5 Sonnet and for GPT-4o & 3.5 Sonnet via OpenRouter (by @yamitzky).
- Added `--attribute-commit-message` to prefix aider's commit messages with "aider:".
- Fixed regression in quality of one-line commit messages.
- Automatically retry on Anthropic `overloaded_error`.
- Bumped dependency versions.

### Aider v0.40.6

- Fixed `/undo` so it works regardless of `--attribute` settings.

### Aider v0.40.5

- Bump versions to pickup latest litellm to fix streaming issue with Gemini
  - https://github.com/BerriAI/litellm/issues/4408

### Aider v0.40.1

- Improved context awareness of repomap.
- Restored proper `--help` functionality.

### Aider v0.40.0

- Improved prompting to discourage Sonnet from wasting tokens emitting unchanging code (#705).
- Improved error info for token limit errors.
- Options to suppress adding "(aider)" to the [git author and committer names](https://aider.chat/docs/git.html#commit-attribution).
- Use `--model-settings-file` to customize per-model settings, like use of repo-map (by @caseymcc).
- Improved invocation of flake8 linter for python code.


### Aider v0.39.0

- Use `--sonnet` for Claude 3.5 Sonnet, which is the top model on [aider's LLM code editing leaderboard](https://aider.chat/docs/leaderboards/#claude-35-sonnet-takes-the-top-spot).
- All `AIDER_xxx` environment variables can now be set in `.env` (by @jpshack-at-palomar).
- Use `--llm-history-file` to log raw messages sent to the LLM (by @daniel-vainsencher).
- Commit messages are no longer prefixed with "aider:". Instead the git author and committer names have "(aider)" added.

### Aider v0.38.0

- Use `--vim` for [vim keybindings](https://aider.chat/docs/commands.html#vi) in the chat.
- [Add LLM metadata](https://aider.chat/docs/llms/warnings.html#specifying-context-window-size-and-token-costs) via `.aider.models.json` file (by @caseymcc).
- More detailed [error messages on token limit errors](https://aider.chat/docs/troubleshooting/token-limits.html).
- Single line commit messages, without the recent chat messages.
- Ensure `--commit --dry-run` does nothing.
- Have playwright wait for idle network to better scrape js sites.
- Documentation updates, moved into website/ subdir.
- Moved tests/ into aider/tests/.

### Aider v0.37.0

- Repo map is now optimized based on text of chat history as well as files added to chat.
- Improved prompts when no files have been added to chat to solicit LLM file suggestions.
- Aider will notice if you paste a URL into the chat, and offer to scrape it.
- Performance improvements to the repo map, especially in large repos.
- Aider will not offer to add bare filenames like `make` or `run` which may just be words.
- Properly override `GIT_EDITOR` env for commits if it is already set.
- Detect supported audio sample rates for `/voice`.
- Other small bug fixes.

### Aider v0.36.0

- [Aider can now lint your code and fix any errors](https://aider.chat/2024/05/22/linting.html).
  - Aider automatically lints and fixes after every LLM edit.
  - You can manually lint-and-fix files with `/lint` in the chat or `--lint` on the command line.
  - Aider includes built-in basic linters for all supported tree-sitter languages.
  - You can also configure aider to use your preferred linter with `--lint-cmd`.
- Aider has additional support for running tests and fixing problems.
  - Configure your testing command with `--test-cmd`.
  - Run tests with `/test` or from the command line with `--test`.
  - Aider will automatically attempt to fix any test failures.


### Aider v0.35.0

- Aider now uses GPT-4o by default.
  - GPT-4o tops the [aider LLM code editing leaderboard](https://aider.chat/docs/leaderboards/) at 72.9%, versus 68.4% for Opus.
  - GPT-4o takes second on [aider's refactoring leaderboard](https://aider.chat/docs/leaderboards/#code-refactoring-leaderboard) with 62.9%, versus Opus at 72.3%.
- Added `--restore-chat-history` to restore prior chat history on launch, so you can continue the last conversation.
- Improved reflection feedback to LLMs using the diff edit format.
- Improved retries on `httpx` errors.

### Aider v0.34.0

- Updated prompting to use more natural phrasing about files, the git repo, etc. Removed reliance on read-write/read-only terminology.
- Refactored prompting to unify some phrasing across edit formats.
- Enhanced the canned assistant responses used in prompts.
- Added explicit model settings for `openrouter/anthropic/claude-3-opus` and `gpt-3.5-turbo`.
- Added `--show-prompts` debug switch.
- Bugfix: catch and retry on all litellm exceptions.


### Aider v0.33.0

- Added native support for [Deepseek models](https://aider.chat/docs/llms.html#deepseek) using `DEEPSEEK_API_KEY` and `deepseek/deepseek-chat`, etc rather than as a generic OpenAI compatible API.

### Aider v0.32.0

- [Aider LLM code editing leaderboards](https://aider.chat/docs/leaderboards/) that rank popular models according to their ability to edit code.
  - Leaderboards include GPT-3.5/4 Turbo, Opus, Sonnet, Gemini 1.5 Pro, Llama 3, Deepseek Coder & Command-R+.
- Gemini 1.5 Pro now defaults to a new diff-style edit format (diff-fenced), enabling it to work better with larger code bases.
- Support for Deepseek-V2, via a more flexible config of system messages in the diff edit format.
- Improved retry handling on errors from model APIs.
- Benchmark outputs results in YAML, compatible with leaderboard.

### Aider v0.31.0

- [Aider is now also AI pair programming in your browser!](https://aider.chat/2024/05/02/browser.html) Use the `--browser` switch to launch an experimental browser based version of aider.
- Switch models during the chat with `/model <name>` and search the list of available models with `/models <query>`.

### Aider v0.30.1

- Added missing `google-generativeai` dependency.

### Aider v0.30.0

- Added [Gemini 1.5 Pro](https://aider.chat/docs/llms.html#free-models) as a recommended free model.
- Allow repo map for "whole" edit format.
- Added `--models <MODEL-NAME>` to search the available models.
- Added `--no-show-model-warnings` to silence model warnings.

### Aider v0.29.2

- Improved [model warnings](https://aider.chat/docs/llms.html#model-warnings) for unknown or unfamiliar models.

### Aider v0.29.1

- Added better support for groq/llama3-70b-8192.

### Aider v0.29.0

- Added support for [directly connecting to Anthropic, Cohere, Gemini and many other LLM providers](https://aider.chat/docs/llms.html).
- Added `--weak-model <model-name>` which allows you to specify which model to use for commit messages and chat history summarization.
- New command line switches for working with popular models:
  - `--4-turbo-vision`
  - `--opus`
  - `--sonnet`
  - `--anthropic-api-key`
- Improved "whole" and "diff" backends to better support [Cohere's free to use Command-R+ model](https://aider.chat/docs/llms.html#cohere).
- Allow `/add` of images from anywhere in the filesystem.
- Fixed crash when operating in a repo in a detached HEAD state.
- Fix: Use the same default model in CLI and python scripting.

### Aider v0.28.0

- Added support for new `gpt-4-turbo-2024-04-09` and `gpt-4-turbo` models.
  - Benchmarked at 61.7% on Exercism benchmark, comparable to `gpt-4-0613` and worse than the `gpt-4-preview-XXXX` models. See [recent Exercism benchmark results](https://aider.chat/2024/03/08/claude-3.html).
  - Benchmarked at 34.1% on the refactoring/laziness benchmark, significantly worse than the `gpt-4-preview-XXXX` models. See [recent refactor benchmark results](https://aider.chat/2024/01/25/benchmarks-0125.html).
  - Aider continues to default to `gpt-4-1106-preview` as it performs best on both benchmarks, and significantly better on the refactoring/laziness benchmark.

### Aider v0.27.0

- Improved repomap support for typescript, by @ryanfreckleton.
- Bugfix: Only /undo the files which were part of the last commit, don't stomp other dirty files.
- Bugfix: Show clear error message when OpenAI API key is not set.
- Bugfix: Catch error for obscure languages without tags.scm file.

### Aider v0.26.1

- Fixed bug affecting parsing of git config in some environments.

### Aider v0.26.0

- Use GPT-4 Turbo by default.
- Added `-3` and `-4` switches to use GPT 3.5 or GPT-4 (non-Turbo).
- Bug fix to avoid reflecting local git errors back to GPT.
- Improved logic for opening git repo on launch.

### Aider v0.25.0

- Issue a warning if user adds too much code to the chat.
  - https://aider.chat/docs/faq.html#how-can-i-add-all-the-files-to-the-chat
- Vocally refuse to add files to the chat that match `.aiderignore`.
  - Prevents bug where subsequent git commit of those files will fail.
- Added `--openai-organization-id` argument.
- Show the user a FAQ link if edits fail to apply.
- Made past articles part of https://aider.chat/blog/

### Aider v0.24.1

- Fixed bug with cost computations when `--no-stream` is in effect.

### Aider v0.24.0

- New `/web <url>` command which scrapes the url, turns it into fairly clean markdown and adds it to the chat.
- Updated all OpenAI model names and pricing info.
- Default GPT 3.5 model is now `gpt-3.5-turbo-0125`.
- Bugfix to the `!` alias for `/run`.

### Aider v0.23.0

- Added support for `--model gpt-4-0125-preview` and OpenAI's alias `--model gpt-4-turbo-preview`. The `--4turbo` switch remains an alias for `--model gpt-4-1106-preview` at this time.
- New `/test` command that runs a command and adds the output to the chat on non-zero exit status.
- Improved streaming of markdown to the terminal.
- Added `/quit` as alias for `/exit`.
- Added `--skip-check-update` to skip checking for the update on launch.
- Added `--openrouter` as a shortcut for `--openai-api-base https://openrouter.ai/api/v1`.
- Fixed bug preventing use of env vars `OPENAI_API_BASE, OPENAI_API_TYPE, OPENAI_API_VERSION, OPENAI_API_DEPLOYMENT_ID`.

### Aider v0.22.0

- Improvements for unified diff editing format.
- Added ! as an alias for /run.
- Autocomplete for /add and /drop now properly quotes filenames with spaces.
- The /undo command asks GPT not to just retry reverted edit.

### Aider v0.21.1

- Bugfix for unified diff editing format.
- Added --4turbo and --4 aliases for --4-turbo.

### Aider v0.21.0

- Support for python 3.12.
- Improvements to unified diff editing format.
- New `--check-update` arg to check if updates are available and exit with status code.

### Aider v0.20.0

- Add images to the chat to automatically use GPT-4 Vision, by @joshuavial

- Bugfixes:
  - Improved unicode encoding for `/run` command output, by @ctoth
  - Prevent false auto-commits on Windows, by @ctoth

### Aider v0.19.1

- Removed stray debug output.

### Aider v0.19.0

- [Significantly reduced "lazy" coding from GPT-4 Turbo due to new unified diff edit format](https://aider.chat/docs/unified-diffs.html)
  - Score improves from 20% to 61% on new "laziness benchmark".
  - Aider now uses unified diffs by default for `gpt-4-1106-preview`.
- New `--4-turbo` command line switch as a shortcut for `--model gpt-4-1106-preview`.

### Aider v0.18.1

- Upgraded to new openai python client v1.3.7.

### Aider v0.18.0

- Improved prompting for both GPT-4 and GPT-4 Turbo.
  - Far fewer edit errors from GPT-4 Turbo (`gpt-4-1106-preview`).
  - Significantly better benchmark results from the June GPT-4 (`gpt-4-0613`). Performance leaps from 47%/64% up to 51%/71%.
- Fixed bug where in-chat files were marked as both read-only and read-write, sometimes confusing GPT.
- Fixed bug to properly handle repos with submodules.

### Aider v0.17.0

- Support for OpenAI's new 11/06 models:
  - gpt-4-1106-preview with 128k context window
  - gpt-3.5-turbo-1106 with 16k context window
- [Benchmarks for OpenAI's new 11/06 models](https://aider.chat/docs/benchmarks-1106.html)
- Streamlined [API for scripting aider, added docs](https://aider.chat/docs/faq.html#can-i-script-aider)
- Ask for more concise SEARCH/REPLACE blocks. [Benchmarked](https://aider.chat/docs/benchmarks.html) at 63.9%, no regression.
- Improved repo-map support for elisp.
- Fixed crash bug when `/add` used on file matching `.gitignore`.
- Fixed misc bugs to catch and handle unicode decoding errors.

### Aider v0.16.3

- Fixed repo-map support for C#.

### Aider v0.16.2

- Fixed docker image.

### Aider v0.16.1

- Updated tree-sitter dependencies to streamline the pip install process.

### Aider v0.16.0

- [Improved repository map using tree-sitter](https://aider.chat/docs/repomap.html)
- Switched from "edit block" to "search/replace block", which reduced malformed edit blocks. [Benchmarked](https://aider.chat/docs/benchmarks.html) at 66.2%, no regression.
- Improved handling of malformed edit blocks targeting multiple edits to the same file. [Benchmarked](https://aider.chat/docs/benchmarks.html) at 65.4%, no regression.
- Bugfix to properly handle malformed `/add` wildcards.


### Aider v0.15.0

- Added support for `.aiderignore` file, which instructs aider to ignore parts of the git repo.
- New `--commit` cmd line arg, which just commits all pending changes with a sensible commit message generated by gpt-3.5.
- Added universal ctags and multiple architectures to the [aider docker image](https://aider.chat/docs/install/docker.html).
- `/run` and `/git` now accept full shell commands, like: `/run (cd subdir; ls)`
- Restored missing `--encoding` cmd line switch.

### Aider v0.14.2

- Easily [run aider from a docker image](https://aider.chat/docs/install/docker.html)
- Fixed bug with chat history summarization.
- Fixed bug if `soundfile` package not available.

### Aider v0.14.1

- /add and /drop handle absolute filenames and quoted filenames.
- /add checks to be sure files are within the git repo (or root).
- If needed, warn users that in-chat file paths are all relative to the git repo.
- Fixed /add bug when aider launched in repo subdir.
- Show models supported by api/key if requested model isn't available.

### Aider v0.14.0

- [Support for Claude2 and other LLMs via OpenRouter](https://aider.chat/docs/faq.html#accessing-other-llms-with-openrouter) by @joshuavial
- Documentation for [running the aider benchmarking suite](https://github.com/paul-gauthier/aider/tree/main/benchmark)
- Aider now requires Python >= 3.9


### Aider v0.13.0

- [Only git commit dirty files that GPT tries to edit](https://aider.chat/docs/faq.html#how-did-v0130-change-git-usage)
- Send chat history as prompt/context for Whisper voice transcription
- Added `--voice-language` switch to constrain `/voice` to transcribe to a specific language
- Late-bind importing `sounddevice`, as it was slowing down aider startup
- Improved --foo/--no-foo switch handling for command line and yml config settings

### Aider v0.12.0

- [Voice-to-code](https://aider.chat/docs/voice.html) support, which allows you to code with your voice.
- Fixed bug where /diff was causing crash.
- Improved prompting for gpt-4, refactor of editblock coder.
- [Benchmarked](https://aider.chat/docs/benchmarks.html) at 63.2% for gpt-4/diff, no regression.

### Aider v0.11.1

- Added a progress bar when initially creating a repo map.
- Fixed bad commit message when adding new file to empty repo.
- Fixed corner case of pending chat history summarization when dirty committing.
- Fixed corner case of undefined `text` when using `--no-pretty`.
- Fixed /commit bug from repo refactor, added test coverage.
- [Benchmarked](https://aider.chat/docs/benchmarks.html) at 53.4% for gpt-3.5/whole (no regression).

### Aider v0.11.0

- Automatically summarize chat history to avoid exhausting context window.
- More detail on dollar costs when running with `--no-stream`
- Stronger GPT-3.5 prompt against skipping/eliding code in replies (51.9% [benchmark](https://aider.chat/docs/benchmarks.html), no regression)
- Defend against GPT-3.5 or non-OpenAI models suggesting filenames surrounded by asterisks.
- Refactored GitRepo code out of the Coder class.

### Aider v0.10.1

- /add and /drop always use paths relative to the git root
- Encourage GPT to use language like "add files to the chat" to ask users for permission to edit them.

### Aider v0.10.0

- Added `/git` command to run git from inside aider chats.
- Use Meta-ENTER (Esc+ENTER in some environments) to enter multiline chat messages.
- Create a `.gitignore` with `.aider*` to prevent users from accidentally adding aider files to git.
- Check pypi for newer versions and notify user.
- Updated keyboard interrupt logic so that 2 ^C in 2 seconds always forces aider to exit.
- Provide GPT with detailed error if it makes a bad edit block, ask for a retry.
- Force `--no-pretty` if aider detects it is running inside a VSCode terminal.
- [Benchmarked](https://aider.chat/docs/benchmarks.html) at 64.7% for gpt-4/diff (no regression)


### Aider v0.9.0

- Support for the OpenAI models in [Azure](https://aider.chat/docs/faq.html#azure)
- Added `--show-repo-map`
- Improved output when retrying connections to the OpenAI API
- Redacted api key from `--verbose` output
- Bugfix: recognize and add files in subdirectories mentioned by user or GPT
- [Benchmarked](https://aider.chat/docs/benchmarks.html) at 53.8% for gpt-3.5-turbo/whole (no regression)

### Aider v0.8.3

- Added `--dark-mode` and `--light-mode` to select colors optimized for terminal background
- Install docs link to [NeoVim plugin](https://github.com/joshuavial/aider.nvim) by @joshuavial
- Reorganized the `--help` output
- Bugfix/improvement to whole edit format, may improve code editing for GPT-3.5
- Bugfix and tests around git filenames with unicode characters
- Bugfix so that aider throws an exception when OpenAI returns InvalidRequest
- Bugfix/improvement to /add and /drop to recurse selected directories
- Bugfix for live diff output when using "whole" edit format

### Aider v0.8.2

- Disabled general availability of gpt-4 (it's rolling out, not 100% available yet)

### Aider v0.8.1

- Ask to create a git repo if none found, to better track GPT's code changes
- Glob wildcards are now supported in `/add` and `/drop` commands
- Pass `--encoding` into ctags, require it to return `utf-8`
- More robust handling of filepaths, to avoid 8.3 windows filenames
- Added [FAQ](https://aider.chat/docs/faq.html)
- Marked GPT-4 as generally available
- Bugfix for live diffs of whole coder with missing filenames
- Bugfix for chats with multiple files
- Bugfix in editblock coder prompt

### Aider v0.8.0

- [Benchmark comparing code editing in GPT-3.5 and GPT-4](https://aider.chat/docs/benchmarks.html)
- Improved Windows support:
  - Fixed bugs related to path separators in Windows
  - Added a CI step to run all tests on Windows
- Improved handling of Unicode encoding/decoding
  - Explicitly read/write text files with utf-8 encoding by default (mainly benefits Windows)
  - Added `--encoding` switch to specify another encoding
  - Gracefully handle decoding errors
- Added `--code-theme` switch to control the pygments styling of code blocks (by @kwmiebach)
- Better status messages explaining the reason when ctags is disabled

### Aider v0.7.2:

- Fixed a bug to allow aider to edit files that contain triple backtick fences.

### Aider v0.7.1:

- Fixed a bug in the display of streaming diffs in GPT-3.5 chats

### Aider v0.7.0:

- Graceful handling of context window exhaustion, including helpful tips.
- Added `--message` to give GPT that one instruction and then exit after it replies and any edits are performed.
- Added `--no-stream` to disable streaming GPT responses.
  - Non-streaming responses include token usage info.
  - Enables display of cost info based on OpenAI advertised pricing.
- Coding competence benchmarking tool against suite of programming tasks based on Exercism's python repo.
  - https://github.com/exercism/python
- Major refactor in preparation for supporting new function calls api.
- Initial implementation of a function based code editing backend for 3.5.
  - Initial experiments show that using functions makes 3.5 less competent at coding.
- Limit automatic retries when GPT returns a malformed edit response.

### Aider v0.6.2

* Support for `gpt-3.5-turbo-16k`, and all OpenAI chat models
* Improved ability to correct when gpt-4 omits leading whitespace in code edits
* Added `--openai-api-base` to support API proxies, etc.

### Aider v0.5.0

- Added support for `gpt-3.5-turbo` and `gpt-4-32k`.
- Added `--map-tokens` to set a token budget for the repo map, along with a PageRank based algorithm for prioritizing which files and identifiers to include in the map.
- Added in-chat command `/tokens` to report on context window token usage.
- Added in-chat command `/clear` to clear the conversation history.
<!--[[[end]]]-->
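
The `<!--[[[cog ... ]]]-->` markers above drive the [cog](https://pypi.org/project/cogapp/) code-generation tool, which inlines the repository's top-level `HISTORY.md` into this page. A sketch of regenerating it, assuming the command is run from the repository root so that `open("HISTORY.md")` resolves against the root copy:

```
pip install cogapp

# rewrite the file in place, re-running the inline generator
cog -r aider/website/HISTORY.md
```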
39
aider/website/_config.yml
Normal file
@@ -0,0 +1,39 @@
theme: just-the-docs
url: "https://aider.chat"

plugins:
  - jekyll-redirect-from
  - jekyll-sitemap
  - jekyll-feed

defaults:
  - scope:
      path: "README.md"
      type: "pages"
    values:
      description: "aider is AI pair programming in your terminal"

exclude:
  - "tmp*"
  - "**/tmp*"
  - OLD
  - "**/OLD/**"
  - "OLD/**"
  - vendor
  - feed.xml

aux_links:
  "GitHub":
    - "https://github.com/paul-gauthier/aider"
  "Discord":
    - "https://discord.gg/Tv2uQnR88V"
  "Blog":
    - "/blog/"

nav_external_links:
  - title: "GitHub"
    url: "https://github.com/paul-gauthier/aider"
  - title: "Discord"
    url: "https://discord.gg/Tv2uQnR88V"

repository: paul-gauthier/aider
684
aider/website/_data/edit_leaderboard.yml
Normal file
@@ -0,0 +1,684 @@
- dirname: 2024-05-01-20-05-59--direct-opus-filenames-outside-fence
  test_cases: 133
  model: claude-3-opus-20240229
  released: 2024-02-29
  edit_format: diff
  commit_hash: f4b1797-dirty, f4b1797
  pass_rate_1: 53.4
  pass_rate_2: 68.4
  percent_cases_well_formed: 100.0
  error_outputs: 2
  num_malformed_responses: 0
  user_asks: 0
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 1
  command: aider --opus
  date: 2024-05-01
  versions: 0.30.2-dev
  seconds_per_case: 32.4
  total_cost: 13.8395

- dirname: 2024-03-06-16-42-00--claude3-sonnet-whole
  test_cases: 133
  model: claude-3-sonnet-20240229
  released: 2024-02-29
  edit_format: whole
  commit_hash: a5f8076-dirty
  pass_rate_1: 43.6
  pass_rate_2: 54.9
  percent_cases_well_formed: 100.0
  error_outputs: 1
  num_malformed_responses: 0
  user_asks: 1
  lazy_comments: 1
  syntax_errors: 2
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 7
  command: aider --sonnet
  date: 2024-03-06
  versions: 0.25.1-dev
  seconds_per_case: 23.1
  total_cost: 0.0000

- dirname: 2024-05-03-20-47-24--gemini-1.5-pro-diff-fenced
  test_cases: 133
  model: gemini-1.5-pro-latest
  edit_format: diff-fenced
  commit_hash: 3a48dfb, 5d32dd7
  pass_rate_1: 45.9
  pass_rate_2: 57.1
  percent_cases_well_formed: 87.2
  error_outputs: 60
  num_malformed_responses: 17
  user_asks: 3
  lazy_comments: 0
  syntax_errors: 8
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 3
  command: aider --model gemini/gemini-1.5-pro-latest
  date: 2024-05-03
  versions: 0.31.2-dev
  seconds_per_case: 21.3
  total_cost: 0.0000

- dirname: 2024-05-08-20-59-15--may-gpt-3.5-turbo-whole
  test_cases: 133
  model: gpt-3.5-turbo-0125
  released: 2024-01-25
  edit_format: whole
  commit_hash: 1d55f74
  pass_rate_1: 41.4
  pass_rate_2: 50.4
  percent_cases_well_formed: 100.0
  error_outputs: 0
  num_malformed_responses: 0
  user_asks: 0
  lazy_comments: 0
  syntax_errors: 3
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 4
  command: aider -3
  date: 2024-05-08
  versions: 0.33.1-dev
  seconds_per_case: 6.5
  total_cost: 0.5032

- dirname: 2023-11-06-21-23-59--gpt-3.5-turbo-0301
  test_cases: 133
  model: gpt-3.5-turbo-0301
  released: 2023-03-01
  edit_format: whole
  commit_hash: 44388db-dirty
  pass_rate_1: 50.4
  pass_rate_2: 57.9
  percent_cases_well_formed: 100.0
  error_outputs: 1
  num_malformed_responses: 0
  user_asks: 1
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 8
  command: aider --model gpt-3.5-turbo-0301
  date: 2023-11-06
  versions: 0.16.4-dev
  seconds_per_case: 6.5
  total_cost: 0.4822

- dirname: 2023-11-07-02-41-07--gpt-3.5-turbo-0613
  test_cases: 133
  model: gpt-3.5-turbo-0613
  released: 2023-06-13
  edit_format: whole
  commit_hash: 93aa497-dirty
  pass_rate_1: 38.3
  pass_rate_2: 50.4
  percent_cases_well_formed: 100.0
  error_outputs: 1
  num_malformed_responses: 0
  user_asks: 1
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 5
  command: aider --model gpt-3.5-turbo-0613
  date: 2023-11-07
  versions: 0.16.4-dev
  seconds_per_case: 18.0
  total_cost: 0.5366
- dirname: 2024-04-30-21-40-51--litellm-gpt-3.5-turbo-1106-again
  test_cases: 132
  model: gpt-3.5-turbo-1106
  edit_format: whole
  commit_hash: 7b14d77
  pass_rate_1: 45.5
  pass_rate_2: 56.1
  percent_cases_well_formed: 100.0
  error_outputs: 1
  num_malformed_responses: 0
  user_asks: 1
  lazy_comments: 0
  syntax_errors: 19
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --model gpt-3.5-turbo-1106
  date: 2024-04-30
  versions: 0.30.2-dev
  seconds_per_case: 5.3
  total_cost: 0.3261

- dirname: 2024-01-25-23-37-15--jan-exercism-gpt-4-0125-preview-udiff
  test_cases: 133
  model: gpt-4-0125-preview
  released: 2024-01-25
  edit_format: udiff
  commit_hash: edcf9b1
  pass_rate_1: 55.6
  pass_rate_2: 66.2
  percent_cases_well_formed: 97.7
  error_outputs: 6
  num_malformed_responses: 3
  user_asks: 0
  lazy_comments: 0
  syntax_errors: 3
  indentation_errors: 7
  exhausted_context_windows: 0
  test_timeouts: 4
  command: aider --model gpt-4-0125-preview
  date: 2024-01-25
  versions: 0.22.1-dev
  seconds_per_case: 44.8
  total_cost: 14.6428

- dirname: 2024-05-04-15-07-30--redo-gpt-4-0314-diff-reminder-rules
  test_cases: 133
  model: gpt-4-0314
  released: 2023-03-14
  edit_format: diff
  commit_hash: 0d43468
  pass_rate_1: 50.4
  pass_rate_2: 66.2
  percent_cases_well_formed: 93.2
  error_outputs: 28
  num_malformed_responses: 9
  user_asks: 1
  lazy_comments: 3
  syntax_errors: 9
  indentation_errors: 7
  exhausted_context_windows: 0
  test_timeouts: 3
  command: aider --model gpt-4-0314
  date: 2024-05-04
  versions: 0.31.2-dev
  seconds_per_case: 19.8
  total_cost: 16.2689

- dirname: 2023-12-16-21-24-28--editblock-gpt-4-0613-actual-main
  test_cases: 133
  model: gpt-4-0613
  released: 2023-06-13
  edit_format: diff
  commit_hash: 3aa17c4
  pass_rate_1: 46.6
  pass_rate_2: 67.7
  percent_cases_well_formed: 100.0
  error_outputs: 14
  num_malformed_responses: 0
  user_asks: 0
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 2
  command: aider -4
  date: 2023-12-16
  versions: 0.18.2-dev
  seconds_per_case: 33.6
  total_cost: 17.4657

- dirname: 2024-05-08-21-16-03--may-gpt-4-1106-preview-udiff
  test_cases: 133
  model: gpt-4-1106-preview
  released: 2023-11-06
  edit_format: udiff
  commit_hash: 87664dc
  pass_rate_1: 51.9
  pass_rate_2: 65.4
  percent_cases_well_formed: 92.5
  error_outputs: 30
  num_malformed_responses: 10
  user_asks: 0
  lazy_comments: 3
  syntax_errors: 11
  indentation_errors: 2
  exhausted_context_windows: 0
  test_timeouts: 1
  command: aider --model gpt-4-1106-preview
  date: 2024-05-08
  versions: 0.33.1-dev
  seconds_per_case: 20.4
  total_cost: 6.6061

- dirname: 2024-05-01-02-09-20--gpt-4-turbo-examples
  test_cases: 133
  model: gpt-4-turbo-2024-04-09 (udiff)
  released: 2024-04-09
  edit_format: udiff
  commit_hash: e610e5b-dirty
  pass_rate_1: 48.1
  pass_rate_2: 63.9
  percent_cases_well_formed: 97.0
  error_outputs: 12
  num_malformed_responses: 4
  user_asks: 0
  lazy_comments: 0
  syntax_errors: 4
  indentation_errors: 2
  exhausted_context_windows: 0
  test_timeouts: 3
  command: aider --gpt-4-turbo
  date: 2024-05-01
  versions: 0.30.2-dev
  seconds_per_case: 22.8
  total_cost: 6.3337

- dirname: 2024-05-03-22-24-48--openrouter--llama3-diff-examples-sys-msg
  test_cases: 132
  model: llama3-70b-8192
  released: 2024-04-18
  edit_format: diff
  commit_hash: b5bb453
  pass_rate_1: 38.6
  pass_rate_2: 49.2
  percent_cases_well_formed: 73.5
  error_outputs: 105
  num_malformed_responses: 35
  user_asks: 0
  lazy_comments: 0
  syntax_errors: 1
  indentation_errors: 2
  exhausted_context_windows: 0
  test_timeouts: 3
  command: aider --model groq/llama3-70b-8192
  date: 2024-05-03
  versions: 0.31.2-dev
  seconds_per_case: 14.5
  total_cost: 0.4311

- dirname: 2024-05-06-18-31-08--command-r-plus-whole-final
  test_cases: 133
  model: command-r-plus
  released: 2024-04-04
  edit_format: whole
  commit_hash: fc3a43e-dirty
  pass_rate_1: 21.8
  pass_rate_2: 31.6
  percent_cases_well_formed: 100.0
  error_outputs: 0
  num_malformed_responses: 0
  user_asks: 0
  lazy_comments: 1
  syntax_errors: 5
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 7
  command: aider --model command-r-plus
  date: 2024-05-06
  versions: 0.31.2-dev
  seconds_per_case: 22.9
  total_cost: 2.7494

- dirname: 2024-05-07-12-55-06--deepseek-chat-v2-whole
  test_cases: 133
  model: deepseek-chat v2 (whole)
  edit_format: whole
  commit_hash: b1cae73, db994fb
  pass_rate_1: 50.4
  pass_rate_2: 60.2
  percent_cases_well_formed: 100.0
  error_outputs: 3
  num_malformed_responses: 0
  user_asks: 3
  lazy_comments: 13
  syntax_errors: 0
  indentation_errors: 2
  exhausted_context_windows: 0
  test_timeouts: 1
  command: aider --model deepseek/deepseek-chat --edit-format whole
  date: 2024-05-07
  versions: 0.31.2-dev
  seconds_per_case: 42.4
  total_cost: 0.0000

- dirname: 2024-05-09-18-57-52--deepseek-chat-v2-diff-reverted-and-helpful-assistant2
  test_cases: 133
  model: deepseek-chat v2 (diff)
  released: 2024-05-06
  edit_format: diff
  commit_hash: 80a3f6d
  pass_rate_1: 44.4
  pass_rate_2: 60.9
  percent_cases_well_formed: 97.0
  error_outputs: 14
  num_malformed_responses: 4
  user_asks: 2
  lazy_comments: 0
  syntax_errors: 13
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 3
  command: aider --model deepseek/deepseek-chat
  date: 2024-05-09
  versions: 0.33.1-dev
  seconds_per_case: 86.8
  total_cost: 0.0941

- dirname: 2024-05-07-20-32-37--qwen1.5-110b-chat-whole
  test_cases: 133
  model: qwen1.5-110b-chat
  released: 2024-02-04
  edit_format: whole
  commit_hash: 70b1c0c
  pass_rate_1: 30.8
  pass_rate_2: 37.6
  percent_cases_well_formed: 100.0
  error_outputs: 3
  num_malformed_responses: 0
  user_asks: 3
  lazy_comments: 20
  syntax_errors: 0
  indentation_errors: 6
  exhausted_context_windows: 0
  test_timeouts: 3
  command: aider --model together_ai/qwen/qwen1.5-110b-chat
  date: 2024-05-07
  versions: 0.31.2-dev
  seconds_per_case: 46.9
  total_cost: 0.0000

- dirname: 2024-05-07-20-57-04--wizardlm-2-8x22b-whole
  test_cases: 133
  model: WizardLM-2 8x22B
  edit_format: whole
  commit_hash: 8e272bf, bbe8639
  pass_rate_1: 27.8
  pass_rate_2: 44.4
  percent_cases_well_formed: 100.0
  error_outputs: 0
  num_malformed_responses: 0
  user_asks: 0
  lazy_comments: 1
  syntax_errors: 2
  indentation_errors: 2
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --model openrouter/microsoft/wizardlm-2-8x22b
  date: 2024-05-07
  versions: 0.31.2-dev
  seconds_per_case: 36.6
  total_cost: 0.0000

- dirname: 2024-05-13-17-39-05--gpt-4o-diff
  test_cases: 133
  model: gpt-4o
  released: 2024-05-13
  edit_format: diff
  commit_hash: b6cd852
  pass_rate_1: 60.2
  pass_rate_2: 72.9
  percent_cases_well_formed: 96.2
  error_outputs: 103
  num_malformed_responses: 5
  user_asks: 0
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 2
  exhausted_context_windows: 0
  test_timeouts: 1
  command: aider
  date: 2024-05-13
  versions: 0.34.1-dev
  seconds_per_case: 6.0
  total_cost: 0.0000

- dirname: 2024-04-12-22-18-20--gpt-4-turbo-2024-04-09-plain-diff
  test_cases: 33
  model: gpt-4-turbo-2024-04-09 (diff)
  edit_format: diff
  commit_hash: 9b2e697-dirty
  pass_rate_1: 48.5
  pass_rate_2: 57.6
  percent_cases_well_formed: 100.0
  error_outputs: 15
  num_malformed_responses: 0
  user_asks: 15
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --model gpt-4-turbo-2024-04-09
  date: 2024-04-12
  versions: 0.28.1-dev
  seconds_per_case: 17.6
  total_cost: 1.6205

- dirname: 2024-06-08-22-37-55--qwen2-72b-instruct-whole
  test_cases: 133
  model: Qwen2 72B Instruct
  edit_format: whole
  commit_hash: 02c7335-dirty, 1a97498-dirty
  pass_rate_1: 44.4
  pass_rate_2: 55.6
  percent_cases_well_formed: 100.0
  error_outputs: 3
  num_malformed_responses: 0
  num_with_malformed_responses: 0
  user_asks: 3
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 1
  command: aider --model together_ai/qwen/Qwen2-72B-Instruct
  date: 2024-06-08
  versions: 0.37.1-dev
  seconds_per_case: 14.3
  total_cost: 0.0000

- dirname: 2024-06-08-23-45-41--gemini-1.5-flash-latest-whole
  test_cases: 133
  model: gemini-1.5-flash-latest
  edit_format: whole
  commit_hash: 86ea47f-dirty
  pass_rate_1: 33.8
  pass_rate_2: 44.4
  percent_cases_well_formed: 100.0
  error_outputs: 16
  num_malformed_responses: 0
  num_with_malformed_responses: 0
  user_asks: 12
  lazy_comments: 0
  syntax_errors: 9
  indentation_errors: 1
  exhausted_context_windows: 0
  test_timeouts: 3
  command: aider --model gemini/gemini-1.5-flash-latest
  date: 2024-06-08
  versions: 0.37.1-dev
  seconds_per_case: 7.2
  total_cost: 0.0000

- dirname: 2024-06-09-03-28-21--codestral-whole
  test_cases: 133
  model: codestral-2405
  edit_format: whole
  commit_hash: effc88a
  pass_rate_1: 35.3
  pass_rate_2: 51.1
  percent_cases_well_formed: 100.0
  error_outputs: 4
  num_malformed_responses: 0
  num_with_malformed_responses: 0
  user_asks: 4
  lazy_comments: 1
  syntax_errors: 0
  indentation_errors: 1
  exhausted_context_windows: 0
  test_timeouts: 4
  command: aider --model mistral/codestral-2405
  date: 2024-06-09
  versions: 0.37.1-dev
  seconds_per_case: 7.5
  total_cost: 0.6805

- dirname: 2024-06-08-19-25-26--codeqwen:7b-chat-v1.5-q8_0-whole
  test_cases: 133
  model: codeqwen:7b-chat-v1.5-q8_0
  edit_format: whole
  commit_hash: be0520f-dirty
  pass_rate_1: 32.3
  pass_rate_2: 34.6
  percent_cases_well_formed: 100.0
  error_outputs: 8
  num_malformed_responses: 0
  num_with_malformed_responses: 0
  user_asks: 8
  lazy_comments: 0
  syntax_errors: 1
  indentation_errors: 2
  exhausted_context_windows: 0
  test_timeouts: 1
  command: aider --model ollama/codeqwen:7b-chat-v1.5-q8_0
  date: 2024-06-08
  versions: 0.37.1-dev
  seconds_per_case: 15.6
  total_cost: 0.0000

- dirname: 2024-06-08-16-12-31--codestral:22b-v0.1-q8_0-whole
  test_cases: 133
  model: codestral:22b-v0.1-q8_0
  edit_format: whole
  commit_hash: be0520f-dirty
  pass_rate_1: 35.3
  pass_rate_2: 48.1
  percent_cases_well_formed: 100.0
  error_outputs: 8
  num_malformed_responses: 0
  num_with_malformed_responses: 0
  user_asks: 8
  lazy_comments: 2
  syntax_errors: 0
  indentation_errors: 1
  exhausted_context_windows: 0
  test_timeouts: 3
  command: aider --model ollama/codestral:22b-v0.1-q8_0
  date: 2024-06-08
  versions: 0.37.1-dev
  seconds_per_case: 46.4
  total_cost: 0.0000

- dirname: 2024-06-08-17-54-04--qwen2:72b-instruct-q8_0-whole
  test_cases: 133
  model: qwen2:72b-instruct-q8_0
  edit_format: whole
  commit_hash: 74e51d5-dirty
  pass_rate_1: 43.6
  pass_rate_2: 49.6
  percent_cases_well_formed: 100.0
  error_outputs: 27
  num_malformed_responses: 0
  num_with_malformed_responses: 0
  user_asks: 27
  lazy_comments: 0
  syntax_errors: 5
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --model ollama/qwen2:72b-instruct-q8_0
  date: 2024-06-08
  versions: 0.37.1-dev
  seconds_per_case: 280.6
  total_cost: 0.0000

- dirname: 2024-07-04-14-32-08--claude-3.5-sonnet-diff-continue
  test_cases: 133
  model: claude-3.5-sonnet
  edit_format: diff
  commit_hash: 35f21b5
  pass_rate_1: 57.1
  pass_rate_2: 77.4
  percent_cases_well_formed: 99.2
  error_outputs: 23
  num_malformed_responses: 4
  num_with_malformed_responses: 1
  user_asks: 2
  lazy_comments: 0
  syntax_errors: 1
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 1
  command: aider --sonnet
  date: 2024-07-04
  versions: 0.42.1-dev
  seconds_per_case: 17.6
  total_cost: 3.6346

- dirname: 2024-06-17-14-45-54--deepseek-coder2-whole
  test_cases: 133
  model: DeepSeek Coder V2 (whole)
  edit_format: whole
  commit_hash: ca8672b
  pass_rate_1: 63.9
  pass_rate_2: 75.2
  percent_cases_well_formed: 100.0
  error_outputs: 1
  num_malformed_responses: 0
  num_with_malformed_responses: 0
  user_asks: 1
  lazy_comments: 0
  syntax_errors: 1
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 7
  command: aider --model deepseek/deepseek-coder
  date: 2024-06-17
  versions: 0.38.1-dev
  seconds_per_case: 21.1
  total_cost: 0.0537

- dirname: 2024-06-21-15-29-08--deepseek-coder2-diff-again3
  test_cases: 133
  model: DeepSeek Coder V2 (diff)
  edit_format: diff
  commit_hash: 515ab3e
  pass_rate_1: 58.6
  pass_rate_2: 66.2
  percent_cases_well_formed: 98.5
  error_outputs: 23
  num_malformed_responses: 5
  num_with_malformed_responses: 2
  user_asks: 2
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 1
  exhausted_context_windows: 0
  test_timeouts: 2
  command: aider --model deepseek/deepseek-coder
  date: 2024-06-21
  versions: 0.39.1-dev
  seconds_per_case: 30.2
  total_cost: 0.0857

- dirname: 2024-07-01-21-41-48--haiku-whole
  test_cases: 133
  model: claude-3-haiku-20240307
  edit_format: whole
  commit_hash: 75f506d
  pass_rate_1: 40.6
  pass_rate_2: 47.4
  percent_cases_well_formed: 100.0
  error_outputs: 6
  num_malformed_responses: 0
  num_with_malformed_responses: 0
  user_asks: 0
  lazy_comments: 0
  syntax_errors: 0
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 2
  command: aider --model claude-3-haiku-20240307
  date: 2024-07-01
  versions: 0.41.1-dev
  seconds_per_case: 7.1
  total_cost: 0.1946
190
aider/website/_data/refactor_leaderboard.yml
Normal file
@@ -0,0 +1,190 @@
- dirname: 2024-05-04-23-27-02--refac-gemini
  test_cases: 89
  model: gemini/gemini-1.5-pro-latest
  edit_format: diff-fenced
  commit_hash: a0649ba-dirty, 425cb29, 1b35ca2-dirty, 3e4fca2-dirty
  pass_rate_1: 49.4
  percent_cases_well_formed: 7.9
  error_outputs: 247
  num_malformed_responses: 82
  user_asks: 0
  lazy_comments: 4
  syntax_errors: 0
  indentation_errors: 8
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --model gemini/gemini-1.5-pro-latest
  date: 2024-05-04
  versions: 0.31.2-dev
  seconds_per_case: 55.7
  total_cost: 0.0000
- dirname: 2024-05-04-17-45-53--refac-opus
  test_cases: 83
  model: claude-3-opus-20240229
  edit_format: diff
  commit_hash: b02320b-dirty
  pass_rate_1: 72.3
  percent_cases_well_formed: 79.5
  error_outputs: 51
  num_malformed_responses: 17
  user_asks: 0
  lazy_comments: 2
  syntax_errors: 1
  indentation_errors: 3
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --opus
  date: 2024-05-04
  versions: 0.31.2-dev
  seconds_per_case: 67.8
  total_cost: 27.9176
- dirname: 2024-04-09-21-49-54--refac-gpt-4-turbo-2024-04-09
  test_cases: 88
  model: gpt-4-turbo-2024-04-09 (udiff)
  edit_format: udiff
  commit_hash: b75fdb9
  pass_rate_1: 34.1
  percent_cases_well_formed: 30.7
  error_outputs: 183
  num_malformed_responses: 61
  user_asks: 0
  lazy_comments: 1
  syntax_errors: 3
  indentation_errors: 15
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --gpt-4-turbo
  date: 2024-04-09
  versions: 0.27.1-dev
  seconds_per_case: 42.4
  total_cost: 19.6556

- dirname: 2024-05-08-22-25-41--may-refac-gpt-4-0125-preview-ex-sys
  test_cases: 89
  model: gpt-4-0125-preview
  edit_format: udiff
  commit_hash: bf09bd3-dirty
  pass_rate_1: 33.7
  percent_cases_well_formed: 47.2
  error_outputs: 142
  num_malformed_responses: 47
  user_asks: 0
  lazy_comments: 1
  syntax_errors: 2
  indentation_errors: 16
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --model gpt-4-0125-preview
  date: 2024-05-08
  versions: 0.33.1-dev
  seconds_per_case: 56.6
  total_cost: 20.3270

- dirname: 2024-05-08-21-24-16--may-refac-gpt-4-1106-preview
  test_cases: 89
  model: gpt-4-1106-preview
  edit_format: udiff
  commit_hash: eaa2514-dirty
  pass_rate_1: 50.6
  percent_cases_well_formed: 39.3
  error_outputs: 164
  num_malformed_responses: 54
  user_asks: 1
  lazy_comments: 17
  syntax_errors: 0
  indentation_errors: 8
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --model gpt-4-1106-preview
  date: 2024-05-08
  versions: 0.33.1-dev
  seconds_per_case: 61.8
  total_cost: 18.3844

- dirname: 2024-05-13-17-42-22--refac-gpt-4o-diff
  test_cases: 89
  model: gpt-4o
  edit_format: diff
  commit_hash: b6cd852
  pass_rate_1: 62.9
  percent_cases_well_formed: 53.9
  error_outputs: 9025
  num_malformed_responses: 41
  user_asks: 0
  lazy_comments: 2
  syntax_errors: 0
  indentation_errors: 5
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider
  date: 2024-05-13
  versions: 0.34.1-dev
  seconds_per_case: 27.8
  total_cost: 0.0000

- dirname: 2024-04-10-13-26-18--refac-gpt-4-turbo-2024-04-09-diff
  test_cases: 88
  model: gpt-4-turbo-2024-04-09 (diff)
  edit_format: diff
  commit_hash: 7875418
  pass_rate_1: 21.4
  percent_cases_well_formed: 6.8
  error_outputs: 247
  num_malformed_responses: 82
  user_asks: 1
  lazy_comments: 2
  syntax_errors: 3
  indentation_errors: 8
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --model gpt-4-turbo-2024-04-09
  date: 2024-04-10
  versions: 0.28.1-dev
  seconds_per_case: 67.8
  total_cost: 20.4889

- dirname: 2024-07-01-18-30-33--refac-claude-3.5-sonnet-diff-not-lazy
  test_cases: 89
  model: claude-3.5-sonnet (diff)
  edit_format: diff
  commit_hash: 7396e38-dirty
  pass_rate_1: 64.0
  percent_cases_well_formed: 76.4
  error_outputs: 176
  num_malformed_responses: 39
  num_with_malformed_responses: 21
  user_asks: 11
  lazy_comments: 2
  syntax_errors: 4
  indentation_errors: 0
  exhausted_context_windows: 0
  test_timeouts: 0
  command: aider --sonnet
  date: 2024-07-01
  versions: 0.40.7-dev
  seconds_per_case: 42.8
  total_cost: 11.5242


- dirname: 2024-07-04-15-06-43--refac-deepseek-coder2-128k
  test_cases: 89
  model: DeepSeek Coder V2 (128k context)
  edit_format: diff
  commit_hash: 08868fd
  pass_rate_1: 38.2
  percent_cases_well_formed: 73.0
  error_outputs: 393
  num_malformed_responses: 89
  num_with_malformed_responses: 24
  user_asks: 4
  lazy_comments: 2
  syntax_errors: 1
  indentation_errors: 5
  exhausted_context_windows: 3
  test_timeouts: 0
  command: aider --model deepseek/deepseek-coder
  date: 2024-07-04
  versions: 0.42.1-dev
  seconds_per_case: 82.9
  total_cost: 0.2601
17
aider/website/_includes/get-started.md
Normal file
@@ -0,0 +1,17 @@
You can get started quickly like this:

```
$ pip install aider-chat

# Change directory into a git repo
$ cd /to/your/git/repo

# Work with Claude 3.5 Sonnet on your repo
$ export ANTHROPIC_API_KEY=your-key-goes-here
$ aider

# Work with GPT-4o on your repo
$ export OPENAI_API_KEY=your-key-goes-here
$ aider
```
20
aider/website/_includes/head_custom.html
Normal file
@@ -0,0 +1,20 @@
{% if page.highlight_image %}
<meta property="og:image" content="{{ site.url }}{{ page.highlight_image }}">
<meta property="twitter:image" content="{{ site.url }}{{ page.highlight_image }}">
{% else %}
<meta property="og:image" content="{{ site.url }}/assets/aider.jpg">
<meta property="twitter:image" content="{{ site.url }}/assets/aider-square.jpg">
{% endif %}
<link rel="alternate" type="application/rss+xml" title="RSS Feed" href="{{ site.url }}/feed.xml">
<link rel="preconnect" href="https://fonts.gstatic.com">
<link rel="preload" href="https://fonts.googleapis.com/css?family=Open+Sans:400,700&display=swap" as="style" type="text/css" crossorigin>
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#157878">
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
<link rel="icon" type="image/png" sizes="32x32" href="{{ '/assets/icons/favicon-32x32.png' | relative_url }}">
<link rel="icon" type="image/png" sizes="16x16" href="{{ '/assets/icons/favicon-16x16.png' | relative_url }}">
<link rel="apple-touch-icon" sizes="180x180" href="{{ '/assets/icons/apple-touch-icon.png' | relative_url }}">
<link rel="manifest" href="{{ '/assets/icons/site.webmanifest' | relative_url }}">
<link rel="mask-icon" href="{{ '/assets/icons/safari-pinned-tab.svg' | relative_url }}" color="#5bbad5">
<meta name="msapplication-TileColor" content="#da532c">
<meta name="theme-color" content="#ffffff">
22
aider/website/_includes/help.md
Normal file
@@ -0,0 +1,22 @@
If you need more help, please check our
[GitHub issues](https://github.com/paul-gauthier/aider/issues)
and file a new issue if your problem isn't discussed.
Or drop into our
[Discord](https://discord.gg/Tv2uQnR88V)
to chat with us.

When reporting problems, it is very helpful if you can provide:

- Aider version
- LLM model you are using

Including the "announcement" lines that
aider prints at startup
is an easy way to share this helpful info.

```
Aider v0.37.1-dev
Models: gpt-4o with diff edit format, weak model gpt-3.5-turbo
Git repo: .git with 243 files
Repo-map: using 1024 tokens
```
62
aider/website/_includes/model-warnings.md
Normal file
@@ -0,0 +1,62 @@
## Unknown context window size and token costs

```
Model foobar: Unknown context window size and costs, using sane defaults.
```

*You can probably ignore the unknown context window size and token costs warning.*

If you specify a model that aider has never heard of, you will get
this warning.
This means aider doesn't know the context window size and token costs
for that model.
Aider will use an unlimited context window and assume the model is free,
so this is not usually a significant problem.

See the docs on
[configuring advanced model settings](/docs/config/adv-model-settings.html)
for details on how to remove this warning.

## Did you mean?

If aider isn't familiar with the model you've specified,
it will suggest similarly named models.
This helps
in the case where you made a typo or mistake when specifying the model name.

```
Model gpt-5o: Unknown context window size and costs, using sane defaults.
Did you mean one of these?
- gpt-4o
```

## Missing environment variables

You need to set the listed environment variables.
Otherwise you will get error messages when you start chatting with the model.

```
Model azure/gpt-4-turbo: Missing these environment variables:
- AZURE_API_BASE
- AZURE_API_VERSION
- AZURE_API_KEY
```

## Unknown which environment variables are required

```
Model gpt-5: Unknown which environment variables are required.
```

Aider is unable to verify the environment because it doesn't know
which variables are required for the model.
If required variables are missing,
you may get errors when you attempt to chat with the model.
You can look in [aider's LLM documentation](/docs/llms.html)
or the
[litellm documentation](https://docs.litellm.ai/docs/providers)
to see if the required variables are listed there.
4
aider/website/_includes/multi-line.md
Normal file
@@ -0,0 +1,4 @@
You can send long, multi-line messages in the chat in a few ways:
- Paste a multi-line message directly into the chat.
- Enter `{` alone on the first line to start a multiline message and `}` alone on the last line to end it.
- Use Meta-ENTER to start a new line without sending the message (Esc+ENTER in some environments).
7
aider/website/_includes/nav_footer_custom.html
Normal file
@@ -0,0 +1,7 @@
<footer class="site-footer">
Aider is AI pair programming in your terminal.
Aider is on
<a href="https://github.com/paul-gauthier/aider">GitHub</a>
and
<a href="https://discord.gg/Tv2uQnR88V">Discord</a>.
</footer>
9
aider/website/_includes/special-keys.md
Normal file
@@ -0,0 +1,9 @@
Aider has special support for providing
OpenAI and Anthropic API keys
via command line switches and yaml config settings.
*All other LLM providers* must
have their keys and settings
specified in environment variables.
This can be done in your shell,
or by using a
[`.env` file](/docs/config/dotenv.html).
10
aider/website/_layouts/redirect.html
Normal file
@@ -0,0 +1,10 @@
<!DOCTYPE html>
<html lang="en-US">
<meta charset="utf-8">
<title>Redirecting…</title>
<link rel="canonical" href="{{ page.redirect.to }}">
<script>location="{{ page.redirect.to }}"</script>
<meta http-equiv="refresh" content="0; url={{ page.redirect.to }}">
<h1>Redirecting…</h1>
<a href="{{ page.redirect.to }}">Click here if you are not redirected.</a>
</html>
1
aider/website/_posts/2023-05-25-ctags.md
Symbolic link
@@ -0,0 +1 @@
../docs/ctags.md
1
aider/website/_posts/2023-07-02-benchmarks.md
Symbolic link
@@ -0,0 +1 @@
../docs/benchmarks.md
268
aider/website/_posts/2023-10-22-repomap.md
Normal file
@@ -0,0 +1,268 @@
---
title: Building a better repository map with tree sitter
excerpt: Tree-sitter allows aider to build a repo map that better summarizes large code bases.
highlight_image: /assets/robot-ast.png
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Building a better repository map with tree sitter

![](/assets/robot-ast.png)

GPT-4 is extremely useful for "self-contained" coding tasks,
like generating or modifying a simple function
that has no dependencies. Tools like GitHub CoPilot serve
these simple coding tasks well.

But making complex changes in a larger, pre-existing codebase
is much more difficult, for both humans and AIs.
To do this successfully, you need to:

1. Find the code that needs to be changed.
2. Understand how that code relates to the rest of the codebase.
3. Make the correct code change to accomplish the task.

GPT-4 is actually great at making the code changes (3),
once you tell it which files need to be changed (1)
and show it how they fit into the rest of the codebase (2).

This article is going to focus on step (2), providing "code context":

- We need to help GPT understand the overall codebase.
- This will help it understand the code it needs to change, which may depend on other parts of the codebase.
- It will also help GPT write new code and modify the existing code in a way
that respects and utilizes existing libraries, modules and abstractions
found elsewhere in the codebase.
- We must convey all of this "code context" to GPT in an
efficient manner that fits within the limited context window.

To address these issues, aider
sends GPT a **concise map of your whole git repository**
that includes
the most important classes and functions along with their types and call signatures.

This **repository map** is now built automatically using
[tree-sitter](https://tree-sitter.github.io/tree-sitter/)
to extract symbol definitions from source files.
Tree-sitter is used by many IDEs, editors and LSP servers to
help humans search and navigate large codebases.
Aider now uses it to help GPT better comprehend, navigate
and edit code in larger repos.

*To code with GPT-4 using the techniques discussed here, just install [aider](https://aider.chat/docs/install.html).*


## The problem: code context

GPT-4 is great at "self contained" coding tasks, like writing or
modifying a pure function with no external dependencies.
GPT can easily handle requests like "write a
Fibonacci function" or "rewrite this loop using list
comprehensions", because they require no context beyond the code
being discussed.

Most real code is not pure and self-contained; it is intertwined with
and depends on code from many different files in a repo.
If you ask GPT to "switch all the print statements in class Foo to
use the BarLog logging system", it needs to see and
modify the code in the Foo class, but it also needs to understand
how to use
the project's BarLog
subsystem.

A simple solution is to **send the entire codebase** to GPT along with
each change request. Now GPT has all the context! But this won't work
for even moderately
sized repos, because they won't fit into the context window.

A better approach is to be selective,
and **hand pick which files to send**.
For the example above, you could send the file that
contains the Foo class
and the file that contains the BarLog logging subsystem.
This works pretty well, and is supported by aider -- you
can manually specify which files to "add to the chat" you are having with GPT.

But sending whole files is a bulky way to send code context,
wasting the precious context window.
GPT doesn't need to see the entire implementation of BarLog,
it just needs to understand it well enough to use it.
You may quickly run out of context window by sending
full files of code
just to convey context.

Aider also strives to reduce the manual work involved in
coding with AI.
So in an ideal world, we'd like aider to automatically
identify and provide the needed code context.

## Using a repo map to provide context

Aider sends a **repo map** to GPT along with
each request from the user to make a code change.
The map contains a list of the files in the
repo, along with the key symbols which are defined in each file.
It shows how each of these symbols is defined in the
source code, by including the critical lines of code for each definition.

Here's a
sample of the map of the aider repo, just showing the maps of
[base_coder.py](https://github.com/paul-gauthier/aider/blob/main/aider/coders/base_coder.py)
and
[commands.py](https://github.com/paul-gauthier/aider/blob/main/aider/commands.py)
:

```
aider/coders/base_coder.py:
⋮...
│class Coder:
│    abs_fnames = None
⋮...
│    @classmethod
│    def create(
│        self,
│        main_model,
│        edit_format,
│        io,
│        skip_model_availabily_check=False,
│        **kwargs,
⋮...
│    def abs_root_path(self, path):
⋮...
│    def run(self, with_message=None):
⋮...

aider/commands.py:
⋮...
│class Commands:
│    voice = None
│
⋮...
│    def get_commands(self):
⋮...
│    def get_command_completions(self, cmd_name, partial):
⋮...
│    def run(self, inp):
⋮...
```

Mapping out the repo like this provides some key benefits:

- GPT can see classes, methods and function signatures from everywhere in the repo. This alone may give it enough context to solve many tasks. For example, it can probably figure out how to use the API exported from a module just based on the details shown in the map.
- If it needs to see more code, GPT can use the map to figure out by itself which files it needs to look at in more detail. GPT will then ask to see these specific files, and aider will automatically add them to the chat context.

## Optimizing the map

Of course, for large repositories even just the repo map might be too large
for GPT's context window.
Aider solves this problem by sending just the **most relevant**
portions of the repo map.
It does this by analyzing the full repo map using
a graph ranking algorithm, computed on a graph
where each source file is a node and edges connect
files which have dependencies.
Aider optimizes the repo map by
selecting the most important parts of the codebase
which will
fit into the token budget assigned by the user
(via the `--map-tokens` switch, which defaults to 1k tokens).
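
As a loose illustration of the ranking idea (not aider's actual code), a ranking algorithm like PageRank can be run over a file-dependency graph with networkx, then files taken in rank order until the token budget is spent. The dependency edges and per-file token counts below are made-up examples:

```python
import networkx as nx

# Hypothetical dependency graph: an edge A -> B means file A references file B.
deps = [
    ("aider/commands.py", "aider/coders/base_coder.py"),
    ("aider/main.py", "aider/commands.py"),
    ("aider/main.py", "aider/coders/base_coder.py"),
]
graph = nx.DiGraph(deps)

# Rank files by how often they are referenced, directly and indirectly.
ranks = nx.pagerank(graph)

# Greedily keep the highest-ranked files that fit the token budget.
budget = 1024
map_tokens = {fname: 300 for fname in graph.nodes}  # stand-in map sizes
selected = []
for fname in sorted(ranks, key=ranks.get, reverse=True):
    if map_tokens[fname] <= budget:
        selected.append(fname)
        budget -= map_tokens[fname]

print(selected)
```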

The sample map shown above doesn't contain *every* class, method and function from those
files.
It only includes the most important identifiers,
the ones which are most often referenced by other portions of the code.
These are the key pieces of context that GPT needs to know to understand
the overall codebase.


## Using tree-sitter to make the map

Under the hood, aider uses
[tree sitter](https://tree-sitter.github.io/tree-sitter/)
to build the
map.
It specifically uses the
[py-tree-sitter-languages](https://github.com/grantjenks/py-tree-sitter-languages)
python module,
which provides simple, pip-installable binary wheels for
[most popular programming languages](https://github.com/paul-gauthier/grep-ast/blob/main/grep_ast/parsers.py).

Tree-sitter parses source code into an Abstract Syntax Tree (AST) based
on the syntax of the programming language.
Using the AST, we can identify where functions, classes, variables, types and
other definitions occur in the source code.
We can also identify where else in the code these things are used or referenced.

Aider uses all of these definitions and references to
determine which are the most important identifiers in the repository,
and to produce the repo map that shows just those key
lines from the codebase.
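
As a rough, self-contained sketch of that idea (not aider's actual implementation), here is how the py-tree-sitter-languages module can be used to list the definitions in a Python file; the `app.py` filename is just an example:

```python
from tree_sitter_languages import get_parser

# Parse an example source file into a tree-sitter AST.
source = open("app.py", "rb").read()
tree = get_parser("python").parse(source)

def walk_defs(node):
    # Yield (line, name) for every function and class definition in the tree.
    if node.type in ("function_definition", "class_definition"):
        name = node.child_by_field_name("name")
        yield node.start_point[0] + 1, source[name.start_byte:name.end_byte].decode()
    for child in node.children:
        yield from walk_defs(child)

for line, name in walk_defs(tree.root_node):
    print(f"{line}: {name}")
```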

## What about ctags?

The tree-sitter repository map replaces the
[ctags based map](https://aider.chat/docs/ctags.html)
that aider originally used.
Switching from ctags to tree-sitter provides a bunch of benefits:

- The map is richer, showing full function call signatures and other details straight from the source files.
- Thanks to `py-tree-sitter-languages`, we get full support for many programming languages via a python package that's automatically installed as part of the normal `pip install aider-chat`.
- We remove the requirement for users to manually install `universal-ctags` via some external tool or package manager (brew, apt, choco, etc).
- Tree-sitter integration is a key enabler for future work and capabilities for aider.

## Future work

You'll recall that we identified the 3 key steps
required to use GPT
to complete a coding task within a large, pre-existing codebase:

1. Find the code that needs to be changed.
2. Understand how that code relates to the rest of the codebase.
3. Make the correct code change to accomplish the task.

We're now using tree-sitter to help solve the code context problem (2),
but it's also an important foundation
for future work on automatically finding all the code which
will need to be changed (1).

Right now, aider relies on the user to specify which source files
will need to be modified to complete their request.
Users manually "add files to the chat" using aider's `/add` command,
which makes those files available for GPT to modify.

This works well, but a key piece of future work is to harness the
power of GPT and tree-sitter to automatically identify
which parts of the code will need changes.

## Try it out

To code with GPT-4 using the techniques discussed here,
just install [aider](https://aider.chat/docs/install.html).

## Credits

Aider uses
[modified versions of the tags.scm files](https://github.com/paul-gauthier/aider/tree/main/aider/queries)
from these
open source tree-sitter language implementations:

* [https://github.com/tree-sitter/tree-sitter-c](https://github.com/tree-sitter/tree-sitter-c) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-c-sharp](https://github.com/tree-sitter/tree-sitter-c-sharp) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-cpp](https://github.com/tree-sitter/tree-sitter-cpp) — licensed under the MIT License.
* [https://github.com/Wilfred/tree-sitter-elisp](https://github.com/Wilfred/tree-sitter-elisp) — licensed under the MIT License.
* [https://github.com/elixir-lang/tree-sitter-elixir](https://github.com/elixir-lang/tree-sitter-elixir) — licensed under the Apache License, Version 2.0.
* [https://github.com/elm-tooling/tree-sitter-elm](https://github.com/elm-tooling/tree-sitter-elm) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-go](https://github.com/tree-sitter/tree-sitter-go) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-java](https://github.com/tree-sitter/tree-sitter-java) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-javascript](https://github.com/tree-sitter/tree-sitter-javascript) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-ocaml](https://github.com/tree-sitter/tree-sitter-ocaml) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-php](https://github.com/tree-sitter/tree-sitter-php) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-python](https://github.com/tree-sitter/tree-sitter-python) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-ql](https://github.com/tree-sitter/tree-sitter-ql) — licensed under the MIT License.
* [https://github.com/r-lib/tree-sitter-r](https://github.com/r-lib/tree-sitter-r) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-ruby](https://github.com/tree-sitter/tree-sitter-ruby) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-rust](https://github.com/tree-sitter/tree-sitter-rust) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-typescript](https://github.com/tree-sitter/tree-sitter-typescript) — licensed under the MIT License.
1
aider/website/_posts/2023-11-06-benchmarks-1106.md
Symbolic link
@@ -0,0 +1 @@
../docs/benchmarks-1106.md
1
aider/website/_posts/2023-11-06-benchmarks-speed-1106.md
Symbolic link
@@ -0,0 +1 @@
../docs/benchmarks-speed-1106.md
1
aider/website/_posts/2023-12-21-unified-diffs.md
Symbolic link
@@ -0,0 +1 @@
../docs/unified-diffs.md
1
aider/website/_posts/2024-01-25-benchmarks-0125.md
Symbolic link
@@ -0,0 +1 @@
../docs/benchmarks-0125.md
93
aider/website/_posts/2024-03-08-claude-3.md
Normal file
@@ -0,0 +1,93 @@
---
title: Claude 3 beats GPT-4 on Aider's code editing benchmark
excerpt: Claude 3 Opus outperforms all of OpenAI's models on Aider's code editing benchmark, making it the best available model for pair programming with AI.
highlight_image: /assets/2024-03-07-claude-3.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Claude 3 beats GPT-4 on Aider's code editing benchmark

[![benchmark results](/assets/2024-03-07-claude-3.svg)](https://aider.chat/assets/2024-03-07-claude-3.svg)

[Anthropic just released their new Claude 3 models](https://www.anthropic.com/news/claude-3-family)
with evals showing better performance on coding tasks.
With that in mind, I've been benchmarking the new models
using Aider's code editing benchmark suite.

Claude 3 Opus outperforms all of OpenAI's models,
making it the best available model for pair programming with AI.

To use Claude 3 Opus with aider:

```
pip install aider-chat
export ANTHROPIC_API_KEY=sk-...
aider --opus
```

## Aider's code editing benchmark

[Aider](https://github.com/paul-gauthier/aider)
is an open source command line chat tool that lets you
pair program with AI on code in your local git repo.

Aider relies on a
[code editing benchmark](https://aider.chat/docs/benchmarks.html)
to quantitatively evaluate how well
an LLM can make changes to existing code.
The benchmark uses aider to try and complete
[133 Exercism Python coding exercises](https://github.com/exercism/python).
For each exercise,
Exercism provides a starting python file with stubs for the needed functions,
a natural language description of the problem to solve
and a test suite to evaluate whether the coder has correctly solved the problem.

The LLM gets two tries to solve each problem:

1. On the first try, it gets the initial stub code and the English description of the coding task. If the tests all pass, we are done.
2. If any tests failed, aider sends the LLM the failing test output and gives it a second try to complete the task (this two-try flow is sketched below).
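
For concreteness, the two-try flow can be sketched in Python; `run_aider` and `run_tests` here are hypothetical stand-ins for the real benchmark harness, not its actual API:

```python
def benchmark_exercise(stub_code, instructions, run_aider, run_tests):
    # First try: the LLM sees only the stub code and the task description.
    solution = run_aider(stub_code, instructions)
    failures = run_tests(solution)
    if not failures:
        return solution, 1  # solved on the first try

    # Second try: share the failing test output and let the LLM retry.
    solution = run_aider(solution, instructions, test_output=failures)
    failures = run_tests(solution)
    if not failures:
        return solution, 2  # solved on the second try
    return solution, None   # unsolved after two tries
```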

## Benchmark results

### Claude 3 Opus

- The new `claude-3-opus-20240229` model got the highest score ever on this benchmark, completing 68.4% of the tasks with two tries.
- Its single-try performance was comparable to the latest GPT-4 Turbo model `gpt-4-0125-preview`, at 54.1%.
- While Opus got the highest score, it was only a few points higher than the GPT-4 Turbo results. Given the extra costs of Opus and the slower response times, it remains to be seen which is the most practical model for daily coding use.

### Claude 3 Sonnet

- The new `claude-3-sonnet-20240229` model performed similarly to OpenAI's GPT-3.5 Turbo models with an overall score of 54.9% and a first-try score of 43.6%.

## Code editing

It's highly desirable to have the LLM send back code edits as
some form of diffs, rather than having it send back an updated copy of the
entire source code.

Weaker models like GPT-3.5 are unable to use diffs, and are stuck sending back
updated copies of entire source files.
Aider uses more efficient
[search/replace blocks](https://aider.chat/2023/07/02/benchmarks.html#diff)
with the original GPT-4
and
[unified diffs](https://aider.chat/2023/12/21/unified-diffs.html#unified-diff-editing-format)
with the newer GPT-4 Turbo models.

Claude 3 Opus works best with the search/replace blocks, allowing it to send back
code changes efficiently.
Unfortunately, the Sonnet model was only able to work reliably with whole files,
which limits it to editing smaller source files and uses more tokens, money and time.
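
To make the search/replace idea concrete, here is a simplified sketch of how one such edit block can be applied to a file's text; aider's real implementation handles fuzzy matching and many edge cases this sketch ignores:

```python
def apply_search_replace(file_text, search_block, replace_block):
    # Apply one search/replace edit: the search text must appear
    # verbatim exactly once, and is swapped for the replacement.
    count = file_text.count(search_block)
    if count != 1:
        raise ValueError(f"search text found {count} times, expected exactly 1")
    return file_text.replace(search_block, replace_block, 1)

# Example: rename a function in a small source snippet.
original = "def add(a, b):\n    return a + b\n"
updated = apply_search_replace(original, "def add(a, b):", "def add_numbers(a, b):")
```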

## Other observations

There are a few other things worth noting:

- Claude 3 Opus and Sonnet are both slower and more expensive than OpenAI's models. You can get almost the same coding skill faster and cheaper with OpenAI's models.
- Claude 3 has a 2X larger context window than the latest GPT-4 Turbo, which may be an advantage when working with larger code bases.
- The Claude models refused to perform a number of coding tasks and returned the error "Output blocked by content filtering policy". They refused to code up the [beer song](https://exercism.org/tracks/python/exercises/beer-song) program, which makes some sort of superficial sense. But they also refused to work in some larger open source code bases, for unclear reasons.
- The Claude APIs seem somewhat unstable, returning HTTP 5xx errors of various sorts. Aider automatically recovers from these errors with exponential backoff retries, but it's a sign that Anthropic may be struggling under surging demand.

74
aider/website/_posts/2024-04-09-gpt-4-turbo.md
Normal file
@@ -0,0 +1,74 @@
---
title: GPT-4 Turbo with Vision is a step backwards for coding
excerpt: OpenAI's GPT-4 Turbo with Vision model scores worse on aider's code editing benchmarks than all the previous GPT-4 models. In particular, it seems much more prone to "lazy coding" than the existing GPT-4 Turbo "preview" models.
highlight_image: /assets/2024-04-09-gpt-4-turbo-laziness.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# GPT-4 Turbo with Vision is a step backwards for coding

[OpenAI just released GPT-4 Turbo with Vision](https://twitter.com/OpenAIDevs/status/1777769463258988634)
and it performs worse on aider's coding benchmark suites than all the previous GPT-4 models.
In particular, it seems much more prone to "lazy coding" than the
existing GPT-4 Turbo "preview" models.

## Code editing skill

[![benchmark results](/assets/2024-04-09-gpt-4-turbo.svg)](https://aider.chat/assets/2024-04-09-gpt-4-turbo.svg)

Aider relies on a
[code editing benchmark](https://aider.chat/docs/benchmarks.html#the-benchmark)
to quantitatively evaluate how well
an LLM can make changes to existing code.
The benchmark uses aider to try and complete
[133 Exercism Python coding exercises](https://github.com/exercism/python).

For each exercise, the LLM gets two tries to solve the problem:

1. On the first try, it gets initial stub code and the English description of the coding task. If the tests all pass, we are done.
2. If any tests failed, aider sends the LLM the failing test output and gives it a second try to complete the task.

**GPT-4 Turbo with Vision
scores only 62% on this benchmark,
the lowest score of any of the existing GPT-4 models.**
The other models scored 63-66%, so this represents only a small
regression, and is likely statistically insignificant when compared
against `gpt-4-0613`.

## Lazy coding

[![benchmark results](/assets/2024-04-09-gpt-4-turbo-laziness.svg)](https://aider.chat/assets/2024-04-09-gpt-4-turbo-laziness.svg)

The GPT-4 Turbo "preview" models have been widely criticized for being "lazy"
when coding.
They often omit needed code
and instead leave comments with homework assignments like "implement method here".

```
def some_complex_method(foo, bar):
    # ... implement method here ...
```

Aider uses a ["laziness" benchmark suite](https://github.com/paul-gauthier/refactor-benchmark)
which is designed to both provoke and quantify lazy coding.
It consists of
89 python refactoring tasks
which tend to make GPT-4 Turbo code in that lazy manner.

**The new GPT-4 Turbo with Vision model scores only 34% on aider's
refactoring benchmark, making it the laziest coder of all the GPT-4 Turbo models
by a significant margin.**

# Conclusions

Aider has full support for the new GPT-4 Turbo with Vision
model, which you can access using the switch `--model gpt-4-turbo-2024-04-09`.
But aider will continue to use `gpt-4-1106-preview` by default,
as it is by far the strongest coder of the GPT-4 models.

55
aider/website/_posts/2024-05-02-browser.md
Normal file
@@ -0,0 +1,55 @@
---
title: Aider in your browser
excerpt: Aider has an experimental browser UI, allowing you to collaborate with LLMs on code in your local git repo.
highlight_image: /assets/browser.jpg
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Aider in your browser

<div class="video-container">
  <video controls loop poster="/assets/browser.jpg">
    <source src="/assets/aider-browser-social.mp4" type="video/mp4">
    <a href="/assets/aider-browser-social.mp4">Aider browser UI demo video</a>
  </video>
</div>

<style>
.video-container {
  position: relative;
  padding-bottom: 101.89%; /* 1080 / 1060 = 1.0189 */
  height: 0;
  overflow: hidden;
}

.video-container video {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
}
</style>

Use aider's new experimental browser UI to collaborate with LLMs
to edit code in your local git repo.
Aider will directly edit the code in your local source files,
and [git commit the changes](https://aider.chat/docs/git.html)
with sensible commit messages.
You can start a new project or work with an existing git repo.
Aider works well with GPT 3.5, GPT-4, GPT-4 Turbo with Vision,
and Claude 3 Opus.
It also supports [connecting to almost any LLM](https://aider.chat/docs/llms.html).

Use the `--browser` switch to launch the browser version of aider:

```
pip install aider-chat

export OPENAI_API_KEY=<key> # Mac/Linux
setx   OPENAI_API_KEY <key> # Windows

aider --browser
```
327
aider/website/_posts/2024-05-13-models-over-time.md
Normal file
@@ -0,0 +1,327 @@
---
title: Drawing graphs with aider, GPT-4o and matplotlib
excerpt: Use GPT-4o to draw graphs with matplotlib, including adjusting styles and making visual changes. You get the graph, but you also get the code in your repo.
highlight_image: /assets/models-over-time.png
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

![](/assets/models-over-time.svg)

# Drawing graphs with aider, GPT-4o and matplotlib

I recently wanted to draw a graph showing how LLM code editing skill has been
changing over time as new models have been released by OpenAI, Anthropic and others.
I have all the
[data in a yaml file](https://github.com/paul-gauthier/aider/blob/main/website/_data/edit_leaderboard.yml) that is used to render
[aider's LLM leaderboards](https://aider.chat/docs/leaderboards/).

Below is the aider chat transcript, which shows:

- I launch aider with the yaml file, a file with other plots I've done recently (so GPT can crib the style) and an empty file called `over_time.py`.
- Then I ask GPT to draw the scatterplot I want.
- I run the resulting script and share the error output with GPT so it can fix a small bug.
- I ask it to color the points for GPT-4 and GPT-3.5 family models differently, to better see trends within those model families.
- I add a PNG of the graph to the chat, so GPT can see the current graph and understand why I want it to move the point labels so they don't overlap the points.
- I work through a series of other small style changes, like changing fonts and the graph border.

In the end I have the graph, but I also have the python code in my repo.
So I can update this graph easily whenever I add new entries to the yaml data file.


## Aider chat transcript

<div class="chat-transcript" markdown="1">

> Aider v0.35.1-dev
> Models: gpt-4o with diff edit format, weak model gpt-3.5-turbo
> Added _data/edit_leaderboard.yml to the chat.
> Added benchmark/over_time.py to the chat.
> Added benchmark/plots.py to the chat.

#### in over_time.py, draw a graph for each of the yaml entries with a `released` date. plot date on x, `pass_rate_2` on y and label with the `model` name. do a scatter plot. use colors/styles similar to the other graphs.

To draw a scatter plot for each of the YAML entries with a `released` date, we need to:

1. Read the YAML file.
2. Extract the `released` date and `pass_rate_2` for each entry.
3. Plot the data using matplotlib.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
=======
import matplotlib.pyplot as plt
import yaml
from datetime import datetime

def plot_over_time(yaml_file):
    with open(yaml_file, 'r') as file:
        data = yaml.safe_load(file)

    dates = []
    pass_rates = []
    models = []

    for entry in data:
        if 'released' in entry and 'pass_rate_2' in entry:
            dates.append(datetime.strptime(entry['released'], '%Y-%m-%d'))
            pass_rates.append(entry['pass_rate_2'])
            models.append(entry['model'])

    plt.figure(figsize=(10, 6))
    plt.scatter(dates, pass_rates, c='blue', alpha=0.5)

    for i, model in enumerate(models):
        plt.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75)

    plt.xlabel('Release Date')
    plt.ylabel('Pass Rate 2')
    plt.title('Model Performance Over Time')
    plt.grid(True)
    plt.tight_layout()
    plt.show()

# Example usage
plot_over_time('_data/edit_leaderboard.yml')
>>>>>>> REPLACE
```

> Applied edit to benchmark/over_time.py
> Commit 49ac9ca aider: Added scatter plot functionality for YAML entries with a `released` date and `pass_rate_2`.

</div>
<div class="chat-transcript" markdown="1">

#### /run python benchmark/over_time.py
> Traceback (most recent call last):
  File "/Users/gauthier/Projects/aider/benchmark/over_time.py", line 33, in <module>
    plot_over_time('_data/edit_leaderboard.yml')
  File "/Users/gauthier/Projects/aider/benchmark/over_time.py", line 15, in plot_over_time
    dates.append(datetime.strptime(entry['released'], '%Y-%m-%d'))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: strptime() argument 1 must be str, not datetime.date
> Add the output to the chat? y
> Traceback (most recent call last):
>   File "/Users/gauthier/Projects/aider/benchmark/over_time.py", line 33, in <module>
>     plot_over_time('_data/edit_leaderboard.yml')
>   File "/Users/gauthier/Projects/aider/benchmark/over_time.py", line 15, in plot_over_time
>     dates.append(datetime.strptime(entry['released'], '%Y-%m-%d'))
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> TypeError: strptime() argument 1 must be str, not datetime.date

The error indicates that the `released` field in the YAML file is already a `datetime.date` object, so we don't need to parse it with `datetime.strptime`. We can directly use it in the plot.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
            dates.append(datetime.strptime(entry['released'], '%Y-%m-%d'))
=======
            dates.append(entry['released'])
>>>>>>> REPLACE
```

> Applied edit to benchmark/over_time.py
> Commit af2f4b7 aider: Use the `released` field directly in the plot instead of parsing it with `datetime.strptime`.

</div>
<div class="chat-transcript" markdown="1">

#### color the "gpt-4" models differently

To color the "gpt-4" models differently, we need to modify the plotting code in `benchmark/over_time.py` to check for "gpt-4" in the model name and assign a different color.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.scatter(dates, pass_rates, c='blue', alpha=0.5)
=======
    colors = ['red' if 'gpt-4' in model else 'blue' for model in models]
    plt.scatter(dates, pass_rates, c=colors, alpha=0.5)
>>>>>>> REPLACE
```

This change will color the points red if the model name contains "gpt-4", otherwise, it will color them blue.

> Applied edit to benchmark/over_time.py
> Commit 0d11357 aider: Color the "gpt-4" models differently.

</div>
<div class="chat-transcript" markdown="1">

#### color the gpt-3.5 models a new color

To color the "gpt-3.5" models a new color, we need to modify the plotting code in `benchmark/over_time.py` to check for "gpt-3.5" in the model name and assign a different color.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    colors = ['red' if 'gpt-4' in model else 'blue' for model in models]
=======
    colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models]
>>>>>>> REPLACE
```

> Applied edit to benchmark/over_time.py
> Commit 85d9d84 aider: Color the "gpt-3.5" models green in the plotting code.

</div>
<div class="chat-transcript" markdown="1">

#### use the colorscheme and fonts from the other graphs in plots.py

To use the colorscheme and fonts from the graphs in `plots.py`, we need to:

1. Import the necessary settings from `plots.py`.
2. Apply these settings to the graph in `over_time.py`.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
import yaml
from datetime import datetime
=======
import yaml
from datetime import datetime
from matplotlib import rc
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.figure(figsize=(10, 6))
=======
    plt.rcParams["hatch.linewidth"] = 0.5
    plt.rcParams["hatch.color"] = "#444444"

    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10})

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.grid(axis="y", zorder=0, lw=0.2)
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.scatter(dates, pass_rates, c=colors, alpha=0.5)
=======
    ax.scatter(dates, pass_rates, c=colors, alpha=0.5)
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
        plt.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75)

    plt.xlabel('Release Date')
    plt.ylabel('Pass Rate 2')
    plt.title('Model Performance Over Time')
    plt.grid(True)
    plt.tight_layout()
    plt.show()
=======
        ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75)

    ax.set_xlabel('Release Date')
    ax.set_ylabel('Pass Rate 2')
    ax.set_title('Model Performance Over Time')
    plt.tight_layout()
    plt.savefig("tmp_over_time.svg")
    imgcat(fig)
>>>>>>> REPLACE
```

> Applied edit to benchmark/over_time.py
> Commit 8d670f1 aider: Used the colorscheme and fonts from the graphs in `plots.py` in `over_time.py`.


</div>
<div class="chat-transcript" markdown="1">

#### /add tmp_over_time.png
> Added /Users/gauthier/Projects/aider/tmp_over_time.png to the chat

#### make the labels not overlap the dots

To make the labels not overlap the dots, we can adjust the position of the labels slightly using the `xytext` parameter in the `annotate` function.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
        ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75)
=======
        ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75,
                    xytext=(5, 5), textcoords='offset points')
>>>>>>> REPLACE
```

> Committing benchmark/over_time.py before applying edits.
> Commit 49eb72c Updated image file format from SVG to PNG and added imgcat library for displaying the plot.
> Applied edit to benchmark/over_time.py
> Commit 75fa061 aider: Adjust the position of labels to prevent overlap with dots.


</div>
<div class="chat-transcript" markdown="1">

#### make the dots bigger

To make the dots bigger, we can adjust the `s` parameter in the `scatter` function, which controls the size of the dots.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    ax.scatter(dates, pass_rates, c=colors, alpha=0.5)
=======
    ax.scatter(dates, pass_rates, c=colors, alpha=0.5, s=100)
>>>>>>> REPLACE
```

> Applied edit to benchmark/over_time.py
> Commit a018fdf aider: Make the dots bigger.

</div>
<div class="chat-transcript" markdown="1">

#### make the border of the graph thinner and lighter

To make the border of the graph thinner and lighter, we can adjust the `spines` properties of the axes.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models]
=======
    for spine in ax.spines.values():
        spine.set_edgecolor('#DDDDDD')
        spine.set_linewidth(0.5)

    colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models]
>>>>>>> REPLACE
```

> Applied edit to benchmark/over_time.py
> Commit 77ba518 aider: Made the border of the graph thinner and lighter by adjusting the `spines` properties of the axes.

</div>
14
aider/website/_posts/2024-05-22-draft.md
Normal file
@@ -0,0 +1,14 @@
---
title: A draft post.
excerpt: With a draft summary.
highlight_image: /assets/linting.jpg
draft: true
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# A draft post

Content TBD.
149
aider/website/_posts/2024-05-22-linting.md
Normal file
@@ -0,0 +1,149 @@
---
title: Linting code for LLMs with tree-sitter
excerpt: Aider now lints code after every LLM edit and automatically fixes errors, using tree-sitter and AST-aware code context.
highlight_image: /assets/linting.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

[![linting](/assets/linting.jpg)](https://aider.chat/assets/linting.jpg)

# Linting code for LLMs with tree-sitter

Aider now lints your code after every LLM edit, and offers to automatically fix
any linting errors.
You can also use aider's lint-and-fix functionality on your source files any time
you like, to speedily resolve issues with code written by humans.

Aider shows linting errors to the LLM in a novel format,
using tree-sitter
to help display relevant code context for each
error.
This increases the ability of the LLM to understand the problem and
make the correct changes to resolve it.

Aider ships with basic linters built with tree-sitter that support
[most popular programming languages](https://github.com/paul-gauthier/grep-ast/blob/main/grep_ast/parsers.py).
These built-in linters will detect syntax errors and other fatal problems with the code.

You can also configure aider to use your preferred linters.
This allows aider to check for a larger class of problems, keep the code style
aligned with the rest of your team, etc.

## Linting and fixing your code

Aider now lints each source file after it applies the edits
suggested by an LLM.
If problems are found, aider will ask if you'd like it to
attempt to fix the errors.
If so, aider will send the LLM a report of the lint errors
and request changes to fix them. This process may iterate a few times
as the LLM works to fully resolve all the issues.

You can also lint and fix files any time, on demand from within the aider chat or via the
command line:

- The in-chat `/lint` command will lint and fix all the files which have
been added to the chat by default. Or you can name any files
in your git repo as arguments.
- From the command line, you can run `aider --lint` to lint and fix
all the dirty files in the repo.
Or you can specify specific filenames on the command line.


## An LLM-friendly lint report

Most linting tools produce terse and cryptic output,
which is one reason many engineers appreciate IDEs that highlight
linting errors.
LLMs don't have the luxury of using an IDE, so aider sends
the linting errors in an LLM friendly format.

Here's an example of raw output of the `flake8` python linter:

```
app.py:23:36: F821 undefined name 'num'
app.py:41:16: F541 f-string is missing placeholders
```

This sort of output depends on the user to reference line numbers to find and fix
each reported error.
LLMs are quite bad at working with source code line numbers, often
making off-by-one errors and other mistakes even when provided with
a fully numbered code listing.

Aider augments the raw linter by
displaying and
highlighting the lines that have errors within their
containing functions, methods and classes.
To do this, aider uses tree-sitter to obtain the code's AST and analyzes it
in light of the linting errors.
LLMs are more effective at editing code that's provided
with context like this.

```
app.py:23:36: F821 undefined name 'num'
app.py:41:16: F541 f-string is missing placeholders

app.py:
...⋮...
  6│class LongNum:
  7│    def __init__(self, num):
  8│        """
  9│        Initialize the number.
 10│        """
...⋮...
 19│    def __str__(self):
 20│        """
 21│        Render the number as a string.
 22│        """
 23█        return str(num)
 24│
 25│
 26│@app.route('/subtract/<int:x>/<int:y>')
...⋮...
 38│@app.route('/divide/<int:x>/<int:y>')
 39│def divide(x, y):
 40│    if y == 0:
 41█        return f"Error: Cannot divide by zero"
 42│    else:
 43│        result = x / y
 44│        return str(result)
 45│
...⋮...
```

## Basic linters for most popular languages

Aider comes batteries-included with built-in linters for
[most popular programming languages](https://github.com/paul-gauthier/grep-ast/blob/main/grep_ast/parsers.py).
This provides wide support for linting without requiring
users to manually install a linter and configure it to work with aider.

Aider's built-in language-agnostic linter uses tree-sitter to parse
the AST of each file.
When tree-sitter encounters a syntax error or other fatal issue
parsing a source file, it inserts an AST node with type `ERROR`.
Aider simply uses these `ERROR` nodes to identify all the lines
with syntax or other types of fatal error, and displays
them in the LLM friendly format described above.
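
A minimal sketch of this `ERROR`-node detection, using the `tree_sitter_languages` package, might look like the following; it is illustrative only, and aider's built-in linter does more:

```python
from tree_sitter_languages import get_parser

def syntax_error_lines(source_bytes, language="python"):
    # Parse the source and collect the 1-based line numbers of ERROR nodes.
    tree = get_parser(language).parse(source_bytes)
    lines = set()

    def walk(node):
        if node.type == "ERROR" or node.is_missing:
            lines.add(node.start_point[0] + 1)
        for child in node.children:
            walk(child)

    walk(tree.root_node)
    return sorted(lines)

print(syntax_error_lines(b"def broken(:\n    pass\n"))
```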

## Configuring your preferred linters

You can optionally configure aider to use
your preferred linters with the `--lint-cmd` switch.

```
# To lint javascript with jslint
aider --lint-cmd javascript:jslint

# To lint python with flake8 using some specific args:
aider --lint-cmd "python:flake8 --select=E9,F821,F823..."
```

You can provide multiple `--lint-cmd` switches
to set linters for various languages.
You can also durably set linters in your `.aider.conf.yml` file.

454
aider/website/_posts/2024-05-22-swe-bench-lite.md
Normal file
@@ -0,0 +1,454 @@
---
title: How aider scored SOTA 26.3% on SWE Bench Lite
excerpt: Aider achieved this result mainly through its existing features that focus on static code analysis, reliable LLM code editing, and pragmatic UX for AI pair programming.
highlight_image: /assets/swe_bench_lite.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# How aider scored SOTA 26.3% on SWE Bench Lite

[Aider scored 26.3%](https://github.com/swe-bench/experiments/pull/7)
on the
[SWE Bench Lite benchmark](https://www.swebench.com),
achieving a state-of-the-art result.
The previous top leaderboard entry was 20.3%
from Amazon Q Developer Agent.

See also [aider's SOTA result on the main SWE Bench](https://aider.chat/2024/06/02/main-swe-bench.html).

[![SWE Bench Lite results](/assets/swe_bench_lite.svg)](https://aider.chat/assets/swe_bench_lite.svg)

**All of aider's results reported here are pass@1 results,
obtained without using the SWE Bench `hints_text`.**
All results in the above chart are unhinted pass@1 results.
Please see the [references](#references)
for details on the data presented in this chart.
It was corrected on 5/30/24 to reflect apples-to-apples comparisons,
using pass@1 results from AutoCodeRover
and results from OpenDevin that don't use hints.
The [official SWE Bench Lite leaderboard](https://www.swebench.com)
only accepts pass@1 results that do not use hints.

## Interactive, not agentic

Aider achieved this result mainly through its existing features that focus on static code analysis, reliable LLM code editing, and pragmatic UX for AI pair programming.
Aider intentionally has quite limited and narrow "agentic behavior"
to avoid long delays, high token costs
and the need for users to repeatedly code review incorrect solutions.
It's also worth noting that aider currently does not use
RAG, vector search, tools or give the LLM access to search the web
or unilaterally execute code.

Aider is first and foremost an interactive tool for engineers to get real work done in
real code bases using a chat interface.
Aider provides a pair programming UX where users can ask for a change
and see the edits performed in real-time.
Aider can also offer additional help like fixing lint or test errors,
but the user is always in full interactive control.
This lets them quickly steer misunderstandings back on course and
avoid wasting time and token costs.


## Benchmark methodology

For the benchmark,
aider was launched in each problem's git repository
with the problem statement
submitted as the opening chat message from "the user."
After that aider runs as normal, with the following modifications:

- Aider's suggestions were always accepted without user approval.
- A simple harness was used to retry the SWE Bench problem if aider produced code that wasn't *plausibly correct*.
Plausibly correct means that aider reported that it had successfully edited the repo
without causing syntax errors or breaking any *pre-existing* tests.
- If the solution isn't plausible, the harness launches aider to try again from scratch,
alternating between using aider with GPT-4o and Opus.
- If no plausible solution is found after six tries, the harness picks the solution
with the fewest edit/lint/test problems (this retry loop is sketched below).
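
For concreteness, the retry strategy can be sketched as the loop below; `attempt_solution` and its result object are hypothetical stand-ins, and the real harness lives in the GitHub repo linked later in this post:

```python
def solve_with_retries(problem, attempt_solution, max_tries=6):
    # Alternate between models, keeping the first plausible solution,
    # or the least-broken attempt if none is plausible.
    models = ["gpt-4o", "claude-3-opus-20240229"]
    attempts = []
    for i in range(max_tries):
        result = attempt_solution(problem, model=models[i % 2])
        attempts.append(result)
        if result.plausible:  # edited cleanly, lints, pre-existing tests pass
            return result
    # No plausible solution: pick the attempt with the fewest problems.
    return min(attempts, key=lambda r: r.num_problems)
```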

It's important to be clear that
*aider and the benchmark harness
only had access to the pre-existing tests in each problem's repo*.
The held out "acceptance tests" were *only* used
after benchmarking to compute statistics on which problems aider
correctly resolved.

The [full harness to run aider on SWE Bench Lite is available on GitHub](https://github.com/paul-gauthier/aider-swe-bench).

The benchmarking process was similar to how a developer might use aider to
resolve a GitHub issue:

- They could launch aider in their repo with the command below, which
tells aider they want to accept every suggestion
and to use pytest to run tests.
- `aider --yes --test-cmd pytest`
- They could start the chat by pasting in the URL or text of a GitHub issue.
Aider will pull in the URL's content and then try and solve the issue.
- If aider doesn't produce code that lints and tests clean, the user might decide to revert the changes and try again, maybe using aider with a different LLM this time.
[Aider is tightly integrated with git](https://aider.chat/docs/git.html),
so it's always easy to revert AI changes that don't pan out.

Outside a benchmark setting, it's probably
unwise or at least highly inefficient
to let *any* AI agent run unsupervised on your code base.
The reason aider is intended to be used interactively
is so that the user can participate and direct aider's work and approve suggestions.
This way the user can offer immediate feedback or corrections if their initial
instructions turn out to be ambiguous,
or if the AI starts going down a wrong path.

## Aider with GPT-4o alone was SOTA

Running the benchmark harness
only using aider with GPT-4o to find plausible solutions
achieved a score of 25.0%.
This was itself a state-of-the-art result, before being surpassed by the main
result reported here,
which used aider with both GPT-4o & Opus.

As noted below, a single attempt using Aider with GPT-4o tied
the current top entry on the leaderboard.

## Aider with GPT-4o & Opus

The benchmark harness alternated between running aider with GPT-4o and Opus.
The harness proceeded in a fixed order, always starting with GPT-4o and
then alternating with Opus until a plausible solution was found for each
problem.

The table below breaks down the plausible solutions that
were found for the 300 problems.
It also provides details on the 79 that were ultimately
verified as correctly resolving their issue.
Some noteworthy observations:

- *Just the first attempt* of Aider with GPT-4o resolved 20.3% of the problems, which ties the Amazon Q Developer Agent currently atop the official leaderboard.
- Including the second attempt, Aider with GPT-4o and Opus scored 23.6% on the benchmark.
These first two attempts obtained ~75% of all plausible and ~90% of all resolved solutions.
- A long tail of solutions continued to be found using both models, including one correctly resolved solution on the final, sixth attempt of that problem.


| Attempt | Agent | Number of<br>plausible<br>solutions | Percent of<br>plausible<br>solutions | Number of<br>correctly<br>resolved<br>solutions | Percent of<br>correctly<br>resolved<br>solutions | Score on<br>SWE Bench<br>Lite |
|:--------:|------------|---------:|---------:|----:|---:|--:|
| 1 | Aider with GPT-4o | 208 | 69.3% | 61 | 77.2% | 20.3% |
| 2 | Aider with Opus | 49 | 16.3% | 10 | 12.7% | 3.3% |
| 3 | Aider with GPT-4o | 20 | 6.7% | 3 | 3.8% | 1.0% |
| 4 | Aider with Opus | 9 | 3.0% | 2 | 2.5% | 0.7% |
| 5 | Aider with GPT-4o | 11 | 3.7% | 2 | 2.5% | 0.7% |
| 6 | Aider with Opus | 3 | 1.0% | 1 | 1.3% | 0.3% |
| **Total** | | **300** | **100%** | **79** | **100%** | **26.3%** |


If we break down the solutions solely by model,
we can see that aider with GPT-4o outperforms Opus.
This isn't a fair and direct comparison, because GPT-4o always took the first
turn and therefore got first crack at all the "easiest" problems.
Aider with Opus only ever saw problems that GPT-4o failed to
find plausible solutions for on its first try.

Aider with GPT-4o was producing higher quality plausible solutions,
with a greater chance of going on to be accepted as resolving the issue.
Again, this is biased by the turn ordering.
But other anecdotal evidence from earlier runs of the benchmark
also supports the observation that aider with GPT-4o is significantly stronger than Opus
for this benchmark.


| Agent | Number of<br>plausible<br>solutions | Number of<br>correctly<br>resolved<br>solutions | Percent of<br>plausible<br>which<br>correctly<br>resolved |
|------------|---------:|---------:|---:|
| Aider with GPT-4o | 239 | 66 | 27.6% |
| Aider with Opus | 61 | 13 | 21.3% |
| **Total** | **300** | **79** | **26.3%** |

## Repository map, not RAG

The crucial first step in solving a SWE Bench problem is figuring out
which parts of the repo are relevant and which files need to be edited.
Most coding agents use some combination of RAG, vector search
and providing the LLM with
tools to interactively explore the code base.

Aider instead uses a
[repository map](https://aider.chat/2023/10/22/repomap.html)
to help the LLM understand the
layout, code structure, and content of a git repo.
The repo map is created through static analysis of the code's
abstract syntax tree and call graph
to provide a compact and powerful summary of the entire code base.
The map is constantly
tailored to show
repo context that is relevant to the current state of the chat conversation.
This is done by performing a graph optimization on the code's call graph.

When the user asks for a change to their code, the LLM can use the repo map
to decide which files to edit.
The LLM simply returns a normal text response explaining which files
it needs to edit and why.
Aider notices when the LLM mentions filenames from the repo,
and asks the user if they should be added to the chat.
Adding a file to the chat allows the LLM to see the full contents
of the file and edit it.

<div class="chat-transcript" markdown="1">

#### Please add a new /factorial/N endpoint.

To add a new /factorial/N endpoint, the most likely file that needs to be edited is app.py.
Please add app.py to the chat so I can proceed with the changes.

> app.py
> Add these files to the chat? yes

</div>

This is a convenient and natural workflow for interactive chat,
and it worked well for the SWE Bench problems.
Aider successfully identified the correct file to edit
in 70.3% of the benchmark tasks.

We can determine which file needs to be edited using the "gold" patch
which is associated with each SWE Bench task.
This patch was created by a human developer
to solve the issue, and therefore reveals a file which can
be edited to solve the problem.
Of course aider is not able to see or use the gold patch
or the file names it contains in any way.
This information was only used to compute
statistics outside the benchmarking process.


||||
## Reliable code editing

Once files have been selected for editing,
the next step is of course to edit the source code to fix the problem.

Aider goes to great lengths to ensure that LLMs can not just write code,
but reliably *edit* code.
Aider has a collection of prompting strategies and code editing backends which have
been honed through
[extensive benchmarking](https://aider.chat/docs/leaderboards/).
These foundational capabilities help ensure that aider can
properly integrate code from LLMs into an existing code base and source files.

The repository map helps here too, making sure that the LLM
can see relevant classes, functions and variables from the entire repo.
This helps ensure that the project's existing APIs and conventions are
respected and utilized when new code is added.

Regardless, there are still cases where aider may be unable to cleanly
complete the edits specified by the LLM.
This is usually because the LLM has failed to conform to the editing
instructions in its system prompt.
When aider completes, it returns an editing outcome that indicates
whether it was able to successfully apply all edits.
The benchmark harness uses this editing status as
one criterion to determine if aider has
created a plausible solution.

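For a flavor of what an editing backend has to handle, here is a minimal sketch of applying one SEARCH/REPLACE style edit, assuming an exact, unique match; aider's real backends cope with many more edge cases:

```python
# Minimal sketch of applying a SEARCH/REPLACE edit (illustrative only, not
# aider's actual backend). The edit succeeds only if the SEARCH text matches
# the file exactly once; otherwise it is reported as failed so the harness
# can record an unsuccessful editing outcome.
from pathlib import Path

def apply_edit(fname: str, search: str, replace: str) -> bool:
    path = Path(fname)
    content = path.read_text()
    if content.count(search) != 1:
        return False  # ambiguous or missing match: refuse to guess
    path.write_text(content.replace(search, replace, 1))
    return True
```
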
## Linting and fixing

Another key criterion for a plausible solution is that it passes basic
linting, which means that the code has no syntax
or other fatal errors.
[Aider lints code](https://aider.chat/2024/05/22/linting.html)
after every LLM edit and offers to automatically fix
any problems.

Aider ships with built-in linters based on tree-sitter
which work with most popular programming languages.
Aider shows linting errors to the LLM in a novel format,
using the abstract syntax tree to display relevant code context for each
error.
This context helps LLMs understand the problem and
make the correct changes to resolve it.

<div class="chat-transcript" markdown="1">

```
app.py:23:36: F821 undefined name 'num'

app.py:
...⋮...
  6│class LongNum:
...⋮...
 19│    def expound(self, threshold):
 20│        number = self.basis
 21│        while number < threshold:
 22│            number *= self.factor
 23█        return num
 24│
 25│
...⋮...
```

> Attempt to fix lint errors? yes

</div>

In the benchmark, these linting suggestions are always accepted.
At completion,
aider reports a linting outcome that
indicates if it was able to produce
code without any outstanding linting errors.
The benchmark harness uses this status as
one of the criteria to determine if aider has
created a plausible solution.

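As a minimal sketch of what "basic linting" means here, a Python-only syntax check might look like the code below. This is illustrative only; aider's real linters are tree-sitter based and cover many languages:

```python
# Minimal sketch of a "basic lint" check for Python files (illustrative
# only). A file passes if it parses cleanly; a SyntaxError is the kind of
# fatal problem that must be fixed before a solution can be plausible.
import ast

def basic_lint(fname: str) -> str | None:
    try:
        ast.parse(open(fname).read(), filename=fname)
        return None  # no fatal errors
    except SyntaxError as err:
        return f"{fname}:{err.lineno}: {err.msg}"
```
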
## Testing and fixing

The final criterion for a plausible solution is that
all tests must be passing.
Aider can be configured with the command to run tests for a repo,
and will automatically attempt to fix any test failures.

A user working on a python project might configure testing
by launching aider like this:

```
aider --test-cmd pytest
```

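Under the hood, the test step amounts to running the configured command and checking its exit code, roughly like this minimal sketch (illustrative only):

```python
# Minimal sketch of the test step (illustrative only). The configured test
# command is run in the repo; a non-zero exit code means there are
# outstanding test failures, so the solution cannot be plausible.
import subprocess

def tests_pass(test_cmd: str = "pytest") -> bool:
    result = subprocess.run(test_cmd, shell=True, capture_output=True, text=True)
    return result.returncode == 0
```
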
For the benchmark, aider is configured with a test command that will run the
tests that already exist in each problem's repository.
SWE Bench problems are based on repositories from large open
source projects with extensive existing test suites.
This means that
testing will fail if aider has broken any of these
pre-existing tests or if any new
tests that it created aren't passing.

As with editing and linting, aider reports a testing outcome
that indicates if it completed with any outstanding failing tests.
The benchmark harness uses this status when deciding if aider
has produced a plausible solution.

To be clear, *aider cannot run or even see the held out "acceptance tests"* that
are used to judge if a proposed solution correctly
resolves the problem.
Those tests are only run outside of aider and the benchmark harness,
to compute the final benchmark statistics.

## Finding a plausible solution

Each time aider executes, it reports
the outcome of the editing, linting, and testing
steps.
Each of these steps may complete successfully or
return a status that indicates that there were outstanding
problems that remain unresolved.

The benchmark harness uses these outcomes to determine if
aider has produced a plausible
solution to the current SWE Bench task.
A plausible solution is one where aider
returns saying that it
edited the repo with no outstanding
edit, lint, or test errors.
In this case, aider's changes are recorded
as the SWE Bench `model_patch` to be evaluated later with the
acceptance tests.

If the solution is not plausible, another
instance of aider is launched again from scratch on the same problem.
The harness alternates launching aider with GPT-4o and Opus to solve the problem,
and gives each model three attempts -- for a total of six attempts.
As soon as a plausible solution is found, it is accepted and the
harness moves on to the next SWE Bench instance.

It's worth noting that repositories may have lint or test errors
present before aider even starts to edit them.
Whether unresolved errors were caused by aider or were pre-existing,
there will be instances where
no plausible solution is
found after six tries.

If all six attempts fail to produce a plausible solution,
then the "best" solution available is selected as the
`model_patch`.
Which of the non-plausible solutions to use is determined
by ignoring the testing outcome
and prioritizing solutions in the following order, as sketched in code after this list:

- Pick a solution where editing and linting were completed successfully.
- Pick a solution where editing was at least partially successful and linting succeeded.
- Pick a solution where editing was successful.
- Pick a solution where editing was at least partially successful.

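Here is a minimal sketch of that fallback selection. The `edit_ok`, `edit_partial`, and `lint_ok` flags are hypothetical names for the reported outcomes; the real logic lives in the benchmark harness:

```python
# Minimal sketch of picking the "best" non-plausible solution (illustrative
# only; the flag names are hypothetical). Testing outcomes are ignored, and
# earlier predicates are preferred over later ones.
def pick_best(attempts):
    predicates = [
        lambda a: a["edit_ok"] and a["lint_ok"],
        lambda a: a["edit_partial"] and a["lint_ok"],
        lambda a: a["edit_ok"],
        lambda a: a["edit_partial"],
    ]
    for pred in predicates:
        for attempt in attempts:  # attempts in the order they were made
            if pred(attempt):
                return attempt
    return attempts[-1]  # fall back to the last attempt
```
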
## Computing the benchmark score

The benchmark harness produced a proposed solution for each of the 300
SWE Bench Lite instances and saved it as the `model_patch`.

A separate evaluation script was used to
test each of these solutions with the full test suite,
including the held out acceptance tests.
For this final acceptance testing, any edits that aider made to tests
are discarded.
This ensures that the correct,
unmodified test suite is used for acceptance testing.
The evaluation script compares the test results
with results from testing
the "gold" patch that was developed by a human to correctly solve the issue.
If they match, the candidate solution has correctly resolved the issue.

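Here is a minimal sketch of that comparison, assuming each test run has been summarized as a hypothetical mapping from test name to pass/fail status:

```python
# Minimal sketch of the resolution check (illustrative only). A candidate
# is judged correct when its tests pass and fail in the same pattern as
# the gold patch; tests that fail for the gold patch are allowed to fail.
def correctly_resolved(candidate_results, gold_results):
    # candidate_results, gold_results: dict mapping test name -> "pass"/"fail"
    return all(
        candidate_results.get(test) == status
        for test, status in gold_results.items()
    )
```
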
These acceptance tests are only ever run outside of aider
and the benchmark harness, and only to compute the number of
correctly resolved instances.
They are never run, used, or even visible during aider's attempts to solve the problems.

Aider correctly resolved 79 out of 300 SWE Bench Lite instances, or 26.3%.

## Acknowledgments

Much thanks to the team behind the
[SWE Bench](https://www.swebench.com)
family of AI coding benchmarks.
Also thanks to Albert Örwall who has
[dockerized the SWE Bench evaluation scripts](https://github.com/aorwall/SWE-bench-docker),
making it faster, easier, and more reliable to run the acceptance tests.

## References

All of aider's results reported here are pass@1 results,
obtained without using the SWE Bench `hints_text`.

The "aider agent" internally makes multiple "attempts" at solving the problem,
but it picks and returns one single candidate solution.
Only that one candidate solution is evaluated with the acceptance tests
and contributes to the benchmark score.
Thus it is a pass@1 result.

This is in contrast to a pass@N result for N>1, where N attempts are made
and all N solutions are evaluated by the acceptance tests.
If *any* of the N solutions passes, that counts as a pass@N success.

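Stated as a formula, with P the problem set and S_p the set of evaluated solutions for problem p:

```
% pass@N over a problem set P, where |S_p| = N and pass(s) means that
% solution s passes the held out acceptance tests:
\mathrm{pass@}N \;=\; \frac{\bigl|\{\, p \in P : \exists\, s \in S_p,\ \mathrm{pass}(s) \,\}\bigr|}{|P|}
```
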
Below are the references for the other pass@1 unhinted SWE Bench results
displayed in the graph at the beginning of this article.

- [20.3% Amazon Q Developer Agent (v20240430-dev)](https://www.swebench.com)
- [19.0% AutoCodeRover](https://www.swebench.com/)
- [18.0% SWE-Agent + GPT-4](https://www.swebench.com)
- [16.7% OpenDevin](https://github.com/OpenDevin/OpenDevin/issues/2149)
- [11.7% SWE-Agent + Opus](https://www.swebench.com)

Note, the graph was corrected on 5/30/24 as follows.

The graph now contains AutoCodeRover's average pass@1 results.
Previously it displayed pass@3 results, which are
not comparable
to the pass@1 results for aider being reported here.
The [AutoCodeRover GitHub page](https://github.com/nus-apr/auto-code-rover)
features pass@3 results
without clearly labeling them as such.

The graph now contains the best OpenDevin results obtained without using
the SWE Bench `hints_text` to provide hints to the agent.
The previous graph contained their hinted result,
which is not comparable
to the unhinted aider results being reported here.
[OpenDevin reported hinted results](https://x.com/gneubig/status/1791498953709752405)
without noting that hints were used.
70
aider/website/_posts/2024-05-24-self-assembly.md
Normal file
@ -0,0 +1,70 @@
---
title: Aider has written 7% of its own code
excerpt: Aider has written 7% of its own code, via 600+ commits that inserted 4.8K and deleted 1.5K lines of code.
highlight_image: /assets/self-assembly.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Aider has written 7% of its own code

[![self assembly](/assets/self-assembly.jpg)](https://aider.chat/assets/self-assembly.jpg)

The
[aider git repo](https://github.com/paul-gauthier/aider)
currently contains about 4K commits and 14K lines of code.

Aider made 15% of the commits, inserting 4.8K and deleting 1.5K lines of code.

About 7% of the code now in the repo is attributable to an aider commit,
according to `git blame`.
This number is probably a significant undercount, because periodic reformatting
by `black` is likely obscuring aider's authorship of many lines.

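For readers who want to reproduce this kind of measurement, here is a minimal sketch using `git blame --line-porcelain`. It assumes aider's commits can be recognized from the commit summary, for example via the `aider:` prefix convention; the repo's real `scripts/blame.py` is more thorough:

```python
# Minimal sketch of measuring how many current lines trace back to aider
# commits (illustrative only; scripts/blame.py is the real tool). Assumes
# aider's commits are identifiable, e.g. by an "aider:" prefix on the
# commit message -- adjust identify_aider() for your repo's convention.
import subprocess

def identify_aider(summary: str) -> bool:
    return summary.startswith("aider:")  # assumed convention

def blame_counts(fname: str):
    out = subprocess.run(
        ["git", "blame", "--line-porcelain", fname],
        capture_output=True, text=True, check=True,
    ).stdout
    total = aider = 0
    # --line-porcelain repeats the full commit header, including the
    # "summary" field, once for every line of the file.
    for line in out.splitlines():
        if line.startswith("summary "):
            total += 1
            if identify_aider(line[len("summary "):]):
                aider += 1
    return aider, total

a, t = blame_counts("aider/main.py")
print(f"{a} of {t} lines ({a / t:.1%}) from aider commits")
```
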
Here's the breakdown of the code aider wrote in the current code base,
according to `git blame`.

| File | Lines | Percent |
|---|---:|---:|
| aider/args.py | 6 of 449 | 1.3% |
| aider/coders/base_coder.py | 37 of 1354 | 2.7% |
| aider/coders/editblock_coder.py | 14 of 507 | 2.8% |
| aider/coders/editblock_func_coder.py | 6 of 141 | 4.3% |
| aider/coders/udiff_coder.py | 2 of 421 | 0.5% |
| aider/coders/wholefile_coder.py | 5 of 146 | 3.4% |
| aider/coders/wholefile_func_coder.py | 4 of 134 | 3.0% |
| aider/commands.py | 67 of 703 | 9.5% |
| aider/diffs.py | 15 of 129 | 11.6% |
| aider/gui.py | 2 of 533 | 0.4% |
| aider/history.py | 19 of 124 | 15.3% |
| aider/io.py | 55 of 368 | 14.9% |
| aider/linter.py | 30 of 240 | 12.5% |
| aider/main.py | 30 of 466 | 6.4% |
| aider/mdstream.py | 3 of 122 | 2.5% |
| aider/models.py | 22 of 549 | 4.0% |
| aider/repo.py | 19 of 266 | 7.1% |
| aider/repomap.py | 17 of 518 | 3.3% |
| aider/scrape.py | 12 of 199 | 6.0% |
| aider/versioncheck.py | 10 of 37 | 27.0% |
| aider/voice.py | 9 of 104 | 8.7% |
| benchmark/benchmark.py | 33 of 730 | 4.5% |
| benchmark/over_time.py | 32 of 60 | 53.3% |
| benchmark/swe_bench_lite.py | 40 of 71 | 56.3% |
| scripts/blame.py | 55 of 212 | 25.9% |
| scripts/versionbump.py | 96 of 123 | 78.0% |
| setup.py | 11 of 47 | 23.4% |
| tests/test_coder.py | 48 of 612 | 7.8% |
| tests/test_commands.py | 135 of 588 | 23.0% |
| tests/test_editblock.py | 23 of 403 | 5.7% |
| tests/test_io.py | 30 of 65 | 46.2% |
| tests/test_main.py | 13 of 239 | 5.4% |
| tests/test_models.py | 6 of 28 | 21.4% |
| tests/test_repo.py | 2 of 296 | 0.7% |
| tests/test_repomap.py | 70 of 217 | 32.3% |
| tests/test_udiff.py | 7 of 119 | 5.9% |
| tests/test_wholefile.py | 37 of 321 | 11.5% |
| **Total** | **1022 of 14219** | **7.2%** |

267
aider/website/_posts/2024-06-02-main-swe-bench.md
Normal file
@ -0,0 +1,267 @@
---
title: Aider is SOTA for both SWE Bench and SWE Bench Lite
excerpt: Aider sets SOTA for the main SWE Bench, after recently setting SOTA for the Lite version.
highlight_image: /assets/swe_bench.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Aider is SOTA for both SWE Bench and SWE Bench Lite

Aider scored 18.9%
on the main
[SWE Bench benchmark](https://www.swebench.com),
achieving a state-of-the-art result.
The current top leaderboard entry is 13.8%
from Amazon Q Developer Agent.
The best result reported elsewhere seems to be
[13.9% from Devin](https://www.cognition.ai/post/swe-bench-technical-report).

This result on the main SWE Bench builds on
[aider's recent SOTA result on the easier SWE Bench Lite](https://aider.chat/2024/05/22/swe-bench-lite.html).

[![SWE Bench results](/assets/swe_bench.svg)](https://aider.chat/assets/swe_bench.svg)

**All of aider's results reported here are pass@1 results,
obtained without using the SWE Bench `hints_text`.**
Aider was benchmarked on the same
[570 randomly selected SWE Bench problems](https://github.com/CognitionAI/devin-swebench-results/tree/main/output_diffs)
that were used in the
[Devin evaluation](https://www.cognition.ai/post/swe-bench-technical-report).
See the [references](#references)
for more details on the data presented in this chart.

## Interactive, not agentic

Aider achieved this result mainly through its existing features that focus on static
code analysis, reliable LLM code editing, and pragmatic UX for automatically
fixing linting and testing errors.
Aider intentionally has quite limited and narrow "agentic behavior"
to avoid long delays, high token costs
and the need for users to repeatedly code review incorrect solutions.
It's also worth noting that aider currently does not use
RAG, vector search, tools or give the LLM access to search the web
or unilaterally execute code.

Aider is first and foremost an interactive tool for engineers to get real work done in
real code bases using a chat interface.
Aider provides a pair programming UX where users can ask for a change
and see code edits performed in real-time.
Aider can also offer additional help like fixing lint or test errors,
but the user is always in full interactive control.
This allows them to quickly steer misunderstandings back on course and
avoid wasting time and token costs.

## Benchmark methodology

Benchmarking was conducted as follows:

- Aider with GPT-4o was launched in each problem's git repository
with the problem statement
submitted as the opening chat message from "the user".
- After that aider ran as normal, except all of aider's
suggestions were always accepted without user approval.
- A [simple harness](https://github.com/paul-gauthier/aider-swe-bench#the-aider-agent) was used to retry the SWE Bench problem if aider produced code that wasn't *plausibly correct*.
Plausibly correct means that aider reported that it had successfully edited the repo
without causing syntax errors or breaking any *pre-existing* tests.
- If the solution from aider with GPT-4o wasn't plausible, the harness launched aider to try again from scratch using Claude 3 Opus.
- If no plausible solution was found after those two tries, the harness picked the "most plausible" solution with the fewest edit/lint/test problems (see the sketch after this list).

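Here is a minimal sketch of that two-attempt loop. The `run_aider()` and `is_plausible()` helpers and the `num_problems` field are hypothetical stand-ins; the real harness is in the [aider SWE Bench repository](https://github.com/paul-gauthier/aider-swe-bench):

```python
# Minimal sketch of the two-attempt harness (illustrative only). run_aider()
# and is_plausible() are hypothetical helpers standing in for launching a
# fresh aider run and checking its edit/lint/test outcomes.
def solve(problem):
    attempts = []
    for model in ["gpt-4o", "claude-3-opus-20240229"]:
        result = run_aider(problem, model=model)  # fresh run, all suggestions accepted
        if is_plausible(result):
            return result  # accept the first plausible solution
        attempts.append(result)
    # No plausible solution: fall back to the attempt with the fewest
    # outstanding edit/lint/test problems.
    return min(attempts, key=lambda r: r["num_problems"])
```
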
It's important to be clear that
*aider and the benchmark harness
only had access to the pre-existing tests in each problem's repo*.
The held out "acceptance tests" were *only* used
after benchmarking to compute statistics on which problems aider
correctly resolved.

This is the same approach
that was used for
[aider's recent SOTA result on SWE Bench Lite](https://aider.chat/2024/05/22/swe-bench-lite.html).
For the Lite benchmark,
aider alternated between GPT-4o and Opus for up to six total attempts.
To manage the cost of running the main SWE Bench benchmark,
aider was limited to two total attempts:
one with GPT-4o and one with Opus.

For a detailed discussion of the benchmark
methodology, see the
[article about aider's SWE Bench Lite results](https://aider.chat/2024/05/22/swe-bench-lite.html).
Also, the
[aider SWE Bench repository on GitHub](https://github.com/paul-gauthier/aider-swe-bench)
contains the harness and statistics code used for the benchmarks.

The benchmarking process was similar to how a developer might use aider to
resolve a GitHub issue:

- They could launch aider in their repo with the command below, which
tells aider they want to accept every suggestion
and to use pytest to run tests.
- `aider --yes --test-cmd pytest`
- They could start the chat by pasting in the URL or text of a GitHub issue.
Aider will pull in the URL's content and then try to resolve the issue.
- If aider doesn't produce code that lints and tests clean, the user might decide to
[use git to revert the changes](https://aider.chat/docs/git.html),
and try again with `aider --opus`.

## Aider with GPT-4o alone was SOTA

Using aider with GPT-4o to make a single attempt at resolving each problem
achieved a score of 17.0%.
This was itself a state-of-the-art result, before being surpassed by the main
result being reported here
that used aider with both GPT-4o & Opus.

## Aider with GPT-4o & Opus

The benchmark harness started by using aider with GPT-4o to try
to resolve each problem.
For problems where this didn't produce a plausible solution,
the harness tried again using aider with Opus.
So at most, two attempts were made for each problem.

The table below breaks down the proposed solutions that
were found from each attempt at the 570 problems.
A proposed solution is either:

- A plausible solution where
aider reported no outstanding errors from editing, linting and testing.
- Or, the "most plausible" solution generated by either attempt, with the
[fewest outstanding editing, linting or testing errors](https://aider.chat/2024/05/22/swe-bench-lite.html#finding-a-plausible-solution).

The table also provides details on the 108 solutions that were ultimately
verified as correctly resolving their issue.

| Attempt | Agent | Number of<br>proposed<br>solutions | Percent of<br>proposed<br>solutions | Number of<br>correctly<br>resolved<br>solutions | Percent of<br>correctly<br>resolved<br>solutions | Score on<br>SWE Bench |
|:--------:|------------|---------:|---------:|----:|---:|--:|
| 1 | Aider with GPT-4o | 419 | 73.5% | 87 | 80.6% | 15.3% |
| 2 | Aider with Opus | 151 | 26.5% | 21 | 19.4% | 3.7% |
| **Total** | | **570** | **100%** | **108** | **100%** | **18.9%** |

## Non-plausible but correct solutions?

A solution doesn't actually have to be plausible in order to correctly resolve the issue.
Recall that plausible is simply defined as aider
reporting that it successfully completed all file edits,
repaired and resolved any linting errors
and resolved any test failures.
But there are many reasons why aider might fail to do those things
and yet still produce a solution that will pass
acceptance testing:

- There may have been pre-existing failing tests in the repo,
before aider even started working on the SWE Bench problem.
Aider may not have resolved such issues, and yet they may not be
relevant to the acceptance testing.
The SWE Bench acceptance testing just confirms that tests pass or fail
in the same pattern as the "gold patch" developed by a human to resolve the
problem.
Some tests may fail during acceptance testing,
and that's ok as long as they failed for the gold
patch too.
- There may have been pre-existing linting problems in the repo.
If lingering linting issues affected code paths that are not well tested,
they may not impact acceptance testing.
- Aider may have reported file editing errors because it thought the LLM
specified edits that it wasn't able to successfully apply.
This can only happen when the LLM specified edits in
a way that doesn't comply with the editing instructions in the system prompt.
Given that the LLM isn't complying with the system prompt,
it may have become confused and
asked for redundant or otherwise irrelevant edits.
Such outstanding edit errors might not be fatal for acceptance testing.
- Etc.

Keeping all this in mind, we can understand why
GPT-4o accounts for 15.3% of the benchmark score in the table above,
but benchmarking with just one attempt of aider with GPT-4o scored 17.0%.
When an Opus attempt is allowed after GPT-4o,
it may propose some *incorrect* solutions which
are "more plausible" than some of GPT-4o's non-plausible solutions.
These more plausible, incorrect solutions can
eclipse some of
the earlier non-plausible correct solutions that GPT-4o generated.
This is why GPT-4o's score in the table
showing the combined GPT-4o & Opus results (15.3%)
is lower than the result from just one try using aider with GPT-4o (17.0%).

For these reasons, adding additional attempts is not guaranteed to monotonically
increase the number of resolved problems.
New solutions may resolve some new problems but they may also
eclipse and discard some of the previous non-plausible correct solutions.

Luckily, the net effect of additional attempts
usually increases or at least maintains the
number of resolved solutions.
This was the case for all the attempts made in both this main SWE Bench result and the
earlier Lite result.

## Computing the benchmark score

The benchmark harness produced one proposed solution for each of
the 570 SWE Bench problems.

A separate evaluation script was used to
test each of these solutions with the full test suite,
including the held out acceptance tests.
For this final acceptance testing, any edits that aider made to tests
were discarded.
This ensured that the correct,
unmodified test suite was used for acceptance testing.
The evaluation script compared each proposed solution's test results
with results from testing
the "gold" patch that was developed by a human to correctly resolve the issue.
If they matched, the proposed solution correctly resolved the issue.

These acceptance tests were only ever run outside of aider
and the benchmark harness, and only to compute statistics about the
correctly resolved instances.
They were never run, used, or even visible during aider's attempts to resolve the problems.

Aider correctly resolved 108 out of 570 SWE Bench instances that were benchmarked,
or 18.9%.

## Acknowledgments

Much thanks to the team behind the
[SWE Bench](https://www.swebench.com)
family of AI coding benchmarks.
Also thanks to Albert Örwall who has
[dockerized the SWE Bench evaluation scripts](https://github.com/aorwall/SWE-bench-docker),
making it faster, easier, and more reliable to run the acceptance tests.

## References

All of aider's results reported here are pass@1 results,
obtained without using the SWE Bench `hints_text`.

The "aider agent" internally makes multiple "attempts" at solving the problem,
but it picks and returns one single candidate solution.
Only that one candidate solution is evaluated with the acceptance tests
and contributes to the benchmark score.
Thus it is a pass@1 result.

This is in contrast to a pass@N result for N>1, where N attempts are made
and all N solutions are evaluated by the acceptance tests.
If *any* of the N solutions passes, that counts as a pass@N success.

Below are the references for the other pass@1 unhinted SWE Bench results
displayed in the graph at the beginning of this article.

- [13.9% Devin, benchmarked on 570 instances.](https://www.cognition.ai/post/swe-bench-technical-report)
- [13.8% Amazon Q Developer Agent, benchmarked on 2,294 instances.](https://www.swebench.com)
- [12.5% SWE-Agent + GPT-4, benchmarked on 2,294 instances.](https://www.swebench.com)
- [10.6% AutoCodeRover, benchmarked on 2,294 instances.](https://arxiv.org/pdf/2404.05427v2)
- [10.5% SWE-Agent + Opus, benchmarked on 2,294 instances.](https://www.swebench.com)

The graph contains average pass@1 results for AutoCodeRover.
The [AutoCodeRover GitHub page](https://github.com/nus-apr/auto-code-rover)
features their pass@3 results
without clearly labeling them as such.
Table 2 of their
[paper](https://arxiv.org/pdf/2404.05427v2)
reports an `ACR-avg` result of 10.59%, which is an average pass@1 result.

126
aider/website/_posts/2024-07-01-sonnet-not-lazy.md
Normal file
@ -0,0 +1,126 @@
---
title: Sonnet is the opposite of lazy
excerpt: Claude 3.5 Sonnet can easily write more good code than fits in one 4k token API response.
highlight_image: /assets/sonnet-not-lazy.jpg
nav_exclude: true
---

[![sonnet is the opposite of lazy](/assets/sonnet-not-lazy.jpg)](https://aider.chat/assets/sonnet-not-lazy.jpg)

{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Sonnet is the opposite of lazy

Claude 3.5 Sonnet represents a step change
in AI coding.
It is incredibly industrious, diligent and hard working.
Unexpectedly,
this presented a challenge:
Sonnet
was often writing so much code that
it was hitting the 4k output token limit,
truncating its coding in mid-stream.

Aider now works
around this 4k limit and allows Sonnet to produce
as much code as it wants.
The result is surprisingly powerful.
Sonnet's score on
[aider's refactoring benchmark](https://aider.chat/docs/leaderboards/#code-refactoring-leaderboard)
jumped from 55.1% up to 64.0%.
This moved Sonnet into second place, ahead of GPT-4o and
behind only Opus.

Users who tested Sonnet with a preview of
[aider's latest release](https://aider.chat/HISTORY.html#aider-v0410)
were thrilled:

- *Works like a charm. It is a monster. It refactors files of any size like it is nothing. The continue trick with Sonnet is truly the holy grail. Aider beats [other tools] hands down. I'm going to cancel both subscriptions.* -- [Emasoft](https://github.com/paul-gauthier/aider/issues/705#issuecomment-2200338971)
- *Thanks heaps for this feature - it's a real game changer. I can be more ambitious when asking Claude for larger features.* -- [cngarrison](https://github.com/paul-gauthier/aider/issues/705#issuecomment-2196026656)
- *Fantastic...! It's such an improvement not being constrained by output token length issues. [I refactored] a single JavaScript file into seven smaller files using a single Aider request.* -- [John Galt](https://discord.com/channels/1131200896827654144/1253492379336441907/1256250487934554143)

## Hitting the 4k token output limit

All LLMs have various token limits, the most familiar being their
context window size.
But they also have a limit on how many tokens they can output
in response to a single request.
Sonnet and the majority of other
models are limited to returning 4k tokens.

Sonnet's amazing work ethic caused it to
regularly hit this 4k output token
limit for a few reasons:

1. Sonnet is capable of outputting a very large amount of correct,
complete new code in one response.
2. Similarly, Sonnet can specify long sequences of edits in one go,
like changing a majority of lines while refactoring a large file.
3. Sonnet tends to quote large chunks of a
file when performing SEARCH & REPLACE edits.
Beyond token limits, this is very wasteful.

## Good problems

Problems (1) and (2) are "good problems"
in the sense that Sonnet is
able to write more high quality code than any other model!
We just don't want it to be interrupted prematurely
by the 4k output limit.

Aider now allows Sonnet to return code in multiple 4k token
responses.
Aider seamlessly combines them so that Sonnet can return arbitrarily
long responses.
This gets all the upsides of Sonnet's prolific coding skills,
without being constrained by the 4k output token limit.

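Here is a minimal sketch of that continuation technique against the Anthropic Messages API: when a response stops because it hit `max_tokens`, the partial output is replayed as an assistant prefill so the model picks up exactly where it left off. This is illustrative only; aider's real implementation layers its edit formats on top.

```python
# Minimal sketch of stitching together responses that hit the output token
# limit (illustrative only). When a response stops with reason "max_tokens",
# the partial text is replayed as an assistant message (prefill) so the
# model continues where it left off.
import anthropic

client = anthropic.Anthropic()

def long_completion(messages, model="claude-3-5-sonnet-20240620"):
    parts = []
    while True:
        resp = client.messages.create(
            model=model,
            max_tokens=4096,
            messages=messages + (
                [{"role": "assistant", "content": "".join(parts)}] if parts else []
            ),
        )
        parts.append(resp.content[0].text)
        if resp.stop_reason != "max_tokens":
            return "".join(parts)
```
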
## Wasting tokens

Problem (3) is more complicated, as Sonnet isn't just
being stopped early -- it's actually wasting a lot
of tokens, time and money.

Faced with a few small changes spread far apart in
a source file,
Sonnet would often prefer to do one giant SEARCH/REPLACE
operation of almost the entire file.
It would be far faster and less expensive to instead
do a few surgical edits.

Aider now prompts Sonnet to discourage these long-winded
SEARCH/REPLACE operations
and promotes much more concise edits.

## Aider with Sonnet

[The latest release of aider](https://aider.chat/HISTORY.html#aider-v0410)
has specialized support for Claude 3.5 Sonnet:

- Aider allows Sonnet to produce as much code as it wants,
by automatically and seamlessly spreading the response
out over a sequence of 4k token API responses.
- Aider carefully prompts Sonnet to be concise when proposing
code edits.
This reduces Sonnet's tendency to waste time, tokens and money
returning large chunks of unchanging code.
- Aider now uses Claude 3.5 Sonnet by default if `ANTHROPIC_API_KEY` is set in the environment.

See
[aider's install instructions](https://aider.chat/docs/install.html)
for more details, but
you can get started quickly with aider and Sonnet like this:

```
$ pip install aider-chat

$ export ANTHROPIC_API_KEY=<key> # Mac/Linux
$ setx   ANTHROPIC_API_KEY <key> # Windows

$ aider
```

102
aider/website/_sass/custom/custom.scss
Normal file
@ -0,0 +1,102 @@

.btn {
  display: inline-block;
  width: auto;
}
.btn + .btn {
  margin-top: 0;
  margin-left: 0.75rem;
}
.post {
  background: #fff;
  box-shadow: 0 2px 5px rgba(0,0,0,0.1);
  margin-bottom: 2em;
  padding: 1em;
  border-radius: 4px;
}
.post-date {
  color: #777;
  font-size: 0.85em;
  margin-bottom: 1em;
  display: block;
}

.post-highlight {
  max-width: 20em; /* Assuming the base font-size is 16px, 20em is equivalent to 320px */
  margin-right: 1em;
  margin-left: 1em;
}

@media (max-width: 768px) {
  .post-highlight {
    max-width: 30em; /* Larger size on mobile */
  }
}

.post-content {
  display: flex;
  align-items: flex-start;
  flex: 1;
}

.chat-transcript {
  font-family: 'Courier New', Courier, monospace;
  background-color: #000;
  color: #0f0;
  padding: 1em 1em 1em 1em;
  border-radius: 5px;
  margin-top: 50px;
}

.chat-transcript blockquote {
  padding: 0;
  margin-left: 0;
  color: #819198;
  border-left: none;
}

.chat-transcript blockquote>:first-child {
  margin-top: 0;
}

.chat-transcript blockquote>:last-child {
  margin-bottom: 0;
}

.chat-transcript li,
.chat-transcript p {
  color: #00FFFF;
}

.chat-transcript h1 {
  display: none;
}

.chat-transcript h4 {
  color: #32FF32;
  border-top: 1px solid #32FF32;
  padding-top: 10px;
  text-transform: none;
}

.chat-transcript h4::before {
  content: '> ';
}

.chat-transcript blockquote p {
  color: #ffffff;
}
.chat-transcript::before {
  content: url('data:image/svg+xml;utf8,<svg xmlns="http://www.w3.org/2000/svg" width="90" height="25"><circle cx="7" cy="15" r="7" fill="%23f92672"/><circle cx="27" cy="15" r="7" fill="%23f4bf75"/><circle cx="47" cy="15" r="7" fill="%23a6e22e"/></svg>');
  display: block;
  margin-bottom: 0;
  position: relative;
  top: -0.5em;
}

.chat-transcript,
div.highlighter-rouge pre.highlight, div.highlighter-rouge code {
  line-height: 1.1;
}
BIN
aider/website/assets/2024-03-07-claude-3.jpg
Normal file
After Width: | Height: | Size: 34 KiB |
1987
aider/website/assets/2024-03-07-claude-3.svg
Normal file
After Width: | Height: | Size: 55 KiB |
BIN
aider/website/assets/2024-04-09-gpt-4-turbo-laziness.jpg
Normal file
After Width: | Height: | Size: 19 KiB |
1519
aider/website/assets/2024-04-09-gpt-4-turbo-laziness.svg
Normal file
After Width: | Height: | Size: 37 KiB |
BIN
aider/website/assets/2024-04-09-gpt-4-turbo.jpg
Normal file
After Width: | Height: | Size: 25 KiB |
1707
aider/website/assets/2024-04-09-gpt-4-turbo.svg
Normal file
After Width: | Height: | Size: 45 KiB |
BIN
aider/website/assets/aider-browser-social.mp4
Normal file
BIN
aider/website/assets/aider-square.jpg
Normal file
After Width: | Height: | Size: 190 KiB |
BIN
aider/website/assets/aider.jpg
Normal file
After Width: | Height: | Size: 390 KiB |
BIN
aider/website/assets/benchmarks-0125.jpg
Normal file
After Width: | Height: | Size: 22 KiB |
1553
aider/website/assets/benchmarks-0125.svg
Normal file
After Width: | Height: | Size: 38 KiB |
BIN
aider/website/assets/benchmarks-1106.jpg
Normal file
After Width: | Height: | Size: 31 KiB |
1955
aider/website/assets/benchmarks-1106.svg
Normal file
After Width: | Height: | Size: 54 KiB |
BIN
aider/website/assets/benchmarks-speed-1106.jpg
Normal file
After Width: | Height: | Size: 25 KiB |
1780
aider/website/assets/benchmarks-speed-1106.svg
Normal file
After Width: | Height: | Size: 47 KiB |
BIN
aider/website/assets/benchmarks-udiff.jpg
Normal file
After Width: | Height: | Size: 144 KiB |
1749
aider/website/assets/benchmarks-udiff.svg
Normal file
After Width: | Height: | Size: 44 KiB |
BIN
aider/website/assets/benchmarks.jpg
Normal file
After Width: | Height: | Size: 33 KiB |
2198
aider/website/assets/benchmarks.svg
Normal file
After Width: | Height: | Size: 58 KiB |
BIN
aider/website/assets/browser.jpg
Normal file
After Width: | Height: | Size: 397 KiB |
BIN
aider/website/assets/codespaces.jpg
Normal file
After Width: | Height: | Size: 136 KiB |
BIN
aider/website/assets/codespaces.mp4
Normal file
BIN
aider/website/assets/figure.png
Normal file
After Width: | Height: | Size: 28 KiB |
BIN
aider/website/assets/icons/android-chrome-192x192.png
Normal file
After Width: | Height: | Size: 29 KiB |
BIN
aider/website/assets/icons/android-chrome-384x384.png
Normal file
After Width: | Height: | Size: 92 KiB |
BIN
aider/website/assets/icons/apple-touch-icon.png
Normal file
After Width: | Height: | Size: 26 KiB |
9
aider/website/assets/icons/browserconfig.xml
Normal file
@ -0,0 +1,9 @@
<?xml version="1.0" encoding="utf-8"?>
<browserconfig>
    <msapplication>
        <tile>
            <square150x150logo src="/assets/icons/mstile-150x150.png"/>
            <TileColor>#da532c</TileColor>
        </tile>
    </msapplication>
</browserconfig>
BIN
aider/website/assets/icons/favicon-16x16.png
Normal file
After Width: | Height: | Size: 1.2 KiB |
BIN
aider/website/assets/icons/favicon-32x32.png
Normal file
After Width: | Height: | Size: 1.8 KiB |
BIN
aider/website/assets/icons/favicon.ico
Normal file
After Width: | Height: | Size: 15 KiB |
BIN
aider/website/assets/icons/mstile-150x150.png
Normal file
After Width: | Height: | Size: 18 KiB |
32
aider/website/assets/icons/safari-pinned-tab.svg
Normal file
@ -0,0 +1,32 @@
<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20010904//EN"
 "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<svg version="1.0" xmlns="http://www.w3.org/2000/svg"
 width="436.000000pt" height="436.000000pt" viewBox="0 0 436.000000 436.000000"
 preserveAspectRatio="xMidYMid meet">
<metadata>
Created by potrace 1.14, written by Peter Selinger 2001-2017
</metadata>
<g transform="translate(0.000000,436.000000) scale(0.100000,-0.100000)"
fill="#000000" stroke="none">
<path d="M0 2180 l0 -2180 2180 0 2180 0 0 2180 0 2180 -2180 0 -2180 0 0
-2180z m2705 1818 c20 -20 28 -121 30 -398 l2 -305 216 -5 c118 -3 218 -8 222
-12 3 -3 10 -46 15 -95 5 -48 16 -126 25 -172 17 -86 17 -81 -17 -233 -14 -67
-13 -365 2 -438 21 -100 22 -159 5 -247 -24 -122 -24 -363 1 -458 23 -88 23
-213 1 -330 -9 -49 -17 -109 -17 -132 l0 -43 203 0 c111 0 208 -4 216 -9 10
-6 18 -51 27 -148 8 -76 16 -152 20 -168 7 -39 -23 -361 -37 -387 -10 -18 -21
-19 -214 -16 -135 2 -208 7 -215 14 -22 22 -33 301 -21 501 6 102 8 189 5 194
-8 13 -417 12 -431 -2 -12 -12 -8 -146 8 -261 8 -55 8 -95 1 -140 -6 -35 -14
-99 -17 -143 -9 -123 -14 -141 -41 -154 -18 -8 -217 -11 -679 -11 l-653 0 -11
33 c-31 97 -43 336 -27 533 5 56 6 113 2 128 l-6 26 -194 0 c-211 0 -252 4
-261 28 -12 33 -17 392 -6 522 15 186 -2 174 260 180 115 3 213 8 217 12 4 4
1 52 -5 105 -7 54 -17 130 -22 168 -7 56 -5 91 11 171 10 55 22 130 26 166 4
36 10 72 15 79 7 12 128 15 665 19 l658 5 8 30 c5 18 4 72 -3 130 -12 115 -7
346 11 454 10 61 10 75 -1 82 -8 5 -300 9 -650 9 l-636 0 -27 25 c-18 16 -26
34 -26 57 0 18 -5 87 -10 153 -10 128 5 449 22 472 5 7 26 13 46 15 78 6 1281
3 1287 -4z"/>
<path d="M1360 1833 c0 -5 -1 -164 -3 -356 l-2 -347 625 -1 c704 -1 708 -1
722 7 5 4 7 20 4 38 -29 141 -32 491 -6 595 9 38 8 45 -7 57 -15 11 -139 13
-675 14 -362 0 -658 -3 -658 -7z"/>
</g>
</svg>
After Width: | Height: | Size: 1.8 KiB |
19
aider/website/assets/icons/site.webmanifest
Normal file
@ -0,0 +1,19 @@
{
    "name": "",
    "short_name": "",
    "icons": [
        {
            "src": "/assets/icons/android-chrome-192x192.png",
            "sizes": "192x192",
            "type": "image/png"
        },
        {
            "src": "/assets/icons/android-chrome-384x384.png",
            "sizes": "384x384",
            "type": "image/png"
        }
    ],
    "theme_color": "#ffffff",
    "background_color": "#ffffff",
    "display": "standalone"
}
BIN
aider/website/assets/install.jpg
Normal file
After Width: | Height: | Size: 139 KiB |
BIN
aider/website/assets/install.mp4
Normal file
BIN
aider/website/assets/leaderboard.jpg
Normal file
After Width: | Height: | Size: 344 KiB |
BIN
aider/website/assets/linting.jpg
Normal file
After Width: | Height: | Size: 154 KiB |
BIN
aider/website/assets/llms.jpg
Normal file
After Width: | Height: | Size: 111 KiB |
BIN
aider/website/assets/models-over-time.png
Normal file
After Width: | Height: | Size: 86 KiB |
1742
aider/website/assets/models-over-time.svg
Normal file
After Width: | Height: | Size: 53 KiB |
BIN
aider/website/assets/robot-ast.png
Normal file
After Width: | Height: | Size: 671 KiB |
BIN
aider/website/assets/robot-flowchart.png
Normal file
After Width: | Height: | Size: 700 KiB |
242
aider/website/assets/sample.aider.conf.yml
Normal file
@ -0,0 +1,242 @@
##########################################################
# Sample .aider.conf.yml
# This file lists *all* the valid configuration entries.
# Place in your home dir, or at the root of your git repo.
##########################################################

##########
# options:

## show this help message and exit
#help:

#######
# Main:

## Specify the OpenAI API key
#openai-api-key:

## Specify the Anthropic API key
#anthropic-api-key:

## Specify the model to use for the main chat
#model:

## Use claude-3-opus-20240229 model for the main chat
#opus: false

## Use claude-3-5-sonnet-20240620 model for the main chat
#sonnet: false

## Use gpt-4-0613 model for the main chat
#4: false

## Use gpt-4o model for the main chat
#4o: false

## Use gpt-4-1106-preview model for the main chat
#4-turbo: false

## Use gpt-3.5-turbo model for the main chat
#35turbo: false

#################
# Model Settings:

## List known models which match the (partial) MODEL name
#models:

## Specify the api base url
#openai-api-base:

## Specify the api_type
#openai-api-type:

## Specify the api_version
#openai-api-version:

## Specify the deployment_id
#openai-api-deployment-id:

## Specify the OpenAI organization ID
#openai-organization-id:

## Specify a file with aider model settings for unknown models
#model-settings-file: .aider.model.settings.yml

## Specify a file with context window and costs for unknown models
#model-metadata-file: .aider.model.metadata.json

## Verify the SSL cert when connecting to models (default: True)
#verify-ssl: true

## Specify what edit format the LLM should use (default depends on model)
#edit-format:

## Specify the model to use for commit messages and chat history summarization (default depends on --model)
#weak-model:

## Only work with models that have meta-data available (default: True)
#show-model-warnings: true

## Max number of tokens to use for repo map, use 0 to disable (default: 1024)
#map-tokens: 1024

## Maximum number of tokens to use for chat history. If not specified, uses the model's max_chat_history_tokens.
#max-chat-history-tokens:

## Specify the .env file to load (default: .env in git root)
#env-file: .env

################
# History Files:

## Specify the chat input history file (default: .aider.input.history)
#input-history-file: .aider.input.history

## Specify the chat history file (default: .aider.chat.history.md)
#chat-history-file: .aider.chat.history.md

## Restore the previous chat history messages (default: False)
#restore-chat-history: false

## Log the conversation with the LLM to this file (for example, .aider.llm.history)
#llm-history-file:

##################
# Output Settings:

## Use colors suitable for a dark terminal background (default: False)
#dark-mode: false

## Use colors suitable for a light terminal background (default: False)
#light-mode: false

## Enable/disable pretty, colorized output (default: True)
#pretty: true

## Enable/disable streaming responses (default: True)
#stream: true

## Set the color for user input (default: #00cc00)
#user-input-color: #00cc00

## Set the color for tool output (default: None)
#tool-output-color:

## Set the color for tool error messages (default: red)
#tool-error-color: #FF2222

## Set the color for assistant output (default: #0088ff)
#assistant-output-color: #0088ff

## Set the markdown code theme (default: default, other options include monokai, solarized-dark, solarized-light)
#code-theme: default

## Show diffs when committing changes (default: False)
#show-diffs: false

###############
# Git Settings:

## Enable/disable looking for a git repo (default: True)
#git: true

## Enable/disable adding .aider* to .gitignore (default: True)
#gitignore: true

## Specify the aider ignore file (default: .aiderignore in git root)
#aiderignore: .aiderignore

## Enable/disable auto commit of LLM changes (default: True)
#auto-commits: true

## Enable/disable commits when repo is found dirty (default: True)
#dirty-commits: true

## Attribute aider code changes in the git author name (default: True)
#attribute-author: true

## Attribute aider commits in the git committer name (default: True)
#attribute-committer: true

## Prefix commit messages with 'aider: ' (default: False)
#attribute-commit-message: false

## Perform a dry run without modifying files (default: False)
#dry-run: false

########################
# Fixing and committing:

## Commit all pending changes with a suitable commit message, then exit
#commit: false

## Lint and fix provided files, or dirty files if none provided
#lint: false

## Specify lint commands to run for different languages, eg: "python: flake8 --select=..." (can be used multiple times)
#lint-cmd:

## Enable/disable automatic linting after changes (default: True)
#auto-lint: true

## Specify command to run tests
#test-cmd:

## Enable/disable automatic testing after changes (default: False)
#auto-test: false

## Run tests and fix problems found
#test: false

#################
# Other Settings:

## Use VI editing mode in the terminal (default: False)
#vim: false

## Specify the language for voice using ISO 639-1 code (default: auto)
#voice-language: en

## Show the version number and exit
#version:

## Check for updates and return status in the exit code
#check-update: false

## Skips checking for the update when the program runs
#skip-check-update: false

## Apply the changes from the given file instead of running the chat (debug)
#apply:

## Always say yes to every confirmation
#yes: false

## Enable verbose output
#verbose: false

## Print the repo map and exit (debug)
#show-repo-map: false

## Print the system prompts and exit (debug)
#show-prompts: false

## Do all startup activities then exit before accepting user input (debug)
#exit: false

## Specify a single message to send the LLM, process reply then exit (disables chat mode)
#message:

## Specify a file containing the message to send the LLM, process reply, then exit (disables chat mode)
#message-file:

## Specify the encoding for input and output (default: utf-8)
#encoding: utf-8

## Specify the config file (default: search for .aider.conf.yml in git root, cwd or home directory)
#config:

## Run aider in your browser
#gui: false
244
aider/website/assets/sample.env
Normal file
@ -0,0 +1,244 @@
##########################################################
# Sample aider .env file.
# Place at the root of your git repo.
# Or use `aider --env <fname>` to specify.
##########################################################

#################
# LLM parameters:
#
# Include xxx_API_KEY parameters and other params needed for your LLMs.
# See https://aider.chat/docs/llms.html for details.

## OpenAI
#OPENAI_API_KEY=

## Anthropic
#ANTHROPIC_API_KEY=

##...

#######
# Main:

## Specify the OpenAI API key
#OPENAI_API_KEY=

## Specify the Anthropic API key
#ANTHROPIC_API_KEY=

## Specify the model to use for the main chat
#AIDER_MODEL=

## Use claude-3-opus-20240229 model for the main chat
#AIDER_OPUS=

## Use claude-3-5-sonnet-20240620 model for the main chat
#AIDER_SONNET=

## Use gpt-4-0613 model for the main chat
#AIDER_4=

## Use gpt-4o model for the main chat
#AIDER_4O=

## Use gpt-4-1106-preview model for the main chat
#AIDER_4_TURBO=

## Use gpt-3.5-turbo model for the main chat
#AIDER_35TURBO=

#################
# Model Settings:

## List known models which match the (partial) MODEL name
#AIDER_MODELS=

## Specify the api base url
#OPENAI_API_BASE=

## Specify the api_type
#OPENAI_API_TYPE=

## Specify the api_version
#OPENAI_API_VERSION=

## Specify the deployment_id
#OPENAI_API_DEPLOYMENT_ID=

## Specify the OpenAI organization ID
#OPENAI_ORGANIZATION_ID=

## Specify a file with aider model settings for unknown models
#AIDER_MODEL_SETTINGS_FILE=.aider.model.settings.yml

## Specify a file with context window and costs for unknown models
#AIDER_MODEL_METADATA_FILE=.aider.model.metadata.json

## Verify the SSL cert when connecting to models (default: True)
#AIDER_VERIFY_SSL=true

## Specify what edit format the LLM should use (default depends on model)
#AIDER_EDIT_FORMAT=

## Specify the model to use for commit messages and chat history summarization (default depends on --model)
#AIDER_WEAK_MODEL=

## Only work with models that have meta-data available (default: True)
#AIDER_SHOW_MODEL_WARNINGS=true

## Max number of tokens to use for repo map, use 0 to disable (default: 1024)
#AIDER_MAP_TOKENS=1024

## Maximum number of tokens to use for chat history. If not specified, uses the model's max_chat_history_tokens.
#AIDER_MAX_CHAT_HISTORY_TOKENS=

## Specify the .env file to load (default: .env in git root)
#AIDER_ENV_FILE=.env

################
# History Files:

## Specify the chat input history file (default: .aider.input.history)
#AIDER_INPUT_HISTORY_FILE=.aider.input.history

## Specify the chat history file (default: .aider.chat.history.md)
#AIDER_CHAT_HISTORY_FILE=.aider.chat.history.md

## Restore the previous chat history messages (default: False)
#AIDER_RESTORE_CHAT_HISTORY=false

## Log the conversation with the LLM to this file (for example, .aider.llm.history)
#AIDER_LLM_HISTORY_FILE=

##################
# Output Settings:

## Use colors suitable for a dark terminal background (default: False)
#AIDER_DARK_MODE=false

## Use colors suitable for a light terminal background (default: False)
#AIDER_LIGHT_MODE=false

## Enable/disable pretty, colorized output (default: True)
#AIDER_PRETTY=true

## Enable/disable streaming responses (default: True)
#AIDER_STREAM=true

## Set the color for user input (default: #00cc00)
#AIDER_USER_INPUT_COLOR=#00cc00

## Set the color for tool output (default: None)
#AIDER_TOOL_OUTPUT_COLOR=

## Set the color for tool error messages (default: red)
#AIDER_TOOL_ERROR_COLOR=#FF2222

## Set the color for assistant output (default: #0088ff)
#AIDER_ASSISTANT_OUTPUT_COLOR=#0088ff

## Set the markdown code theme (default: default, other options include monokai, solarized-dark, solarized-light)
#AIDER_CODE_THEME=default

## Show diffs when committing changes (default: False)
#AIDER_SHOW_DIFFS=false

###############
# Git Settings:

## Enable/disable looking for a git repo (default: True)
#AIDER_GIT=true

## Enable/disable adding .aider* to .gitignore (default: True)
#AIDER_GITIGNORE=true

## Specify the aider ignore file (default: .aiderignore in git root)
#AIDER_AIDERIGNORE=.aiderignore

## Enable/disable auto commit of LLM changes (default: True)
#AIDER_AUTO_COMMITS=true

## Enable/disable commits when repo is found dirty (default: True)
#AIDER_DIRTY_COMMITS=true

## Attribute aider code changes in the git author name (default: True)
#AIDER_ATTRIBUTE_AUTHOR=true

## Attribute aider commits in the git committer name (default: True)
#AIDER_ATTRIBUTE_COMMITTER=true

## Prefix commit messages with 'aider: ' (default: False)
#AIDER_ATTRIBUTE_COMMIT_MESSAGE=false

## Perform a dry run without modifying files (default: False)
#AIDER_DRY_RUN=false

########################
# Fixing and committing:

## Commit all pending changes with a suitable commit message, then exit
#AIDER_COMMIT=false

## Lint and fix provided files, or dirty files if none provided
#AIDER_LINT=false

## Specify lint commands to run for different languages, eg: "python: flake8 --select=..." (can be used multiple times)
#AIDER_LINT_CMD=

## Enable/disable automatic linting after changes (default: True)
#AIDER_AUTO_LINT=true

## Specify command to run tests
#AIDER_TEST_CMD=

## Enable/disable automatic testing after changes (default: False)
#AIDER_AUTO_TEST=false

## Run tests and fix problems found
#AIDER_TEST=false

#################
# Other Settings:

## Use VI editing mode in the terminal (default: False)
#AIDER_VIM=false

## Specify the language for voice using ISO 639-1 code (default: auto)
#AIDER_VOICE_LANGUAGE=en

## Check for updates and return status in the exit code
#AIDER_CHECK_UPDATE=false

## Skips checking for the update when the program runs
#AIDER_SKIP_CHECK_UPDATE=false

## Apply the changes from the given file instead of running the chat (debug)
#AIDER_APPLY=

## Always say yes to every confirmation
#AIDER_YES=

## Enable verbose output
#AIDER_VERBOSE=false

## Print the repo map and exit (debug)
#AIDER_SHOW_REPO_MAP=false

## Print the system prompts and exit (debug)
#AIDER_SHOW_PROMPTS=false

## Do all startup activities then exit before accepting user input (debug)
#AIDER_EXIT=false

## Specify a single message to send the LLM, process reply then exit (disables chat mode)
#AIDER_MESSAGE=

## Specify a file containing the message to send the LLM, process reply, then exit (disables chat mode)
#AIDER_MESSAGE_FILE=

## Specify the encoding for input and output (default: utf-8)
#AIDER_ENCODING=utf-8

## Run aider in your browser
#AIDER_GUI=false
194
aider/website/assets/screencast.svg
Normal file (115 KiB)
BIN
aider/website/assets/screenshot.png
Normal file (162 KiB)
BIN
aider/website/assets/self-assembly.jpg
Normal file (162 KiB)
BIN
aider/website/assets/sonnet-not-lazy.jpg
Normal file (209 KiB)
BIN
aider/website/assets/swe_bench.jpg
Normal file (42 KiB)
2445
aider/website/assets/swe_bench.svg
Normal file (58 KiB)
BIN
aider/website/assets/swe_bench_lite.jpg
Normal file (38 KiB)
2306
aider/website/assets/swe_bench_lite.svg
Normal file (53 KiB)
BIN
aider/website/assets/udiffs.jpg
Normal file (84 KiB)
37
aider/website/blog/index.html
Normal file
@ -0,0 +1,37 @@
---
title: Aider blog
layout: default
nav_order: 1000
---

<h1>Aider blog</h1>

<div class="posts">
  {% for post in site.posts %}
  {% unless post.draft %}
  <article class="post">
    <h2><a href="{{ site.baseurl }}{{ post.url }}">{{ post.title }}</a></h2>
    <div class="entry">
      <div class="post-content">
        {% if post.excerpt %}
          {{ post.excerpt }}
        {% else %}
          {{ post.content | strip_html | truncatewords: 100 }}...
        {% endif %}
        {% if post.highlight_image %}
          <a href="{{ site.baseurl }}{{ post.url }}">
            <div class="post-highlight">
              <img src="{{ site.baseurl }}{{ post.highlight_image }}" alt="Highlight Image">
            </div>
          </a>
        {% endif %}
      </div>
    </div>

    <p class="post-date">
      {{ post.date | date: "%b %-d, %Y" | upcase }}
    </p>
  </article>
  {% endunless %}
  {% endfor %}
</div>
45
aider/website/docs/benchmarks-0125.md
Normal file
@ -0,0 +1,45 @@
---
title: The January GPT-4 Turbo is lazier than the last version
excerpt: The new `gpt-4-0125-preview` model is quantitatively lazier at coding than previous GPT-4 versions, according to a new "laziness" benchmark.
highlight_image: /assets/benchmarks-0125.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# The January GPT-4 Turbo is lazier than the last version

[benchmark results](https://aider.chat/assets/benchmarks-0125.svg)

[OpenAI just released a new version of GPT-4 Turbo](https://openai.com/blog/new-embedding-models-and-api-updates). This new model is intended to reduce the "laziness" that has been widely observed with the previous `gpt-4-1106-preview` model:

> Today, we are releasing an updated GPT-4 Turbo preview model, gpt-4-0125-preview. This model completes tasks like code generation more thoroughly than the previous preview model and is intended to reduce cases of “laziness” where the model doesn’t complete a task.

With that in mind, I've been benchmarking the new model using aider's existing [lazy coding benchmark](https://aider.chat/docs/unified-diffs.html).

## Benchmark results

Overall, the new `gpt-4-0125-preview` model seems lazier than the November `gpt-4-1106-preview` model:

- It gets worse benchmark scores when using the [unified diffs](https://aider.chat/docs/unified-diffs.html) code editing format.
- Using aider's older SEARCH/REPLACE block editing format, the new January model outperforms the older November model. But it still performs worse than both models using unified diffs.

## Related reports

This is one in a series of reports that use the aider benchmarking suite to assess and compare the code editing capabilities of OpenAI's GPT models. You can review the other reports for additional information:

- [GPT code editing benchmarks](https://aider.chat/docs/benchmarks.html) evaluates the March and June versions of GPT-3.5 and GPT-4.
- [Code editing benchmarks for OpenAI's "1106" models](https://aider.chat/docs/benchmarks-1106.html).
- [Aider's lazy coding benchmark](https://aider.chat/docs/unified-diffs.html).
92
aider/website/docs/benchmarks-1106.md
Normal file
@ -0,0 +1,92 @@
---
title: Code editing benchmarks for OpenAI's "1106" models
excerpt: A quantitative comparison of the code editing capabilities of the new GPT-3.5 and GPT-4 versions that were released in Nov 2023.
highlight_image: /assets/benchmarks-1106.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Code editing benchmarks for OpenAI's "1106" models

[benchmark results](https://aider.chat/assets/benchmarks-1106.svg)

[benchmark results](https://aider.chat/assets/benchmarks-speed-1106.svg)

[OpenAI just released new versions of GPT-3.5 and GPT-4](https://openai.com/blog/new-models-and-developer-products-announced-at-devday), and there's a lot of interest in their ability to code compared to the previous versions. With that in mind, I've been benchmarking the new models.

[Aider](https://github.com/paul-gauthier/aider) is an open source command line chat tool that lets you work with GPT to edit code in your local git repo. To do this, aider needs to be able to reliably recognize when GPT wants to edit your source code, determine which files it wants to modify and accurately apply the changes it's trying to make. Doing a good job on this "code editing" task requires a good LLM, good prompting and a good tool driving the interactions with the LLM.

Aider relies on a [code editing benchmark](https://aider.chat/docs/benchmarks.html) to quantitatively evaluate performance whenever one of these things changes. For example, whenever I change aider's prompting or the backend which drives LLM conversations, I run the benchmark to make sure these changes produce improvements (not regressions).

The benchmark uses aider to try and complete [133 Exercism Python coding exercises](https://github.com/exercism/python). For each exercise, Exercism provides a starting python file with stubs for the needed functions, a natural language description of the problem to solve and a test suite to evaluate whether the coder has correctly solved the problem.

The benchmark gives aider two tries to complete the task:

1. On the first try, aider gives GPT the stub code file to edit and the natural language instructions that describe the problem. This reflects how you code with aider. You add your source code files to the chat and ask for changes, which are automatically applied.
2. If the test suite fails after the first try, aider gives GPT the test error output and asks it to fix the code. Aider supports this sort of interaction using a command like `/run pytest` to run and share pytest results in the chat with GPT. You can `/run` whatever tests/linters/etc make sense for your language/framework/situation.

## Benchmark results

### gpt-4-1106-preview

For now, I have only benchmarked the GPT-4 models using the `diff` edit method. This is the edit format that aider uses by default with gpt-4.

- The new `gpt-4-1106-preview` model seems **2-2.5X faster** than the June GPT-4 model.
- **It seems better at producing correct code on the first try**. It gets 53% of the coding exercises correct, without needing to see errors from the test suite. Previous models only get 46-47% of the exercises correct on the first try.
- The new model seems to perform similarly (~65%) to the old models (63-64%) after their second chance to correct bugs by reviewing test suite error output.

### gpt-3.5-turbo-1106

I benchmarked the GPT-3.5 models with both the `whole` and `diff` edit format. None of the gpt-3.5 models seem able to effectively use the `diff` edit format, including the newest November (1106) model.

The comments below only focus on comparing the `whole` edit format results:

- The new `gpt-3.5-turbo-1106` model is completing the benchmark **3-4X faster** than the earlier GPT-3.5 models.
- The success rate after the first try of 42% is comparable to the previous June (0613) model. The new November and previous June models are both worse than the original March (0301) model's 50% result on the first try.
- The new model's 56% success rate after the second try seems comparable to the original March model, and somewhat better than the June model's 50% score.

## Related reports

This is one in a series of reports that use the aider benchmarking suite to assess and compare the code editing capabilities of OpenAI's GPT models. You can review the other reports for additional information:

- [GPT code editing benchmarks](https://aider.chat/docs/benchmarks.html) evaluates the March and June versions of GPT-3.5 and GPT-4.
- [Code editing speed benchmarks for OpenAI's "1106" models](https://aider.chat/2023/11/06/benchmarks-speed-1106.html) compares the performance of the new GPT models.

## Updates

Last updated 11/14/23. OpenAI has relaxed rate limits so these results are no longer considered preliminary.
59
aider/website/docs/benchmarks-speed-1106.md
Normal file
@ -0,0 +1,59 @@
---
title: Speed benchmarks of GPT-4 Turbo and gpt-3.5-turbo-1106
excerpt: This report provides a detailed comparison of the speed of GPT-4 Turbo and gpt-3.5-turbo-1106 models based on the aider benchmarking suite.
canonical_url: https://aider.chat/2023/11/06/benchmarks-speed-1106.html
highlight_image: /assets/benchmarks-speed-1106.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Speed benchmarks of GPT-4 Turbo and gpt-3.5-turbo-1106

<p class="post-date">{{ page.date | date: "%b %-d, %Y" }}</p>

[benchmark results](https://aider.chat/assets/benchmarks-speed-1106.svg)

[OpenAI just released new versions of GPT-3.5 and GPT-4](https://openai.com/blog/new-models-and-developer-products-announced-at-devday), and there's a lot of interest in their capabilities and performance. With that in mind, I've been benchmarking the new models.

[Aider](https://github.com/paul-gauthier/aider) is an open source command line chat tool that lets you work with GPT to edit code in your local git repo. Aider relies on a [code editing benchmark](https://aider.chat/docs/benchmarks.html) to quantitatively evaluate performance.

This is the latest in a series of reports that use the aider benchmarking suite to assess and compare the code editing capabilities of OpenAI's GPT models. You can review previous reports to get more background on aider's benchmark suite:

- [GPT code editing benchmarks](https://aider.chat/docs/benchmarks.html) evaluates the March and June versions of GPT-3.5 and GPT-4.
- [Code editing skill benchmarks for OpenAI's "1106" models](https://aider.chat/docs/benchmarks-1106.html) compares the older models to the November (1106) models.

## Speed

This report compares the **speed** of the various GPT models. Aider's benchmark measures the response time of the OpenAI chat completion endpoint each time it asks GPT to solve a programming exercise in the benchmark suite. These results measure only the time spent waiting for OpenAI to respond to the prompt. So they are measuring how fast these models can generate responses which primarily consist of source code.
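The timing hook itself can be as simple as wrapping the API call. Here is a minimal sketch in Python, where `send_chat_request` is a hypothetical stand-in for the chat completion call rather than aider's actual benchmark code:

```python
import time

def timed_completion(send_chat_request, messages):
    # send_chat_request is a hypothetical wrapper around the chat completion API.
    # Measure only the time spent waiting for the API to respond.
    start = time.monotonic()
    reply = send_chat_request(messages)
    return reply, time.monotonic() - start
```
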
Some observations:

- **GPT-3.5 got 6-11x faster.** The `gpt-3.5-turbo-1106` model is 6-11x faster than the June (0613) version which has been the default `gpt-3.5-turbo` model.
- **GPT-4 Turbo is 2-2.5x faster.** The new `gpt-4-1106-preview` model is 2-2.5x faster than the June (0613) version which has been the default `gpt-4` model.
- The old March (0301) version of GPT-3.5 is actually faster than the June (0613) version. This was a surprising discovery.

## Updates

Last updated 11/14/23. OpenAI has relaxed rate limits so these results are no longer considered preliminary.
368
aider/website/docs/benchmarks.md
Normal file
@ -0,0 +1,368 @@
---
title: GPT code editing benchmarks
excerpt: Benchmarking GPT-3.5 and GPT-4 code editing skill using a new code editing benchmark suite based on the Exercism python exercises.
highlight_image: /assets/benchmarks.jpg
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# GPT code editing benchmarks

[benchmark results](https://aider.chat/assets/benchmarks.svg)

Aider is an open source command line chat tool that lets you work with GPT to edit code in your local git repo. To do this, aider needs to be able to reliably recognize when GPT wants to edit local files, determine which files it wants to modify and what changes to save. Such automated code editing hinges on using the system prompt to tell GPT how to structure code edits in its responses.

Aider currently asks GPT to use simple text based "edit formats", but [OpenAI's new function calling API](https://openai.com/blog/function-calling-and-other-api-updates) looks like a promising way to create more structured edit formats. After implementing a couple of function based edit formats, I wanted to measure the potential benefits of switching aider to use them by default.

With this in mind, I developed a benchmark based on the [Exercism python](https://github.com/exercism/python) coding exercises. This benchmark evaluates how effectively aider and GPT can translate a natural language coding request into executable code saved into files that pass unit tests. It provides an end-to-end evaluation of not just GPT's coding ability, but also its capacity to *edit existing code* and *format those code edits* so that aider can save the edits to the local source files.

I ran the benchmark on all the ChatGPT models (except `gpt-4-32k`), using a variety of edit formats. The results were interesting:

- **Plain text edit formats worked best.** Asking GPT to return an updated copy of the whole file in a standard markdown fenced code block proved to be the most reliable and effective edit format across all GPT-3.5 and GPT-4 models. The results for this `whole` edit format are shown in solid blue in the graph.
- **Function calls performed worse.** Using the new functions API for edits performed worse than the above whole file method, for all the models. GPT-3.5 especially produced inferior code and frequently mangled this output format. This was surprising, as the functions API was introduced to enhance the reliability of structured outputs. The results for these `...-func` edit methods are shown as patterned bars in the graph (both green and blue).
- **The new June GPT-3.5 models did a bit worse than the old February model.** The performance of the new June (`0613`) versions of GPT-3.5 appears to be a bit worse than the February (`0301`) version. This is visible if you look at the "first attempt" markers on the first three solid blue bars and also by comparing the first three solid green `diff` bars.
- **GPT-4 does better than GPT-3.5,** as expected.

The quantitative benchmark results agree with my intuitions about prompting GPT for complex tasks like coding. It's beneficial to minimize the "cognitive overhead" of formatting the response, allowing GPT to concentrate on the coding task at hand.

As a thought experiment, imagine a slack conversation with a junior developer where you ask them to write the code to add some new feature to your app. They're going to type the response back to you by hand in the chat. Should they type out the code and wrap it in a normal markdown code block? Or should they type up a properly escaped and syntactically correct json data structure that contains the text of the new code?

Using more complex output formats with GPT seems to cause two issues:

- It makes GPT write worse code. Keeping the output format simple seems to allow GPT to devote more attention to the actual coding task.
- It reduces GPT's adherence to the output format, making it more challenging for tools like aider to accurately identify and apply the edits GPT is attempting to make.

I was expecting to start using function call based edits in aider for both GPT-3.5 and GPT-4. But given these benchmark results, I won't be adopting the functions API at this time. I will certainly plan to benchmark functions again with future versions of the models.

More details on the benchmark, edit formats and results are discussed below.


## The benchmark

The benchmark uses [133 practice exercises from the Exercism python repository](https://github.com/exercism/python/tree/main/exercises/practice). These exercises were designed to help individuals learn Python and hone their coding skills.

Each exercise includes:

- [Instructions](https://github.com/exercism/python/blob/main/exercises/practice/anagram/.docs/instructions.md), provided in markdown files.
- [Stub python code](https://github.com/exercism/python/blob/main/exercises/practice/anagram/anagram.py) in an *implementation file*, specifying the functions or classes that need to be implemented.
- [Unit tests](https://github.com/exercism/python/blob/main/exercises/practice/anagram/anagram_test.py) in a separate python file.

The goal is for GPT to read the instructions, implement the provided function/class skeletons and pass all the unit tests. The benchmark measures what percentage of the 133 exercises are completed successfully, causing all the associated unit tests to pass.

To start each exercise, aider sends GPT the initial contents of the implementation file, the Exercism instructions and a final instruction:

```
Use the above instructions to modify the supplied files: <implementation file>
Keep and implement the existing function or class stubs, they will be called from unit tests.
Only use standard python libraries, don't suggest installing any packages.
```

Aider updates the implementation file based on GPT's reply and runs the unit tests. If all tests pass, the exercise is considered complete. If some tests fail, aider sends GPT a second message with the test error output. It only sends the first 50 lines of test errors to try and avoid exceeding the context window of the smaller models. Aider also includes this final instruction:

```
See the testing errors above.
The tests are correct.
Fix the code in <implementation file> to resolve the errors.
```

Requiring GPT to fix its first implementation in response to test failures is another way in which this benchmark stresses code editing skill. This second chance is also important because it gives GPT the opportunity to adjust if the instructions were imprecise with respect to the specific requirements of the unit tests. Many of the exercises have multiple paragraphs of instructions, and most human coders would likely fail some tests on their first try.
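In outline, each benchmark run is a simple two-try loop. The sketch below is a simplified illustration rather than aider's actual harness code; `send_to_gpt` and `apply_edits` are hypothetical stand-ins for the chat request and edit-applying machinery:

```python
import subprocess

def run_exercise(send_to_gpt, apply_edits, impl_file, instructions):
    # First try: GPT sees the stub implementation file and the instructions.
    reply = send_to_gpt(open(impl_file).read(), instructions)
    apply_edits(reply, impl_file)
    result = subprocess.run(["pytest"], capture_output=True, text=True)
    if result.returncode == 0:
        return "passed on first try"

    # Second try: send back only the first 50 lines of test errors,
    # to avoid exceeding the context window of the smaller models.
    errors = "\n".join((result.stdout + result.stderr).splitlines()[:50])
    reply = send_to_gpt(errors, "The tests are correct. Fix the code to resolve the errors.")
    apply_edits(reply, impl_file)
    result = subprocess.run(["pytest"], capture_output=True, text=True)
    return "passed on second try" if result.returncode == 0 else "failed"
```
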
The bars in the graph show the percent of exercises that were completed by each model and edit format combination. The full bar height represents the final outcome following both coding attempts. Each bar also has a horizontal mark that shows the intermediate performance after the first coding attempt, without the benefit of the second try that includes the test error output.

It's worth noting that GPT never gets to see the source code of the unit tests during the benchmark. It only sees the error output from failed tests. Of course, all of this code was probably part of its original training data!

In summary, passing an exercise means GPT was able to:

- Write the required code (possibly after reviewing test error output),
- Correctly package all of the code edits into the edit format so that aider can process and save it to the implementation file.

Conversely, failing an exercise only requires a breakdown in one of those steps. In practice, GPT fails at different steps in different exercises. Sometimes it simply writes the wrong code. Other times, it fails to format the code edits in a way that conforms to the edit format, resulting in the code not being saved correctly.

It's worth keeping in mind that changing the edit format often affects both aspects of GPT's performance. Complex edit formats often lead GPT to write worse code *and* make it less successful at formatting the edits correctly.


## Edit formats

I benchmarked 4 different edit formats, described below. Each description includes a sample response that GPT might provide to a user who requests: "Change the print from hello to goodbye."

### whole

The [whole](https://github.com/paul-gauthier/aider/blob/main/aider/coders/wholefile_prompts.py) format asks GPT to return an updated copy of the entire file, including any changes. The file should be formatted with normal markdown triple-backtick fences, inlined with the rest of its response text.

This format is very similar to how ChatGPT returns code snippets during normal chats, except with the addition of a filename right before the opening triple-backticks.

````
Here is the updated copy of your file demo.py:

demo.py
```python
def main():
    print("goodbye")
```
````

### diff

The [diff](https://github.com/paul-gauthier/aider/blob/main/aider/coders/editblock_prompts.py) format also asks GPT to return edits as part of the normal response text, in a simple diff format. Each edit is a fenced code block that specifies the filename and a chunk of ORIGINAL and UPDATED code. GPT provides some original lines from the file and then a new updated set of lines.

````
Here are the changes you requested to demo.py:

```python
demo.py
<<<<<<< ORIGINAL
    print("hello")
=======
    print("goodbye")
>>>>>>> UPDATED
```
````

### whole-func

The [whole-func](https://github.com/paul-gauthier/aider/blob/main/aider/coders/wholefile_func_coder.py) format requests updated copies of whole files to be returned using the function call API.

```
{
    "explanation": "Changed hello to goodbye.",
    "files": [
        {
            "path": "demo.py",
            "content": "def main():\n    print(\"goodbye\")\n"
        }
    ]
}
```

### diff-func

The [diff-func](https://github.com/paul-gauthier/aider/blob/main/aider/coders/editblock_func_coder.py) format requests a list of original/updated style edits to be returned using the function call API.

```
{
    "explanation": "Changed hello to goodbye.",
    "edits": [
        {
            "path": "demo.py",
            "original_lines": [
                "    print(\"hello\")"
            ],
            "updated_lines": [
                "    print(\"goodbye\")"
            ]
        }
    ]
}
```

## GPT-3.5's performance

### The `0613` models seem worse?

The GPT-3.5 benchmark results have me fairly convinced that the new `gpt-3.5-turbo-0613` and `gpt-3.5-turbo-16k-0613` models are a bit worse at code editing than the older `gpt-3.5-turbo-0301` model.

This is visible in the "first attempt" portion of each result, before GPT gets a second chance to edit the code. Look at the horizontal white line in the middle of the first three blue bars. Performance with the `whole` edit format was 46% for the February model and only 39% for the June models.

But also note how much the solid green `diff` bars degrade between the February and June GPT-3.5 models. They drop from 30% down to about 19%.

I saw other signs of this degraded performance in earlier versions of the benchmark as well.

### Pathological use of `diff`

When GPT-3.5 is able to correctly generate the `diff` edit format, it often uses it in a pathological manner. It places the *entire* original source file in the ORIGINAL block and the entire updated file in the UPDATED block. This is strictly worse than just using the `whole` edit format, as GPT is sending two full copies of the file.

### Hallucinated function calls

When GPT-3.5 uses the functions API it is prone to ignoring the JSON Schema that specifies valid functions. It often returns a completely novel and semantically invalid `function_call` fragment with `"name": "python"`.

The `arguments` attribute is supposed to be a set of key/value pairs with the arguments to the function specified in the `name` field. Instead, GPT-3.5 frequently just stuffs an entire python file into that field.

```
"function_call": {
    "name": "python",
    "arguments": "def main():\n    print(\"hello\")\n"
},
```

It seems like it might be getting confused by fine-tuning that was done for the ChatGPT code interpreter plugin?


## Randomness

The benchmark attempts to be deterministic, always sending identical requests for each exercise on repeated runs. As part of this effort, when sending test error output to GPT, it removes the wall-clock timing information that is normally included by the `unittest` module.

The benchmark harness also logs SHA hashes of all the OpenAI API requests and replies. This makes it possible to detect randomness or nondeterminism in the benchmarking process.
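A hash log along these lines is enough to reveal nondeterminism: identical request hashes that map to different reply hashes. This is a minimal sketch, not the harness's actual code:

```python
import hashlib, json

def digest(payload):
    # Serialize deterministically so identical requests hash identically.
    return hashlib.sha1(json.dumps(payload, sort_keys=True).encode()).hexdigest()

def log_exchange(request, reply, logfile="api_hashes.log"):
    with open(logfile, "a") as f:
        f.write(f"{digest(request)} {digest(reply)}\n")
```
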
It turns out that the OpenAI chat APIs are not deterministic, even at `temperature=0`. The same request will produce multiple distinct responses, usually less than 5-10 variations. This suggests that OpenAI may be load balancing their API across a number of slightly different instances of the model?

For certain exercises, some of these variable responses pass the unit tests while other variants do not. Results for exercises like this, which are "on the bubble", are therefore a bit random, depending on which variant OpenAI returns.

Given that, it would be ideal to run all 133 exercises many times for each model/edit-format combination and report an average performance. This would average away the effect of the API variance. It would also significantly increase the cost of this sort of benchmarking. So I didn't do that.

Benchmarking against 133 exercises already provides some robustness, since we are measuring the performance across many exercises.

But to get a sense of how much the API variance impacts the benchmark outcomes, I ran all 133 exercises 10 times each against `gpt-3.5-turbo-0613` with the `whole` edit format. You'll see one set of error bars in the graph, which show the range of results from those 10 runs.

The OpenAI API randomness doesn't seem to cause a large variance in the overall benchmark results.

## Conclusions

Based on these benchmark results, aider will continue to use the `whole` edit format for GPT-3.5, and `diff` for GPT-4.

GPT-4 gets comparable results with the `whole` and `diff` edit formats, but using `whole` significantly increases costs and latency compared to `diff`. The latency of streaming back the entire updated copy of each edited file is a real challenge with the `whole` format. The GPT-3.5 models are quite responsive, and can stream back entire files at reasonable speed. Aider displays a progress bar and live diffs of the files as they stream in, which helps pass the time.

The GPT-4 models are much slower, and waiting for even small files to be completely "retyped" on each request is probably unacceptable.
57
aider/website/docs/browser.md
Normal file
@ -0,0 +1,57 @@
---
title: Aider in your browser
highlight_image: /assets/browser.jpg
parent: Usage
nav_order: 800
description: Aider can run in your browser, not just on the command line.
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Aider in your browser

<div class="video-container">
  <video controls loop poster="/assets/browser.jpg">
    <source src="/assets/aider-browser-social.mp4" type="video/mp4">
    <a href="/assets/aider-browser-social.mp4">Aider browser UI demo video</a>
  </video>
</div>

<style>
.video-container {
  position: relative;
  padding-bottom: 101.89%; /* 1080 / 1060 = 1.0189 */
  height: 0;
  overflow: hidden;
}

.video-container video {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
}
</style>

Use aider's new experimental browser UI to collaborate with LLMs to edit code in your local git repo. Aider will directly edit the code in your local source files, and [git commit the changes](https://aider.chat/docs/git.html) with sensible commit messages. You can start a new project or work with an existing git repo. Aider works well with GPT-3.5, GPT-4, GPT-4 Turbo with Vision, and Claude 3 Opus. It also supports [connecting to almost any LLM](https://aider.chat/docs/llms.html).

Use the `--browser` switch to launch the browser version of aider:

```
pip install aider-chat

export OPENAI_API_KEY=<key> # Mac/Linux
setx   OPENAI_API_KEY <key> # Windows

aider --browser
```
79
aider/website/docs/commands.md
Normal file
@ -0,0 +1,79 @@
---
parent: Usage
nav_order: 50
description: Control aider with in-chat commands like /add, /model, etc.
---
# In-chat commands

Aider supports commands from within the chat, which all start with `/`.

<!--[[[cog
from aider.commands import get_help_md
cog.out(get_help_md())
]]]-->
- **/add** Add files to the chat so GPT can edit them or review them in detail
- **/clear** Clear the chat history
- **/commit** Commit edits to the repo made outside the chat (commit message optional)
- **/diff** Display the diff of the last aider commit
- **/drop** Remove files from the chat session to free up context space
- **/exit** Exit the application
- **/git** Run a git command
- **/help** Show help about all commands
- **/lint** Lint and fix provided files or in-chat files if none provided
- **/ls** List all known files and indicate which are included in the chat session
- **/model** Switch to a new LLM
- **/models** Search the list of available models
- **/quit** Exit the application
- **/run** Run a shell command and optionally add the output to the chat (alias: !)
- **/test** Run a shell command and add the output to the chat on non-zero exit code
- **/tokens** Report on the number of tokens used by the current chat context
- **/undo** Undo the last git commit if it was done by aider
- **/voice** Record and transcribe voice input
- **/web** Use headless selenium to scrape a webpage and add the content to the chat
<!--[[[end]]]-->
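For example, a short session might chain several of these commands together; the file name here is just for illustration:

```
> /add src/app.py
> /run pytest
> /diff
> /undo
```
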
# Entering multi-line chat messages

{% include multi-line.md %}

# Keybindings

The interactive prompt is built with [prompt-toolkit](https://github.com/prompt-toolkit/python-prompt-toolkit) which provides emacs and vi keybindings.

## Emacs

- `Ctrl-A` : Move cursor to the start of the line.
- `Ctrl-B` : Move cursor back one character.
- `Ctrl-D` : Delete the character under the cursor.
- `Ctrl-E` : Move cursor to the end of the line.
- `Ctrl-F` : Move cursor forward one character.
- `Ctrl-K` : Delete from the cursor to the end of the line.
- `Ctrl-L` : Clear the screen.
- `Ctrl-N` : Move down to the next history entry.
- `Ctrl-P` : Move up to the previous history entry.
- `Ctrl-R` : Reverse search in command history.


## Vi

To use vi/vim keybindings, run aider with the `--vim` switch.

- `Esc` : Switch to command mode.
- `i` : Switch to insert mode.
- `a` : Move cursor one character to the right and switch to insert mode.
- `A` : Move cursor to the end of the line and switch to insert mode.
- `I` : Move cursor to the beginning of the line and switch to insert mode.
- `h` : Move cursor one character to the left.
- `j` : Move cursor down one line.
- `k` : Move cursor up one line.
- `l` : Move cursor one character to the right.
- `w` : Move cursor forward one word.
- `b` : Move cursor backward one word.
- `0` : Move cursor to the beginning of the line.
- `$` : Move cursor to the end of the line.
- `x` : Delete the character under the cursor.
- `dd` : Delete the current line.
- `u` : Undo the last change.
- `Ctrl-R` : Redo the last undone change.
41
aider/website/docs/config.md
Normal file
@ -0,0 +1,41 @@
---
nav_order: 55
has_children: true
description: Information on all of aider's settings and how to use them.
---

# Configuration

Aider has many options which can be set with command line switches. Most options can also be set in an `.aider.conf.yml` file, which can be placed in your home directory or at the root of your git repo. Or by setting environment variables like `AIDER_xxx`, either in your shell or a `.env` file.

Here are 4 equivalent ways of setting an option.

With a command line switch:

```
$ aider --dark-mode
```

Using a `.aider.conf.yml` file:

```yaml
dark-mode: true
```

By setting an environment variable:

```
export AIDER_DARK_MODE=true
```

Using an `.env` file:

```
AIDER_DARK_MODE=true
```
86
aider/website/docs/config/adv-model-settings.md
Normal file
@ -0,0 +1,86 @@
---
parent: Configuration
nav_order: 950
description: Configuring advanced settings for LLMs.
---

# Advanced model settings

## Context window size and token costs

In most cases, you can safely ignore aider's warning about unknown context window size and model costs.

But, you can register context window limits and costs for models that aren't known to aider. Create a `.aider.model.metadata.json` file in one of these locations:

- Your home directory.
- The root of your git repo.
- The current directory where you launch aider.
- Or specify a specific file with the `--model-metadata-file <filename>` switch.

If the files above exist, they will be loaded in that order. Files loaded last will take priority.

The json file should be a dictionary with an entry for each model, as follows:

```
{
    "deepseek-chat": {
        "max_tokens": 4096,
        "max_input_tokens": 32000,
        "max_output_tokens": 4096,
        "input_cost_per_token": 0.00000014,
        "output_cost_per_token": 0.00000028,
        "litellm_provider": "deepseek",
        "mode": "chat"
    }
}
```

See [litellm's model_prices_and_context_window.json file](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json) for more examples.
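With that file in place, you can point aider at it when launching. This invocation is illustrative; the `--model` name must match a model your provider actually serves:

```
aider --model deepseek/deepseek-chat --model-metadata-file .aider.model.metadata.json
```
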
## Model settings

Aider has a number of settings that control how it works with different models. These model settings are pre-configured for most popular models. But it can sometimes be helpful to override them or add settings for a model that aider doesn't know about.

To do that, create a `.aider.model.settings.yml` file in one of these locations:

- Your home directory.
- The root of your git repo.
- The current directory where you launch aider.
- Or specify a specific file with the `--model-settings-file <filename>` switch.

If the files above exist, they will be loaded in that order. Files loaded last will take priority.

The yaml file should be a list of dictionary objects for each model, as follows:

```
- name: "gpt-3.5-turbo"
  edit_format: "whole"
  weak_model_name: "gpt-3.5-turbo"
  use_repo_map: false
  send_undo_reply: false
  accepts_images: false
  lazy: false
  reminder_as_sys_msg: true
  examples_as_sys_msg: false
- name: "gpt-4-turbo-2024-04-09"
  edit_format: "udiff"
  weak_model_name: "gpt-3.5-turbo"
  use_repo_map: true
  send_undo_reply: true
  accepts_images: true
  lazy: true
  reminder_as_sys_msg: true
  examples_as_sys_msg: false
```
272
aider/website/docs/config/aider_conf.md
Normal file
@ -0,0 +1,272 @@
---
parent: Configuration
nav_order: 15
description: How to configure aider with a yaml config file.
---

# YAML config file

Most of aider's options can be set in an `.aider.conf.yml` file, which can be placed in your home directory or at the root of your git repo.

{% include special-keys.md %}

Below is a sample of the file, which you can also [download from GitHub](https://github.com/paul-gauthier/aider/blob/main/website/assets/sample.aider.conf.yml).

<!--[[[cog
from aider.args import get_sample_yaml
from pathlib import Path
text=get_sample_yaml()
Path("website/assets/sample.aider.conf.yml").write_text(text)
cog.outl("```")
cog.out(text)
cog.outl("```")
]]]-->
```
##########################################################
# Sample .aider.conf.yaml
# This file lists *all* the valid configuration entries.
# Place in your home dir, or at the root of your git repo.
##########################################################

##########
# options:

## show this help message and exit
#help:

#######
# Main:

## Specify the OpenAI API key
#openai-api-key:

## Specify the Anthropic API key
#anthropic-api-key:

## Specify the model to use for the main chat
#model:

## Use claude-3-opus-20240229 model for the main chat
#opus: false

## Use claude-3-5-sonnet-20240620 model for the main chat
#sonnet: false

## Use gpt-4-0613 model for the main chat
#4: false

## Use gpt-4o model for the main chat
#4o: false

## Use gpt-4-1106-preview model for the main chat
#4-turbo: false

## Use gpt-3.5-turbo model for the main chat
#35turbo: false

#################
# Model Settings:

## List known models which match the (partial) MODEL name
#models:

## Specify the api base url
#openai-api-base:

## Specify the api_type
#openai-api-type:

## Specify the api_version
#openai-api-version:

## Specify the deployment_id
#openai-api-deployment-id:

## Specify the OpenAI organization ID
#openai-organization-id:

## Specify a file with aider model settings for unknown models
#model-settings-file: .aider.model.settings.yml

## Specify a file with context window and costs for unknown models
#model-metadata-file: .aider.model.metadata.json

## Verify the SSL cert when connecting to models (default: True)
#verify-ssl: true

## Specify what edit format the LLM should use (default depends on model)
#edit-format:

## Specify the model to use for commit messages and chat history summarization (default depends on --model)
#weak-model:

## Only work with models that have meta-data available (default: True)
#show-model-warnings: true
## Max number of tokens to use for repo map, use 0 to disable (default: 1024)
#map-tokens: 1024

## Maximum number of tokens to use for chat history. If not specified, uses the model's max_chat_history_tokens.
#max-chat-history-tokens:

## Specify the .env file to load (default: .env in git root)
#env-file: .env

################
# History Files:

## Specify the chat input history file (default: .aider.input.history)
#input-history-file: .aider.input.history

## Specify the chat history file (default: .aider.chat.history.md)
#chat-history-file: .aider.chat.history.md

## Restore the previous chat history messages (default: False)
#restore-chat-history: false

## Log the conversation with the LLM to this file (for example, .aider.llm.history)
#llm-history-file:

##################
# Output Settings:

## Use colors suitable for a dark terminal background (default: False)
#dark-mode: false

## Use colors suitable for a light terminal background (default: False)
#light-mode: false

## Enable/disable pretty, colorized output (default: True)
#pretty: true

## Enable/disable streaming responses (default: True)
#stream: true

## Set the color for user input (default: #00cc00)
#user-input-color: #00cc00

## Set the color for tool output (default: None)
#tool-output-color:

## Set the color for tool error messages (default: red)
#tool-error-color: #FF2222

## Set the color for assistant output (default: #0088ff)
#assistant-output-color: #0088ff

## Set the markdown code theme (default: default, other options include monokai, solarized-dark, solarized-light)
#code-theme: default

## Show diffs when committing changes (default: False)
#show-diffs: false

###############
# Git Settings:

## Enable/disable looking for a git repo (default: True)
#git: true

## Enable/disable adding .aider* to .gitignore (default: True)
#gitignore: true

## Specify the aider ignore file (default: .aiderignore in git root)
#aiderignore: .aiderignore

## Enable/disable auto commit of LLM changes (default: True)
#auto-commits: true

## Enable/disable commits when repo is found dirty (default: True)
#dirty-commits: true

## Attribute aider code changes in the git author name (default: True)
#attribute-author: true

## Attribute aider commits in the git committer name (default: True)
#attribute-committer: true

## Prefix commit messages with 'aider: ' (default: False)
#attribute-commit-message: false

## Perform a dry run without modifying files (default: False)
#dry-run: false

########################
# Fixing and committing:

## Commit all pending changes with a suitable commit message, then exit
#commit: false

## Lint and fix provided files, or dirty files if none provided
#lint: false

## Specify lint commands to run for different languages, eg: "python: flake8 --select=..." (can be used multiple times)
#lint-cmd:

## Enable/disable automatic linting after changes (default: True)
#auto-lint: true

## Specify command to run tests
#test-cmd:

## Enable/disable automatic testing after changes (default: False)
#auto-test: false

## Run tests and fix problems found
#test: false

#################
# Other Settings:

## Use VI editing mode in the terminal (default: False)
#vim: false

## Specify the language for voice using ISO 639-1 code (default: auto)
#voice-language: en

## Show the version number and exit
#version:

## Check for updates and return status in the exit code
#check-update: false

## Skips checking for the update when the program runs
#skip-check-update: false

## Apply the changes from the given file instead of running the chat (debug)
#apply:

## Always say yes to every confirmation
#yes: false

## Enable verbose output
#verbose: false

## Print the repo map and exit (debug)
#show-repo-map: false

## Print the system prompts and exit (debug)
#show-prompts: false

## Do all startup activities then exit before accepting user input (debug)
#exit: false

## Specify a single message to send the LLM, process reply then exit (disables chat mode)
#message:

## Specify a file containing the message to send the LLM, process reply, then exit (disables chat mode)
#message-file:

## Specify the encoding for input and output (default: utf-8)
#encoding: utf-8

## Specify the config file (default: search for .aider.conf.yml in git root, cwd or home directory)
#config:

## Run aider in your browser
#gui: false
```
<!--[[[end]]]-->
281
aider/website/docs/config/dotenv.md
Normal file
@ -0,0 +1,281 @@
---
parent: Configuration
nav_order: 900
description: Using a .env file to store LLM API keys for aider.
---

# Config with .env

You can use a `.env` file to store API keys and other settings for the models you use with aider. You can also set many general aider options in the `.env` file.

{% include special-keys.md %}

Aider will look for a `.env` file in the root of your git repo or in the current directory. You can give it an explicit file to load with the `--env-file <filename>` parameter.
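For example, you could keep your keys in a dedicated file outside the repo and point aider at it; the path shown is illustrative:

```
aider --env-file ~/secrets/aider.env
```
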
Below is a sample `.env` file, which you can also [download from GitHub](https://github.com/paul-gauthier/aider/blob/main/website/assets/sample.env).

<!--[[[cog
from aider.args import get_sample_dotenv
from pathlib import Path
text=get_sample_dotenv()
Path("website/assets/sample.env").write_text(text)
cog.outl("```")
cog.out(text)
cog.outl("```")
]]]-->
```
##########################################################
# Sample aider .env file.
# Place at the root of your git repo.
# Or use `aider --env <fname>` to specify.
##########################################################

#################
# LLM parameters:
#
# Include xxx_API_KEY parameters and other params needed for your LLMs.
# See https://aider.chat/docs/llms.html for details.

## OpenAI
#OPENAI_API_KEY=

## Anthropic
#ANTHROPIC_API_KEY=

##...

#######
# Main:

## Specify the OpenAI API key
#OPENAI_API_KEY=

## Specify the Anthropic API key
#ANTHROPIC_API_KEY=

## Specify the model to use for the main chat
#AIDER_MODEL=

## Use claude-3-opus-20240229 model for the main chat
#AIDER_OPUS=

## Use claude-3-5-sonnet-20240620 model for the main chat
#AIDER_SONNET=

## Use gpt-4-0613 model for the main chat
#AIDER_4=

## Use gpt-4o model for the main chat
#AIDER_4O=

## Use gpt-4-1106-preview model for the main chat
#AIDER_4_TURBO=

## Use gpt-3.5-turbo model for the main chat
#AIDER_35TURBO=

#################
# Model Settings:

## List known models which match the (partial) MODEL name
#AIDER_MODELS=

## Specify the api base url
#OPENAI_API_BASE=

## Specify the api_type
#OPENAI_API_TYPE=

## Specify the api_version
#OPENAI_API_VERSION=

## Specify the deployment_id
#OPENAI_API_DEPLOYMENT_ID=

## Specify the OpenAI organization ID
#OPENAI_ORGANIZATION_ID=

## Specify a file with aider model settings for unknown models
#AIDER_MODEL_SETTINGS_FILE=.aider.model.settings.yml

## Specify a file with context window and costs for unknown models
#AIDER_MODEL_METADATA_FILE=.aider.model.metadata.json

## Verify the SSL cert when connecting to models (default: True)
#AIDER_VERIFY_SSL=true

## Specify what edit format the LLM should use (default depends on model)
#AIDER_EDIT_FORMAT=

## Specify the model to use for commit messages and chat history summarization (default depends on --model)
#AIDER_WEAK_MODEL=

## Only work with models that have meta-data available (default: True)
#AIDER_SHOW_MODEL_WARNINGS=true
## Max number of tokens to use for repo map, use 0 to disable (default: 1024)
#AIDER_MAP_TOKENS=1024

## Maximum number of tokens to use for chat history. If not specified, uses the model's max_chat_history_tokens.
#AIDER_MAX_CHAT_HISTORY_TOKENS=

## Specify the .env file to load (default: .env in git root)
#AIDER_ENV_FILE=.env

################
# History Files:

## Specify the chat input history file (default: .aider.input.history)
#AIDER_INPUT_HISTORY_FILE=.aider.input.history

## Specify the chat history file (default: .aider.chat.history.md)
#AIDER_CHAT_HISTORY_FILE=.aider.chat.history.md

## Restore the previous chat history messages (default: False)
#AIDER_RESTORE_CHAT_HISTORY=false

## Log the conversation with the LLM to this file (for example, .aider.llm.history)
#AIDER_LLM_HISTORY_FILE=

##################
# Output Settings:

## Use colors suitable for a dark terminal background (default: False)
#AIDER_DARK_MODE=false

## Use colors suitable for a light terminal background (default: False)
#AIDER_LIGHT_MODE=false

## Enable/disable pretty, colorized output (default: True)
#AIDER_PRETTY=true

## Enable/disable streaming responses (default: True)
#AIDER_STREAM=true

## Set the color for user input (default: #00cc00)
#AIDER_USER_INPUT_COLOR=#00cc00

## Set the color for tool output (default: None)
#AIDER_TOOL_OUTPUT_COLOR=

## Set the color for tool error messages (default: red)
#AIDER_TOOL_ERROR_COLOR=#FF2222

## Set the color for assistant output (default: #0088ff)
#AIDER_ASSISTANT_OUTPUT_COLOR=#0088ff

## Set the markdown code theme (default: default, other options include monokai, solarized-dark, solarized-light)
#AIDER_CODE_THEME=default

## Show diffs when committing changes (default: False)
#AIDER_SHOW_DIFFS=false

###############
# Git Settings:

## Enable/disable looking for a git repo (default: True)
#AIDER_GIT=true

## Enable/disable adding .aider* to .gitignore (default: True)
#AIDER_GITIGNORE=true

## Specify the aider ignore file (default: .aiderignore in git root)
#AIDER_AIDERIGNORE=.aiderignore

## Enable/disable auto commit of LLM changes (default: True)
#AIDER_AUTO_COMMITS=true

## Enable/disable commits when repo is found dirty (default: True)
#AIDER_DIRTY_COMMITS=true

## Attribute aider code changes in the git author name (default: True)
#AIDER_ATTRIBUTE_AUTHOR=true

## Attribute aider commits in the git committer name (default: True)
#AIDER_ATTRIBUTE_COMMITTER=true

## Prefix commit messages with 'aider: ' (default: False)
#AIDER_ATTRIBUTE_COMMIT_MESSAGE=false

## Perform a dry run without modifying files (default: False)
#AIDER_DRY_RUN=false

########################
# Fixing and committing:

## Commit all pending changes with a suitable commit message, then exit
#AIDER_COMMIT=false

## Lint and fix provided files, or dirty files if none provided
#AIDER_LINT=false

## Specify lint commands to run for different languages, eg: "python: flake8 --select=..." (can be used multiple times)
#AIDER_LINT_CMD=

## Enable/disable automatic linting after changes (default: True)
#AIDER_AUTO_LINT=true

## Specify command to run tests
#AIDER_TEST_CMD=

## Enable/disable automatic testing after changes (default: False)
#AIDER_AUTO_TEST=false

## Run tests and fix problems found
#AIDER_TEST=false

#################
# Other Settings:

## Use VI editing mode in the terminal (default: False)
#AIDER_VIM=false

## Specify the language for voice using ISO 639-1 code (default: auto)
#AIDER_VOICE_LANGUAGE=en

## Check for updates and return status in the exit code
#AIDER_CHECK_UPDATE=false

## Skip checking for updates when the program runs
#AIDER_SKIP_CHECK_UPDATE=false

## Apply the changes from the given file instead of running the chat (debug)
#AIDER_APPLY=

## Always say yes to every confirmation
#AIDER_YES=

## Enable verbose output
#AIDER_VERBOSE=false

## Print the repo map and exit (debug)
#AIDER_SHOW_REPO_MAP=false

## Print the system prompts and exit (debug)
#AIDER_SHOW_PROMPTS=false

## Do all startup activities then exit before accepting user input (debug)
#AIDER_EXIT=false

## Specify a single message to send the LLM, process reply then exit (disables chat mode)
#AIDER_MESSAGE=

## Specify a file containing the message to send the LLM, process reply, then exit (disables chat mode)
#AIDER_MESSAGE_FILE=

## Specify the encoding for input and output (default: utf-8)
#AIDER_ENCODING=utf-8

## Run aider in your browser
#AIDER_GUI=false
```
<!--[[[end]]]-->
468
aider/website/docs/config/options.md
Normal file
|
@ -0,0 +1,468 @@
|
|||
---
parent: Configuration
nav_order: 10
description: Details about all of aider's settings.
---

# Options reference
{: .no_toc }

You can use `aider --help` to see all the available options,
or review them below.

- TOC
{:toc}

## Usage summary

<!--[[[cog
from aider.args import get_md_help
cog.out(get_md_help())
]]]-->
```
usage: aider [-h] [--openai-api-key] [--anthropic-api-key] [--model]
             [--opus] [--sonnet] [--4] [--4o] [--4-turbo]
             [--35turbo] [--models] [--openai-api-base]
             [--openai-api-type] [--openai-api-version]
             [--openai-api-deployment-id] [--openai-organization-id]
             [--model-settings-file] [--model-metadata-file]
             [--verify-ssl | --no-verify-ssl] [--edit-format]
             [--weak-model]
             [--show-model-warnings | --no-show-model-warnings]
             [--map-tokens] [--max-chat-history-tokens] [--env-file]
             [--input-history-file] [--chat-history-file]
             [--restore-chat-history | --no-restore-chat-history]
             [--llm-history-file] [--dark-mode] [--light-mode]
             [--pretty | --no-pretty] [--stream | --no-stream]
             [--user-input-color] [--tool-output-color]
             [--tool-error-color] [--assistant-output-color]
             [--code-theme] [--show-diffs] [--git | --no-git]
             [--gitignore | --no-gitignore] [--aiderignore]
             [--auto-commits | --no-auto-commits]
             [--dirty-commits | --no-dirty-commits]
             [--attribute-author | --no-attribute-author]
             [--attribute-committer | --no-attribute-committer]
             [--attribute-commit-message | --no-attribute-commit-message]
             [--dry-run | --no-dry-run] [--commit] [--lint]
             [--lint-cmd] [--auto-lint | --no-auto-lint]
             [--test-cmd] [--auto-test | --no-auto-test] [--test]
             [--vim] [--voice-language] [--version] [--check-update]
             [--skip-check-update] [--apply] [--yes] [-v]
             [--show-repo-map] [--show-prompts] [--exit] [--message]
             [--message-file] [--encoding] [-c] [--gui]

```
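
For example, you might combine a model switch with a few of the options
below when launching aider (the file names here are hypothetical):

```
aider --sonnet --no-auto-commits --show-diffs app.py tests/test_app.py
```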

## options:

### `--help`
show this help message and exit
Aliases:
- `-h`
- `--help`

## Main:

### `--openai-api-key OPENAI_API_KEY`
Specify the OpenAI API key
Environment variable: `OPENAI_API_KEY`

### `--anthropic-api-key ANTHROPIC_API_KEY`
Specify the Anthropic API key
Environment variable: `ANTHROPIC_API_KEY`

### `--model MODEL`
Specify the model to use for the main chat
Environment variable: `AIDER_MODEL`

### `--opus`
Use claude-3-opus-20240229 model for the main chat
Environment variable: `AIDER_OPUS`

### `--sonnet`
Use claude-3-5-sonnet-20240620 model for the main chat
Environment variable: `AIDER_SONNET`

### `--4`
Use gpt-4-0613 model for the main chat
Environment variable: `AIDER_4`
Aliases:
- `--4`
- `-4`

### `--4o`
Use gpt-4o model for the main chat
Environment variable: `AIDER_4O`

### `--4-turbo`
Use gpt-4-1106-preview model for the main chat
Environment variable: `AIDER_4_TURBO`

### `--35turbo`
Use gpt-3.5-turbo model for the main chat
Environment variable: `AIDER_35TURBO`
Aliases:
- `--35turbo`
- `--35-turbo`
- `--3`
- `-3`

## Model Settings:

### `--models MODEL`
List known models which match the (partial) MODEL name
Environment variable: `AIDER_MODELS`

### `--openai-api-base OPENAI_API_BASE`
Specify the api base url
Environment variable: `OPENAI_API_BASE`

### `--openai-api-type OPENAI_API_TYPE`
Specify the api_type
Environment variable: `OPENAI_API_TYPE`

### `--openai-api-version OPENAI_API_VERSION`
Specify the api_version
Environment variable: `OPENAI_API_VERSION`

### `--openai-api-deployment-id OPENAI_API_DEPLOYMENT_ID`
Specify the deployment_id
Environment variable: `OPENAI_API_DEPLOYMENT_ID`

### `--openai-organization-id OPENAI_ORGANIZATION_ID`
Specify the OpenAI organization ID
Environment variable: `OPENAI_ORGANIZATION_ID`

### `--model-settings-file MODEL_SETTINGS_FILE`
Specify a file with aider model settings for unknown models
Default: .aider.model.settings.yml
Environment variable: `AIDER_MODEL_SETTINGS_FILE`
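
For a sense of what such a settings file can contain, here is a minimal
sketch; the field names follow aider's model settings, but treat the
specific values as illustrative rather than definitive:

```yaml
- name: "openrouter/anthropic/claude-3-opus"
  edit_format: "diff"
  weak_model_name: "openrouter/anthropic/claude-3-haiku"
  use_repo_map: true
```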

### `--model-metadata-file MODEL_METADATA_FILE`
Specify a file with context window and costs for unknown models
Default: .aider.model.metadata.json
Environment variable: `AIDER_MODEL_METADATA_FILE`
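
The metadata file maps model names to litellm-style context window and
pricing data. A minimal sketch (the model name and the numbers below are
placeholders, not real pricing):

```json
{
    "deepseek/deepseek-chat": {
        "max_input_tokens": 32768,
        "max_output_tokens": 4096,
        "input_cost_per_token": 0.00000014,
        "output_cost_per_token": 0.00000028,
        "litellm_provider": "deepseek",
        "mode": "chat"
    }
}
```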

### `--verify-ssl`
Verify the SSL cert when connecting to models (default: True)
Default: True
Environment variable: `AIDER_VERIFY_SSL`
Aliases:
- `--verify-ssl`
- `--no-verify-ssl`

### `--edit-format EDIT_FORMAT`
Specify what edit format the LLM should use (default depends on model)
Environment variable: `AIDER_EDIT_FORMAT`

### `--weak-model WEAK_MODEL`
Specify the model to use for commit messages and chat history summarization (default depends on --model)
Environment variable: `AIDER_WEAK_MODEL`

### `--show-model-warnings`
Only work with models that have meta-data available (default: True)
Default: True
Environment variable: `AIDER_SHOW_MODEL_WARNINGS`
Aliases:
- `--show-model-warnings`
- `--no-show-model-warnings`

### `--map-tokens VALUE`
Max number of tokens to use for repo map, use 0 to disable (default: 1024)
Default: 1024
Environment variable: `AIDER_MAP_TOKENS`

### `--max-chat-history-tokens VALUE`
Maximum number of tokens to use for chat history. If not specified, uses the model's max_chat_history_tokens.
Environment variable: `AIDER_MAX_CHAT_HISTORY_TOKENS`

### `--env-file ENV_FILE`
Specify the .env file to load (default: .env in git root)
Default: .env
Environment variable: `AIDER_ENV_FILE`

## History Files:

### `--input-history-file INPUT_HISTORY_FILE`
Specify the chat input history file (default: .aider.input.history)
Default: .aider.input.history
Environment variable: `AIDER_INPUT_HISTORY_FILE`

### `--chat-history-file CHAT_HISTORY_FILE`
Specify the chat history file (default: .aider.chat.history.md)
Default: .aider.chat.history.md
Environment variable: `AIDER_CHAT_HISTORY_FILE`

### `--restore-chat-history`
Restore the previous chat history messages (default: False)
Default: False
Environment variable: `AIDER_RESTORE_CHAT_HISTORY`
Aliases:
- `--restore-chat-history`
- `--no-restore-chat-history`

### `--llm-history-file LLM_HISTORY_FILE`
Log the conversation with the LLM to this file (for example, .aider.llm.history)
Environment variable: `AIDER_LLM_HISTORY_FILE`

## Output Settings:

### `--dark-mode`
Use colors suitable for a dark terminal background (default: False)
Default: False
Environment variable: `AIDER_DARK_MODE`

### `--light-mode`
Use colors suitable for a light terminal background (default: False)
Default: False
Environment variable: `AIDER_LIGHT_MODE`

### `--pretty`
Enable/disable pretty, colorized output (default: True)
Default: True
Environment variable: `AIDER_PRETTY`
Aliases:
- `--pretty`
- `--no-pretty`

### `--stream`
Enable/disable streaming responses (default: True)
Default: True
Environment variable: `AIDER_STREAM`
Aliases:
- `--stream`
- `--no-stream`

### `--user-input-color VALUE`
Set the color for user input (default: #00cc00)
Default: #00cc00
Environment variable: `AIDER_USER_INPUT_COLOR`

### `--tool-output-color VALUE`
Set the color for tool output (default: None)
Environment variable: `AIDER_TOOL_OUTPUT_COLOR`

### `--tool-error-color VALUE`
Set the color for tool error messages (default: red)
Default: #FF2222
Environment variable: `AIDER_TOOL_ERROR_COLOR`

### `--assistant-output-color VALUE`
Set the color for assistant output (default: #0088ff)
Default: #0088ff
Environment variable: `AIDER_ASSISTANT_OUTPUT_COLOR`

### `--code-theme VALUE`
Set the markdown code theme (default: default, other options include monokai, solarized-dark, solarized-light)
Default: default
Environment variable: `AIDER_CODE_THEME`

### `--show-diffs`
Show diffs when committing changes (default: False)
Default: False
Environment variable: `AIDER_SHOW_DIFFS`

## Git Settings:

### `--git`
Enable/disable looking for a git repo (default: True)
Default: True
Environment variable: `AIDER_GIT`
Aliases:
- `--git`
- `--no-git`

### `--gitignore`
Enable/disable adding .aider* to .gitignore (default: True)
Default: True
Environment variable: `AIDER_GITIGNORE`
Aliases:
- `--gitignore`
- `--no-gitignore`

### `--aiderignore AIDERIGNORE`
Specify the aider ignore file (default: .aiderignore in git root)
Default: .aiderignore
Environment variable: `AIDER_AIDERIGNORE`

### `--auto-commits`
Enable/disable auto commit of LLM changes (default: True)
Default: True
Environment variable: `AIDER_AUTO_COMMITS`
Aliases:
- `--auto-commits`
- `--no-auto-commits`

### `--dirty-commits`
Enable/disable commits when repo is found dirty (default: True)
Default: True
Environment variable: `AIDER_DIRTY_COMMITS`
Aliases:
- `--dirty-commits`
- `--no-dirty-commits`

### `--attribute-author`
Attribute aider code changes in the git author name (default: True)
Default: True
Environment variable: `AIDER_ATTRIBUTE_AUTHOR`
Aliases:
- `--attribute-author`
- `--no-attribute-author`

### `--attribute-committer`
Attribute aider commits in the git committer name (default: True)
Default: True
Environment variable: `AIDER_ATTRIBUTE_COMMITTER`
Aliases:
- `--attribute-committer`
- `--no-attribute-committer`

### `--attribute-commit-message`
Prefix commit messages with 'aider: ' (default: False)
Default: False
Environment variable: `AIDER_ATTRIBUTE_COMMIT_MESSAGE`
Aliases:
- `--attribute-commit-message`
- `--no-attribute-commit-message`

### `--dry-run`
Perform a dry run without modifying files (default: False)
Default: False
Environment variable: `AIDER_DRY_RUN`
Aliases:
- `--dry-run`
- `--no-dry-run`

## Fixing and committing:

### `--commit`
Commit all pending changes with a suitable commit message, then exit
Default: False
Environment variable: `AIDER_COMMIT`

### `--lint`
Lint and fix provided files, or dirty files if none provided
Default: False
Environment variable: `AIDER_LINT`

### `--lint-cmd`
Specify lint commands to run for different languages, eg: "python: flake8 --select=..." (can be used multiple times)
Default: []
Environment variable: `AIDER_LINT_CMD`
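
For example, to point different languages at different linters
(the linter commands themselves are just illustrations):

```
aider --lint-cmd "python: flake8 --select=E9" --lint-cmd "typescript: eslint"
```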

### `--auto-lint`
Enable/disable automatic linting after changes (default: True)
Default: True
Environment variable: `AIDER_AUTO_LINT`
Aliases:
- `--auto-lint`
- `--no-auto-lint`

### `--test-cmd VALUE`
Specify command to run tests
Default: []
Environment variable: `AIDER_TEST_CMD`
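
A common pairing is a test command plus automatic testing, for example
(assuming a pytest-based project):

```
aider --test-cmd "pytest -x" --auto-test
```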

### `--auto-test`
Enable/disable automatic testing after changes (default: False)
Default: False
Environment variable: `AIDER_AUTO_TEST`
Aliases:
- `--auto-test`
- `--no-auto-test`

### `--test`
Run tests and fix problems found
Default: False
Environment variable: `AIDER_TEST`

## Other Settings:

### `--vim`
Use VI editing mode in the terminal (default: False)
Default: False
Environment variable: `AIDER_VIM`

### `--voice-language VOICE_LANGUAGE`
Specify the language for voice using ISO 639-1 code (default: auto)
Default: en
Environment variable: `AIDER_VOICE_LANGUAGE`

### `--version`
Show the version number and exit

### `--check-update`
Check for updates and return status in the exit code
Default: False
Environment variable: `AIDER_CHECK_UPDATE`

### `--skip-check-update`
Skip checking for updates when the program runs
Default: False
Environment variable: `AIDER_SKIP_CHECK_UPDATE`

### `--apply FILE`
Apply the changes from the given file instead of running the chat (debug)
Environment variable: `AIDER_APPLY`

### `--yes`
Always say yes to every confirmation
Environment variable: `AIDER_YES`

### `--verbose`
Enable verbose output
Default: False
Environment variable: `AIDER_VERBOSE`
Aliases:
- `-v`
- `--verbose`

### `--show-repo-map`
Print the repo map and exit (debug)
Default: False
Environment variable: `AIDER_SHOW_REPO_MAP`

### `--show-prompts`
Print the system prompts and exit (debug)
Default: False
Environment variable: `AIDER_SHOW_PROMPTS`

### `--exit`
Do all startup activities then exit before accepting user input (debug)
Default: False
Environment variable: `AIDER_EXIT`

### `--message COMMAND`
Specify a single message to send the LLM, process reply then exit (disables chat mode)
Environment variable: `AIDER_MESSAGE`
Aliases:
- `--message COMMAND`
- `--msg COMMAND`
- `-m COMMAND`
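
Combined with `--yes`, this enables scripted one-shot edits, for example
(the file and request here are hypothetical):

```
aider --yes --message "add a docstring to every public function" utils.py
```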

### `--message-file MESSAGE_FILE`
Specify a file containing the message to send the LLM, process reply, then exit (disables chat mode)
Environment variable: `AIDER_MESSAGE_FILE`
Aliases:
- `--message-file MESSAGE_FILE`
- `-f MESSAGE_FILE`

### `--encoding VALUE`
Specify the encoding for input and output (default: utf-8)
Default: utf-8
Environment variable: `AIDER_ENCODING`

### `--config CONFIG_FILE`
Specify the config file (default: search for .aider.conf.yml in git root, cwd or home directory)
Aliases:
- `-c CONFIG_FILE`
- `--config CONFIG_FILE`

### `--gui`
Run aider in your browser
Default: False
Environment variable: `AIDER_GUI`
Aliases:
- `--gui`
- `--browser`
<!--[[[end]]]-->
87
aider/website/docs/conventions.md
Normal file
|
@ -0,0 +1,87 @@
|
|||
---
parent: Usage
nav_order: 800
description: Tell aider to follow your coding conventions when it works on your code.
---
# Specifying coding conventions

Sometimes you want GPT to be aware of certain coding guidelines,
like whether to provide type hints, which libraries or packages
to prefer, etc.

The easiest way to do that with aider is to simply create
a small markdown file and include it in the chat.

For example, say we want our python code to:

```
- Prefer httpx over requests for making http requests.
- Use types everywhere possible.
```

We would simply create a file like `CONVENTIONS.md` with those lines
and then we can add it to the aider chat, along with the file(s)
that we want to edit:

<div class="chat-transcript" markdown="1">
> $ aider CONVENTIONS.md useragent.py
>
> Aider v0.24.2-dev
> Added CONVENTIONS.md to the chat.
> Added useragent.py to the chat.

#### write a function that uses httpbin to check the user agent on the request. return the ua.

To fulfill your request, I will:

1. Add an import for `httpx`.
2. Write a function named `check_user_agent` that makes a GET request to `httpbin.org/user-agent` and returns the user agent string.

Here is the diff for the `useragent.py` file:

```diff
--- useragent.py
+++ useragent.py
@@ ... @@
+import httpx
+
+def check_user_agent() -> str:
+    response = httpx.get('https://httpbin.org/user-agent')
+    return response.json()['user-agent']
```

This function uses `httpx` to make a GET request to `httpbin.org/user-agent`, which returns a JSON response containing the user agent of the HTTP client. The function then extracts the user agent string from the JSON response and returns it.

> Applied edit to useragent.py
</div>

GPT correctly used `httpx` and provided type hints.
See below for the transcript of this same
coding task conducted *without*
using `CONVENTIONS.md`.
In this case, GPT used `requests` and skipped types,
which is perhaps more typical in small python scripts.

<div class="chat-transcript" markdown="1">
> $ aider useragent.py
>
> Aider v0.24.2-dev
> Added useragent.py to the chat.

#### write a function that uses httpbin to check the user agent on the request. return the ua.

```diff
--- /dev/null
+++ useragent.py
@@ ... @@
+import requests
+
+def check_user_agent():
+    response = requests.get('https://httpbin.org/user-agent')
+    user_agent = response.json().get('user-agent')
+    return user_agent
```

> Applied edit to useragent.py
</div>
250
aider/website/docs/ctags.md
Normal file
|
@ -0,0 +1,250 @@
|
|||
---
title: Improving GPT-4's codebase understanding with ctags
excerpt: Using ctags to build a "repository map" to increase GPT-4's ability to understand a large code base.
highlight_image: /assets/robot-flowchart.png
nav_exclude: true
---
{% if page.date %}
<p class="post-date">{{ page.date | date: "%B %d, %Y" }}</p>
{% endif %}

# Improving GPT-4's codebase understanding with ctags

![robot flowchart](/assets/robot-flowchart.png)

## Updated

Aider no longer uses ctags to build a repo map.
Please see the newer article about
[using tree-sitter to build a better repo map](https://aider.chat/docs/repomap.html).

-------

GPT-4 is extremely useful for "self-contained" coding tasks,
like generating brand new code or modifying a pure function
that has no dependencies.

But it's difficult to use GPT-4 to modify or extend
a large, complex pre-existing codebase.
To modify such code, GPT needs to understand the dependencies and APIs
which interconnect its subsystems.
Somehow we need to provide this "code context" to GPT
when we ask it to accomplish a coding task. Specifically, we need to:

- Help GPT understand the overall codebase, so that it
can decipher the meaning of code with complex dependencies and generate
new code that respects and utilizes existing abstractions.
- Convey all of this "code context" to GPT in an
efficient manner that fits within the 8k-token context window.

To address these issues, `aider` now
sends GPT a **concise map of your whole git repository**
that includes
all declared variables and functions with call signatures.
This *repo map* is built automatically using `ctags`, which
extracts symbol definitions from source files. Historically,
ctags were generated and indexed by IDEs and editors to
help humans search and navigate large codebases.
Instead, we're going to use ctags to help GPT better comprehend, navigate
and edit code in larger repos.

To get a sense of how effective this can be, this
[chat transcript](https://aider.chat/examples/add-test.html)
shows GPT-4 creating a black box test case, **without being given
access to the source code of the function being tested or any of the
other code in the repo.**
Using only the meta-data in the repo map, GPT is able to figure out how to
call the method to be tested, as well as how to instantiate multiple
class objects that are required to prepare for the test.

To code with GPT-4 using the techniques discussed here:

- Install [aider](https://aider.chat/docs/install.html).
- Install universal ctags (see the sketch below).
- Run `aider` inside your repo, and it should say "Repo-map: universal-ctags using 1024 tokens".
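
Installing universal ctags varies by platform; these are the usual
package names, but check your own package manager:

```
# macOS (Homebrew)
brew install universal-ctags

# Debian/Ubuntu
sudo apt install universal-ctags
```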

## The problem: code context

GPT-4 is great at "self contained" coding tasks, like writing or
modifying a pure function with no external dependencies.
GPT can easily handle requests like "write a
Fibonacci function" or "rewrite the loop using list
comprehensions", because they require no context beyond the code
being discussed.

Most real code is not pure and self-contained, it is intertwined with
and depends on code from many different files in a repo.
If you ask GPT to "switch all the print statements in class Foo to
use the BarLog logging system", it needs to see the code in the Foo class
with the prints, and it also needs to understand the project's BarLog
subsystem.

A simple solution is to **send the entire codebase** to GPT along with
each change request. Now GPT has all the context! But this won't work
for even moderately
sized repos, because they won't fit into the 8k-token context window.

A better approach is to be selective,
and **hand pick which files to send**.
For the example above, you could send the file that
contains the Foo class
and the file that contains the BarLog logging subsystem.
This works pretty well, and is supported by `aider` -- you
can manually specify which files to "add to the chat" you are having with GPT.

But it's not ideal to have to manually identify the right
set of files to add to the chat.
And sending whole files is a bulky way to send code context,
wasting the precious 8k context window.
GPT doesn't need to see the entire implementation of BarLog,
it just needs to understand it well enough to use it.
You may quickly run out of context window if you
send many files worth of code just to convey context.

## Using a repo map to provide context

The latest version of `aider` sends a **repo map** to GPT along with
each change request. The map contains a list of all the files in the
repo, along with the symbols which are defined in each file. Callables
like functions and methods also include their signatures.

Here's a
sample of the map of the aider repo, just showing the maps of
[main.py](https://github.com/paul-gauthier/aider/blob/main/aider/main.py)
and
[io.py](https://github.com/paul-gauthier/aider/blob/main/aider/io.py)
:

```
aider/
   ...
   main.py:
      function
         main (args=None, input=None, output=None)
      variable
         status
   ...
   io.py:
      class
         FileContentCompleter
         InputOutput
      FileContentCompleter
         member
            __init__ (self, fnames, commands)
            get_completions (self, document, complete_event)
      InputOutput
         member
            __init__ (self, pretty, yes, input_history_file=None, chat_history_file=None, input=None, output=None)
            ai_output (self, content)
            append_chat_history (self, text, linebreak=False, blockquote=False)
            confirm_ask (self, question, default="y")
            get_input (self, fnames, commands)
            prompt_ask (self, question, default=None)
            tool (self, *messages, log_only=False)
            tool_error (self, message)
   ...
```

Mapping out the repo like this provides some benefits:

- GPT can see variables, classes, methods and function signatures from everywhere in the repo. This alone may give it enough context to solve many tasks. For example, it can probably figure out how to use the API exported from a module just based on the details shown in the map.
- If it needs to see more code, GPT can use the map to figure out by itself which files it needs to look at. GPT will then ask to see these specific files, and `aider` will automatically add them to the chat context (with user approval).

Of course, for large repositories even just the map might be too large
for the context window. However, this mapping approach opens up the
ability to collaborate with GPT-4 on larger codebases than previous
methods. It also reduces the need to manually curate which files to
add to the chat context, empowering GPT to autonomously identify
relevant files for the task at hand.

## Using ctags to make the map

Under the hood, `aider` uses
[universal ctags](https://github.com/universal-ctags/ctags)
to build the
map. Universal ctags can scan source code written in many
languages, and extract data about all the symbols defined in each
file.

Historically, ctags were generated and indexed by IDEs or code editors
to make it easier for a human to search and navigate a
codebase, find the implementation of functions, etc.
Instead, we're going to use ctags to help GPT navigate and understand the codebase.

Here is the type of output you get when you run ctags on source code. Specifically,
this is the
`ctags --fields=+S --output-format=json` output for the `main.py` file mapped above:

```json
{
  "_type": "tag",
  "name": "main",
  "path": "aider/main.py",
  "pattern": "/^def main(args=None, input=None, output=None):$/",
  "kind": "function",
  "signature": "(args=None, input=None, output=None)"
}
{
  "_type": "tag",
  "name": "status",
  "path": "aider/main.py",
  "pattern": "/^    status = main()$/",
  "kind": "variable"
}
```

The repo map is built using this type of `ctags` data,
but formatted into the space
efficient hierarchical tree format shown earlier.
This is a format that GPT can easily understand
and which conveys the map data using a
minimal number of tokens.
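
As a rough sketch of that transformation -- an illustration of the idea,
not aider's actual implementation -- you could group the ctags JSON
records by file and kind, then print them as an indented tree:

```python
import json
import subprocess
from collections import defaultdict

def build_repo_map(filenames):
    # Run universal ctags, getting one JSON record per symbol.
    out = subprocess.run(
        ["ctags", "--fields=+S", "--output-format=json", *filenames],
        capture_output=True, text=True, check=True,
    ).stdout

    # Group symbol names (with signatures, when present) by file and kind.
    by_file = defaultdict(lambda: defaultdict(list))
    for line in out.splitlines():
        tag = json.loads(line)
        if tag.get("_type") != "tag":
            continue
        name = (tag["name"] + " " + tag.get("signature", "")).strip()
        by_file[tag["path"]][tag["kind"]].append(name)

    # Render the compact indented tree shown earlier.
    lines = []
    for path, kinds in sorted(by_file.items()):
        lines.append(f"{path}:")
        for kind, names in sorted(kinds.items()):
            lines.append(f"   {kind}")
            lines.extend(f"      {n}" for n in names)
    return "\n".join(lines)

print(build_repo_map(["aider/main.py", "aider/io.py"]))
```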

## Example chat transcript

This
[chat transcript](https://aider.chat/examples/add-test.html)
shows GPT-4 creating a black box test case, **without being given
access to the source code of the function being tested or any of the
other code in the repo.** Instead, GPT is operating solely off
the repo map.

Using only the meta-data in the map, GPT is able to figure out how to call the method to be tested, as well as how to instantiate multiple class objects that are required to prepare for the test.

GPT makes one reasonable mistake writing the first version of the test, but is
able to quickly fix the issue after being shown the `pytest` error output.

## Future work

Just as "send the whole codebase to GPT with every request"
is not an efficient solution to this problem,
there are probably better approaches than
"send the whole repo map with every request".
Sending an appropriate subset of the repo map would help `aider` work
better with even larger repositories which have large maps.

Some possible approaches to reducing the amount of map data are:

- Distill the global map, to prioritize important symbols and discard "internal" or otherwise less globally relevant identifiers. Possibly enlist `gpt-3.5-turbo` to perform this distillation in a flexible and language agnostic way.
- Provide a mechanism for GPT to start with a distilled subset of the global map, and let it ask to see more detail about subtrees or keywords that it feels are relevant to the current coding task.
- Attempt to analyze the natural language coding task given by the user and predict which subset of the repo map is relevant. Possibly by analysis of prior coding chats within the specific repo. Work on certain files or types of features may require certain somewhat predictable context from elsewhere in the repo. Vector and keyword search against the chat history, repo map or codebase may help here.

One key goal is to prefer solutions which are language agnostic or
which can be easily deployed against most popular code languages.
The `ctags` solution has this benefit, since it comes pre-built
with support for most popular languages.
I suspect that Language Server Protocol might be an even
better tool than `ctags` for this problem.
But it is more cumbersome to deploy for a broad
array of languages.
Users would need to stand up an LSP server for their
specific language(s) of interest.

## Try it out

To use this experimental repo map feature:

- Install [aider](https://aider.chat/docs/install.html).
- Install ctags.
- Run `aider` inside your repo, and it should say "Repo-map: universal-ctags using 1024 tokens".
104
aider/website/docs/faq.md
Normal file
|
@ -0,0 +1,104 @@
|
|||
---
nav_order: 90
description: Frequently asked questions about aider.
---

# FAQ
{: .no_toc }

- TOC
{:toc}

## How can I add ALL the files to the chat?

People regularly ask about how to add **many or all of their repo's files** to the chat.
This is probably not a good idea and will likely do more harm than good.

The best approach is to think about which files need to be changed to accomplish
the task you are working on. Just add those files to the chat.

Usually when people want to add "all the files" it's because they think it
will give the LLM helpful context about the overall code base.
Aider will automatically give the LLM a bunch of additional context about
the rest of your git repo.
It does this by analyzing your entire codebase in light of the
current chat to build a compact
[repository map](https://aider.chat/2023/10/22/repomap.html).

Adding a bunch of files that are mostly irrelevant to the
task at hand will often distract or confuse the LLM.
The LLM will give worse coding results, and sometimes even fail to correctly edit files.
Adding extra files will also increase the token costs on your OpenAI invoice.

Again, it's usually best to just add the files to the chat that will need to be modified.
If you still wish to add lots of files to the chat, you can:

- Use a wildcard when you launch aider: `aider src/*.py`
- Use a wildcard with the in-chat `/add` command: `/add src/*.py`
- Give the `/add` command a directory name and it will recursively add every file under that dir: `/add src`

## How can I run aider locally from source code?

To run the project locally, follow these steps:

```
# Clone the repository:
git clone git@github.com:paul-gauthier/aider.git

# Navigate to the project directory:
cd aider

# Install the dependencies listed in the `requirements.txt` file:
pip install -r requirements.txt

# Run the local version of Aider:
python -m aider.main
```

## Can I run aider in Google Colab?

User [imabutahersiddik](https://github.com/imabutahersiddik)
has provided this
[Colab notebook](https://colab.research.google.com/drive/1J9XynhrCqekPL5PR6olHP6eE--rnnjS9?usp=sharing).

## Can I change the system prompts that aider uses?

Aider is set up to support different system prompts and edit formats
in a modular way. If you look in the `aider/coders` subdirectory, you'll
see there's a base coder with base prompts, and then there are
a number of
different specific coder implementations.

If you're thinking about experimenting with system prompts
this document about
[benchmarking GPT-3.5 and GPT-4 on code editing](https://aider.chat/docs/benchmarks.html)
might be useful background.

While it's not well documented how to add new coder subsystems, you may be able
to modify an existing implementation or use it as a template to add another.

To get started, try looking at and modifying these files.

The wholefile coder is currently used by GPT-3.5 by default. You can manually select it with `--edit-format whole`.

- wholefile_coder.py
- wholefile_prompts.py

The editblock coder is currently used by GPT-4o by default. You can manually select it with `--edit-format diff`.

- editblock_coder.py
- editblock_prompts.py

The universal diff coder is currently used by GPT-4 Turbo by default. You can manually select it with `--edit-format udiff`.

- udiff_coder.py
- udiff_prompts.py

When experimenting with coder backends, it helps to run aider with `--verbose --no-pretty` so you can see
all the raw information being sent to/from the LLM in the conversation.
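
For example, to exercise the whole-file coder from a source checkout and
watch the raw exchange (the target file here is hypothetical):

```
python -m aider.main --edit-format whole --verbose --no-pretty hello.py
```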

You can also refer to the
[instructions for installing a development version of aider](https://aider.chat/docs/install/optional.html#install-the-development-version-of-aider).
49
aider/website/docs/git.md
Normal file
|
@ -0,0 +1,49 @@
|
|||
---
parent: More info
nav_order: 800
description: Aider is tightly integrated with git.
---

# Git integration

Aider works best with code that is part of a git repo.
Aider is tightly integrated with git, which makes it easy to:

- Use git to undo any aider changes that you don't like
- Go back in the git history to review the changes that aider made to your code
- Manage a series of aider's changes on a git branch

Aider specifically uses git in these ways:

- It asks to create a git repo if you launch it in a directory without one.
- Whenever aider edits a file, it commits those changes with a descriptive commit message. This makes it easy to undo or review aider's changes.
- Aider takes special care before editing files that already have uncommitted changes (dirty files). Aider will first commit any preexisting changes with a descriptive commit message.
This keeps your edits separate from aider's edits, and makes sure you never lose your work if aider makes an inappropriate change.

## In-chat commands

Aider also allows you to use in-chat commands to `/diff` or `/undo` the last change.
To do more complex management of your git history, you can use raw `git` commands,
either by using `/git` within the chat, or with standard git tools outside of aider.
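
For example, from within the chat (the exact output will vary with your history):

```
/diff          # show aider's last change
/undo          # revert it
/git log -3    # run a raw git command
```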

## Disabling git integration

While it is not recommended, you can disable aider's use of git in a few ways:

- `--no-auto-commits` will stop aider from git committing each of its changes.
- `--no-dirty-commits` will stop aider from committing dirty files before applying its edits.
- `--no-git` will completely stop aider from using git on your files. You should ensure you are keeping sensible backups of the files you are working with.

## Commit attribution

Aider marks commits that it either authored or committed.

- If aider authored the changes in a commit, they will have "(aider)" appended to the git author and git committer name metadata.
- If aider simply committed changes (found in dirty files), the commit will have "(aider)" appended to the git committer name metadata.

You can use `--no-attribute-author` and `--no-attribute-committer` to disable
modification of the git author and committer name fields.

Additionally, you can use `--attribute-commit-message` to prefix commit messages with 'aider: '.
This option is disabled by default, but can be useful for easily identifying commits made by aider.