mirror of
https://github.com/Aider-AI/aider.git
synced 2025-05-30 17:24:59 +00:00
Merge branch 'main' into call-graph
This commit is contained in:
commit
1e1feeaa21
9 changed files with 222 additions and 92 deletions
54
README.md
|
@ -33,10 +33,10 @@ You can find more chat transcripts on the [examples page](https://aider.chat/exa
|
|||
* `aider` will apply the edits suggested by GPT-4 directly to your source files.
|
||||
* `aider` will automatically commit each changeset to your local git repo with a descriptive commit message. These frequent, automatic commits provide a safety net. It's easy to undo `aider` changes or use standard git workflows to manage longer sequences of changes.
|
||||
* `aider` can review multiple source files at once and make coordinated code changes across all of them in a single changeset/commit.
|
||||
* `aider` gives GPT a
|
||||
* `aider` can give GPT a
|
||||
[map of your entire git repo](https://aider.chat/docs/ctags.html),
|
||||
so it can ask for permission to review whichever files seem relevant to your requests.
|
||||
* You can also edit the files using your editor while chatting with `aider`.
|
||||
which helps it understand and modify large codebases.
|
||||
* You can edit the files by hand using your editor while chatting with `aider`.
|
||||
* `aider` will notice if you edit the files outside the chat.
|
||||
* It will help you commit these out-of-band changes, if you'd like.
|
||||
* It will bring the updated file contents into the chat.
|
||||
|
@ -49,8 +49,11 @@ so it can ask for permission to review whichever files seem relevant to your req
|
|||
1. Install the package:
|
||||
* From GitHub: `pip install git+https://github.com/paul-gauthier/aider.git`
|
||||
* From your local copy of the repo in develop mode to pick up local edits immediately: `pip install -e .`
|
||||
|
||||
2. Set up your OpenAI API key as an environment variable `OPENAI_API_KEY` or by including it in a `.env` file.
|
||||
|
||||
3. Optionally, install [universal ctags](https://github.com/universal-ctags/ctags). This is helpful if you plan to work with repositories with more than a handful of files. This allows `aider --ctags` to build a [map of your entire git repo](https://aider.chat/docs/ctags.html) and share it with GPT to help it better understand and modify large codebases.
|
||||
|
||||
## Usage
|
||||
|
||||
Run the `aider` tool by executing the following command:
|
||||
|
@ -65,19 +68,40 @@ You can also just launch `aider` anywhere in a git repo without naming files on
|
|||
It will discover all the files in the repo.
|
||||
You can then add and remove individual files in the chat session with the `/add` and `/drop` chat commands described below.
|
||||
|
||||
You can also use additional command-line options to customize the behavior of the tool. The following options are available, along with their corresponding environment variable overrides:
|
||||
You can also use additional command-line options, environment variables or configuration file
|
||||
to set many options:
|
||||
|
||||
- `--input-history-file INPUT_HISTORY_FILE`: Specify the chat input history file (default: .aider.input.history). Override the default with the environment variable `AIDER_INPUT_HISTORY_FILE`.
|
||||
- `--chat-history-file CHAT_HISTORY_FILE`: Specify the chat history file (default: .aider.chat.history.md). Override the default with the environment variable `AIDER_CHAT_HISTORY_FILE`.
|
||||
- `--model MODEL`: Specify the model to use for the main chat (default: gpt-4). Override the default with the environment variable `AIDER_MODEL`.
|
||||
- `-3`: Use gpt-3.5-turbo model for the main chat (not advised). No environment variable override.
|
||||
- `--ctags`: Add ctags to the chat to help GPT understand the codebase (default: False, `AIDER_CTAGS`). Requires [universal ctags](https://github.com/universal-ctags/ctags). Override the default with the environment variable `AIDER_CTAGS`.
|
||||
- `--no-pretty`: Disable pretty, colorized output. Override the default with the environment variable `AIDER_PRETTY` (default: 1 for enabled, 0 for disabled).
|
||||
- `--no-auto-commits`: Disable auto commit of changes. Override the default with the environment variable `AIDER_AUTO_COMMITS` (default: 1 for enabled, 0 for disabled).
|
||||
- `--show-diffs`: Show diffs when committing changes (default: False). Override the default with the environment variable `AIDER_SHOW_DIFFS` (default: 0 for False, 1 for True).
|
||||
- `--yes`: Always say yes to every confirmation (default: False).
|
||||
|
||||
For more information, run `aider --help`.
|
||||
```
|
||||
-h, --help show this help message and exit
|
||||
-c CONFIG_FILE, --config CONFIG_FILE
|
||||
Specify the config file (default: search for
|
||||
.aider.conf.yml in git root or home directory)
|
||||
--input-history-file INPUT_HISTORY_FILE
|
||||
Specify the chat input history file (default:
|
||||
.aider.input.history) [env var: AIDER_INPUT_HISTORY_FILE]
|
||||
--chat-history-file CHAT_HISTORY_FILE
|
||||
Specify the chat history file (default:
|
||||
.aider.chat.history.md) [env var: AIDER_CHAT_HISTORY_FILE]
|
||||
--model MODEL Specify the model to use for the main chat (default: gpt-4)
|
||||
[env var: AIDER_MODEL]
|
||||
-3 Use gpt-3.5-turbo model for the main chat (not advised)
|
||||
--pretty Enable pretty, colorized output (default: True) [env var:
|
||||
AIDER_PRETTY]
|
||||
--no-pretty Disable pretty, colorized output
|
||||
--apply FILE Apply the changes from the given file instead of running
|
||||
the chat (debug)
|
||||
--auto-commits Enable auto commit of changes (default: True) [env var:
|
||||
AIDER_AUTO_COMMIT]
|
||||
--no-auto-commits Disable auto commit of changes
|
||||
--dry-run Perform a dry run without applying changes (default: False)
|
||||
--show-diffs Show diffs when committing changes (default: False) [env
|
||||
var: AIDER_SHOW_DIFFS]
|
||||
--ctags [CTAGS] Add ctags to the chat to help GPT understand the codebase
|
||||
(default: check for ctags executable) [env var:
|
||||
AIDER_CTAGS]
|
||||
--yes Always say yes to every confirmation
|
||||
-v, --verbose Enable verbose output
|
||||
```
|
||||
|
||||
## Chat commands
|
||||
|
||||
|
|
|
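The README hunk above says options can now come from the command line, environment variables, or a configuration file. As a rough illustration of how those sources are usually layered (ConfigArgParse documents the order as command line, then environment, then config file, then default), here is a minimal standard-library sketch. The `resolve_option` helper and its arguments are hypothetical and are not part of aider or ConfigArgParse.

```python
import os


def resolve_option(name, cli_args, config, default, env=os.environ, env_prefix="AIDER_"):
    """Return the first value found: CLI, then environment, then config file, then default.

    Hypothetical helper for illustration only; aider delegates this to configargparse.
    """
    # 1. An explicit command-line value wins.
    if cli_args.get(name) is not None:
        return cli_args[name]
    # 2. Next, an environment variable such as AIDER_MODEL.
    env_key = env_prefix + name.upper().replace("-", "_")
    if env_key in env:
        return env[env_key]
    # 3. Then a value read from a config file (already parsed into a dict here).
    if name in config:
        return config[name]
    # 4. Finally, the built-in default.
    return default


if __name__ == "__main__":
    config = {"model": "gpt-3.5-turbo"}
    print(resolve_option("model", {}, config, "gpt-4", env={}))
    print(resolve_option("model", {"model": "gpt-4"}, config, "gpt-4", env={}))
```

Keeping the lookup order in one place makes it easy to see which source overrides which, and that is the main thing to remember when mixing `.aider.conf.yml`, `AIDER_*` variables, and flags.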
@ -168,7 +168,10 @@ class Coder:
|
|||
if self.abs_fnames:
|
||||
files_content = prompts.files_content_prefix
|
||||
files_content += self.get_files_content()
|
||||
all_content += files_content
|
||||
else:
|
||||
files_content = prompts.files_no_full_files
|
||||
|
||||
all_content += files_content
|
||||
|
||||
other_files = set(self.get_all_abs_files()) - set(self.abs_fnames)
|
||||
repo_content = self.repo_map.get_repo_map(self.abs_fnames, other_files)
|
||||
|
@ -204,7 +207,7 @@ class Coder:
|
|||
self.num_control_c += 1
|
||||
if self.num_control_c >= 2:
|
||||
break
|
||||
self.io.tool_error("^C again to quit")
|
||||
self.io.tool_error("^C again or /exit to quit")
|
||||
except EOFError:
|
||||
return
|
||||
|
||||
|
|
|
@ -1,8 +1,8 @@
|
|||
import sys
|
||||
import os
|
||||
import git
|
||||
import subprocess
|
||||
import shlex
|
||||
from rich.prompt import Confirm
|
||||
from prompt_toolkit.completion import Completion
|
||||
from aider import prompts
|
||||
|
||||
|
@ -16,20 +16,8 @@ class Commands:
|
|||
if inp[0] == "/":
|
||||
return True
|
||||
|
||||
def help(self):
|
||||
"Show help about all commands"
|
||||
commands = self.get_commands()
|
||||
for cmd in commands:
|
||||
cmd_method_name = f"cmd_{cmd[1:]}"
|
||||
cmd_method = getattr(self, cmd_method_name, None)
|
||||
if cmd_method:
|
||||
description = cmd_method.__doc__
|
||||
self.io.tool(f"{cmd} {description}")
|
||||
else:
|
||||
self.io.tool(f"{cmd} No description available.")
|
||||
|
||||
def get_commands(self):
|
||||
commands = ["/help"]
|
||||
commands = []
|
||||
for attr in dir(self):
|
||||
if attr.startswith("cmd_"):
|
||||
commands.append("/" + attr[4:])
|
||||
|
@ -62,15 +50,15 @@ class Commands:
|
|||
all_commands = self.get_commands()
|
||||
matching_commands = [cmd for cmd in all_commands if cmd.startswith(first_word)]
|
||||
if len(matching_commands) == 1:
|
||||
if matching_commands[0] == "/help":
|
||||
self.help()
|
||||
else:
|
||||
return self.do_run(matching_commands[0][1:], rest_inp)
|
||||
return self.do_run(matching_commands[0][1:], rest_inp)
|
||||
elif len(matching_commands) > 1:
|
||||
self.io.tool_error("Ambiguous command: ', '.join(matching_commands)}")
|
||||
self.io.tool_error(f"Ambiguous command: {', '.join(matching_commands)}")
|
||||
else:
|
||||
self.io.tool_error(f"Error: {first_word} is not a valid command.")
|
||||
|
||||
# any method called cmd_xxx becomes a command automatically.
|
||||
# each one must take an args param.
|
||||
|
||||
def cmd_commit(self, args):
|
||||
"Commit edits to the repo made outside the chat (commit message optional)"
|
||||
|
||||
|
@ -251,6 +239,10 @@ class Commands:
|
|||
)
|
||||
return msg
|
||||
|
||||
def cmd_exit(self, args):
|
||||
"Exit the application"
|
||||
sys.exit()
|
||||
|
||||
def cmd_ls(self, args):
|
||||
"List all known files and those included in the chat session"
|
||||
|
||||
|
@ -274,3 +266,15 @@ class Commands:
|
|||
self.io.tool("\nRepo files not in the chat:\n")
|
||||
for file in other_files:
|
||||
self.io.tool(f" {file}")
|
||||
|
||||
def cmd_help(self, args):
|
||||
"Show help about all commands"
|
||||
commands = sorted(self.get_commands())
|
||||
for cmd in commands:
|
||||
cmd_method_name = f"cmd_{cmd[1:]}"
|
||||
cmd_method = getattr(self, cmd_method_name, None)
|
||||
if cmd_method:
|
||||
description = cmd_method.__doc__
|
||||
self.io.tool(f"{cmd} {description}")
|
||||
else:
|
||||
self.io.tool(f"{cmd} No description available.")
|
||||
|
|
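The hunks above turn `/help` and `/exit` into ordinary `cmd_` methods, so every command is discovered and dispatched in the same way. The following self-contained sketch shows that pattern under simplifying assumptions: the class is a toy, `run` returns strings instead of writing through `self.io`, and `cmd_add` and `cmd_exit` are hypothetical stand-ins rather than aider's implementations.

```python
class Commands:
    """Toy dispatcher: every method named cmd_<name> becomes the chat command /<name>."""

    def get_commands(self):
        # Discover commands by reflection, as the diff does with dir(self).
        return sorted("/" + attr[4:] for attr in dir(self) if attr.startswith("cmd_"))

    def run(self, inp):
        words = inp.strip().split()
        first_word, rest = words[0], " ".join(words[1:])
        # A unique prefix such as "/h" is enough to select a command.
        matches = [cmd for cmd in self.get_commands() if cmd.startswith(first_word)]
        if len(matches) == 1:
            return getattr(self, "cmd_" + matches[0][1:])(rest)
        if len(matches) > 1:
            return f"Ambiguous command: {', '.join(matches)}"
        return f"Error: {first_word} is not a valid command."

    def cmd_add(self, args):
        "Add files to the chat (hypothetical stand-in)"
        return f"added {args}"

    def cmd_exit(self, args):
        "Exit the application (stand-in that does not actually exit)"
        return "bye"

    def cmd_help(self, args):
        "Show help about all commands"
        lines = []
        for cmd in self.get_commands():
            doc = getattr(self, "cmd_" + cmd[1:]).__doc__ or "No description available."
            lines.append(f"{cmd} {doc}")
        return "\n".join(lines)


if __name__ == "__main__":
    commands = Commands()
    print(commands.run("/h"))
    print(commands.run("/a foo.py"))
```

Because help is just another `cmd_` method, the dispatcher no longer needs a special case for it, which appears to be the point of the refactor shown in the diff.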
109
aider/main.py
|
@ -1,47 +1,85 @@
|
|||
import os
|
||||
import sys
|
||||
import argparse
|
||||
import git
|
||||
import configargparse
|
||||
from dotenv import load_dotenv
|
||||
from aider.coder import Coder
|
||||
from aider.io import InputOutput
|
||||
|
||||
|
||||
def get_git_root():
|
||||
try:
|
||||
repo = git.Repo(search_parent_directories=True)
|
||||
return repo.working_tree_dir
|
||||
except git.InvalidGitRepositoryError:
|
||||
return None
|
||||
|
||||
|
||||
def main(args=None, input=None, output=None):
|
||||
if args is None:
|
||||
args = sys.argv[1:]
|
||||
|
||||
load_dotenv()
|
||||
env_prefix = "AIDER_"
|
||||
parser = argparse.ArgumentParser(description="aider - chat with GPT about your code")
|
||||
|
||||
default_config_files = [
|
||||
os.path.expanduser("~/.aider.conf.yml"),
|
||||
]
|
||||
git_root = get_git_root()
|
||||
if git_root:
|
||||
default_config_files.insert(0, os.path.join(git_root, ".aider.conf.yml"))
|
||||
|
||||
parser = configargparse.ArgumentParser(
|
||||
description="aider - chat with GPT about your code",
|
||||
add_config_file_help=True,
|
||||
default_config_files=default_config_files,
|
||||
config_file_parser_class=configargparse.YAMLConfigFileParser,
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"-c",
|
||||
"--config",
|
||||
is_config_file=True,
|
||||
metavar="CONFIG_FILE",
|
||||
help=(
|
||||
"Specify the config file (default: search for .aider.conf.yml in git root or home"
|
||||
" directory)"
|
||||
),
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"files",
|
||||
metavar="FILE",
|
||||
nargs="*",
|
||||
help="a list of source code files (optional)",
|
||||
)
|
||||
default_input_history_file = (
|
||||
os.path.join(git_root, ".aider.input.history") if git_root else ".aider.input.history"
|
||||
)
|
||||
default_chat_history_file = (
|
||||
os.path.join(git_root, ".aider.chat.history.md") if git_root else ".aider.chat.history.md"
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--input-history-file",
|
||||
metavar="INPUT_HISTORY_FILE",
|
||||
default=os.environ.get(f"{env_prefix}INPUT_HISTORY_FILE", ".aider.input.history"),
|
||||
help=(
|
||||
"Specify the chat input history file (default: .aider.input.history,"
|
||||
f" ${env_prefix}INPUT_HISTORY_FILE)"
|
||||
),
|
||||
env_var=f"{env_prefix}INPUT_HISTORY_FILE",
|
||||
default=default_input_history_file,
|
||||
help=f"Specify the chat input history file (default: {default_input_history_file})",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--chat-history-file",
|
||||
metavar="CHAT_HISTORY_FILE",
|
||||
default=os.environ.get(f"{env_prefix}CHAT_HISTORY_FILE", ".aider.chat.history.md"),
|
||||
help=(
|
||||
"Specify the chat history file (default: .aider.chat.history.md,"
|
||||
f" ${env_prefix}CHAT_HISTORY_FILE)"
|
||||
),
|
||||
env_var=f"{env_prefix}CHAT_HISTORY_FILE",
|
||||
default=default_chat_history_file,
|
||||
help=f"Specify the chat history file (default: {default_chat_history_file})",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--model",
|
||||
metavar="MODEL",
|
||||
default=os.environ.get(f"{env_prefix}MODEL", "gpt-4"),
|
||||
help=f"Specify the model to use for the main chat (default: gpt-4, ${env_prefix}MODEL)",
|
||||
env_var=f"{env_prefix}MODEL",
|
||||
default="gpt-4",
|
||||
help="Specify the model to use for the main chat (default: gpt-4)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"-3",
|
||||
|
@ -50,24 +88,38 @@ def main(args=None, input=None, output=None):
|
|||
const="gpt-3.5-turbo",
|
||||
help="Use gpt-3.5-turbo model for the main chat (not advised)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--pretty",
|
||||
action="store_true",
|
||||
env_var=f"{env_prefix}PRETTY",
|
||||
default=True,
|
||||
help="Enable pretty, colorized output (default: True)",
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--no-pretty",
|
||||
action="store_false",
|
||||
dest="pretty",
|
||||
help=f"Disable pretty, colorized output (${env_prefix}PRETTY)",
|
||||
default=bool(int(os.environ.get(f"{env_prefix}PRETTY", 1))),
|
||||
help="Disable pretty, colorized output",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--apply",
|
||||
metavar="FILE",
|
||||
help="Apply the changes from the given file instead of running the chat (debug)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--auto-commits",
|
||||
action="store_true",
|
||||
env_var=f"{env_prefix}AUTO_COMMIT",
|
||||
default=True,
|
||||
help="Enable auto commit of changes (default: True)",
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--no-auto-commits",
|
||||
action="store_false",
|
||||
dest="auto_commits",
|
||||
help=f"Disable auto commit of changes (${env_prefix}AUTO_COMMITS)",
|
||||
default=bool(int(os.environ.get(f"{env_prefix}AUTO_COMMITS", 1))),
|
||||
dest="auto_commit",
|
||||
help="Disable auto commit of changes",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--dry-run",
|
||||
|
@ -78,17 +130,21 @@ def main(args=None, input=None, output=None):
|
|||
parser.add_argument(
|
||||
"--show-diffs",
|
||||
action="store_true",
|
||||
help=f"Show diffs when committing changes (default: False, ${env_prefix}SHOW_DIFFS)",
|
||||
default=bool(int(os.environ.get(f"{env_prefix}SHOW_DIFFS", 0))),
|
||||
env_var=f"{env_prefix}SHOW_DIFFS",
|
||||
help="Show diffs when committing changes (default: False)",
|
||||
default=False,
|
||||
)
|
||||
parser.add_argument(
|
||||
"--ctags",
|
||||
action="store_true",
|
||||
type=lambda x: (str(x).lower() == "true"),
|
||||
nargs="?",
|
||||
const=True,
|
||||
default=None,
|
||||
env_var=f"{env_prefix}CTAGS",
|
||||
help=(
|
||||
"Add ctags to the chat to help GPT understand the codebase (default: False,"
|
||||
f" ${env_prefix}CTAGS)"
|
||||
"Add ctags to the chat to help GPT understand the codebase (default: check for ctags"
|
||||
" executable)"
|
||||
),
|
||||
default=bool(int(os.environ.get(f"{env_prefix}CTAGS", 0))),
|
||||
)
|
||||
parser.add_argument(
|
||||
"--yes",
|
||||
|
@ -97,7 +153,8 @@ def main(args=None, input=None, output=None):
|
|||
default=False,
|
||||
)
|
||||
parser.add_argument(
|
||||
"-v", "--verbose",
|
||||
"-v",
|
||||
"--verbose",
|
||||
action="store_true",
|
||||
help="Enable verbose output",
|
||||
default=False,
|
||||
|
|
|
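The `--ctags` change above replaces a plain on/off switch with an option that can be unset, given with no value, or given an explicit value. The sketch below reproduces that three-state behaviour with the standard-library `argparse`, assuming the same `type`, `nargs`, `const`, and `default` settings that appear in the diff. It leaves out the `env_var` argument, which is specific to configargparse, and the `parse_bool` name is an illustrative choice.

```python
import argparse


def parse_bool(value):
    """Treat the string "true" (any case) as True and anything else as False."""
    return str(value).lower() == "true"


def build_parser():
    parser = argparse.ArgumentParser(description="tri-state --ctags sketch")
    parser.add_argument(
        "--ctags",
        type=parse_bool,
        nargs="?",     # the value after --ctags is optional
        const=True,    # used when --ctags is given with no value
        default=None,  # used when --ctags is absent: "decide later"
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    for argv in ([], ["--ctags"], ["--ctags", "false"]):
        print(argv, "->", parser.parse_args(argv).ctags)
```

A `None` result lets the caller tell "the user did not say" apart from an explicit `False`, which is what allows the repo map code to fall back to probing for a ctags executable.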
@ -8,14 +8,12 @@ Take requests for changes to the supplied code.
|
|||
If the request is ambiguous, ask questions.
|
||||
|
||||
Once you understand the request you MUST:
|
||||
1. List the files you need to modify.
|
||||
1. List the files you need to modify. If they are *read-only* ask the user to make them *read-write* using the file's full path name.
|
||||
2. Think step-by-step and explain the needed changes.
|
||||
3. Describe each change with an *edit block* per the example below.
|
||||
"""
|
||||
|
||||
system_reminder = """Base any edits off the files shown in the user's last msg.
|
||||
|
||||
You MUST format EVERY code change with an *edit block* like this:
|
||||
system_reminder = """You MUST format EVERY code change with an *edit block* like this:
|
||||
|
||||
```python
|
||||
some/dir/example.py
|
||||
|
@ -29,11 +27,11 @@ some/dir/example.py
|
|||
def add(a,b):
|
||||
>>>>>>> UPDATED
|
||||
|
||||
Every *edit block* must be fenced w/triple backticks with the correct code language.
|
||||
Every *edit block* must start with the full path! *NEVER* propose edit blocks for *read-only* files.
|
||||
The ORIGINAL section must be an *exact* set of lines from the file:
|
||||
- NEVER SKIP LINES!
|
||||
- Include all original leading spaces and indentation!
|
||||
Every *edit block* must be fenced w/triple backticks with the correct code language.
|
||||
Every *edit block* must start with the full path!
|
||||
|
||||
Edits to different parts of a file each need their own *edit block*.
|
||||
|
||||
|
@ -54,11 +52,13 @@ files_content_gpt_no_edits = "I didn't see any properly formatted edits in your
|
|||
|
||||
files_content_local_edits = "I edited the files myself."
|
||||
|
||||
files_content_prefix = "Propose changes to *only* these files (ask before editing others):\n"
|
||||
files_content_prefix = "These are the *read-write* files:\n"
|
||||
|
||||
files_no_full_files = "I am not sharing any *read-write* files yet."
|
||||
|
||||
repo_content_prefix = (
|
||||
"Here is a map of all the {other}files{ctags_msg}. You *must* ask with the"
|
||||
" full path before editing these:\n\n"
|
||||
"All the files below here are *read-only* files. Notice that files in directories are indented."
|
||||
" Use their parent dirs to build their full path.\n"
|
||||
)
|
||||
|
||||
|
||||
|
|
|
@ -3,6 +3,7 @@ import json
|
|||
import sys
|
||||
import subprocess
|
||||
import tiktoken
|
||||
import tempfile
|
||||
from collections import defaultdict
|
||||
|
||||
from aider import prompts, utils
|
||||
|
@ -48,14 +49,20 @@ def fname_to_components(fname, with_colon):
|
|||
|
||||
|
||||
class RepoMap:
|
||||
def __init__(self, use_ctags=True, root=None, main_model="gpt-4"):
|
||||
ctags_cmd = ["ctags", "--fields=+S", "--extras=-F", "--output-format=json"]
|
||||
|
||||
def __init__(self, use_ctags=None, root=None, main_model="gpt-4"):
|
||||
if not root:
|
||||
root = os.getcwd()
|
||||
|
||||
self.use_ctags = use_ctags
|
||||
self.tokenizer = tiktoken.encoding_for_model(main_model)
|
||||
self.root = root
|
||||
|
||||
if use_ctags is None:
|
||||
self.use_ctags = self.check_for_ctags()
|
||||
else:
|
||||
self.use_ctags = use_ctags
|
||||
|
||||
self.tokenizer = tiktoken.encoding_for_model(main_model)
|
||||
|
||||
def get_repo_map(self, chat_files, other_files):
|
||||
res = self.choose_files_listing(other_files)
|
||||
if not res:
|
||||
|
@ -123,7 +130,7 @@ class RepoMap:
|
|||
|
||||
def split_path(self, path):
|
||||
path = os.path.relpath(path, self.root)
|
||||
return fname_to_components(path, True)
|
||||
return [path + ":"]
|
||||
|
||||
def run_ctags(self, filename):
|
||||
# Check if the file is in the cache and if the modification time has not changed
|
||||
|
@ -132,7 +139,7 @@ class RepoMap:
|
|||
if cache_key in TAGS_CACHE and TAGS_CACHE[cache_key]["mtime"] == file_mtime:
|
||||
return TAGS_CACHE[cache_key]["data"]
|
||||
|
||||
cmd = ["ctags", "--fields=+S", "--extras=-F", "--output-format=json", filename]
|
||||
cmd = self.ctags_cmd + [filename]
|
||||
output = subprocess.check_output(cmd).decode("utf-8")
|
||||
output = output.splitlines()
|
||||
|
||||
|
@ -169,6 +176,17 @@ class RepoMap:
|
|||
|
||||
return tags
|
||||
|
||||
def check_for_ctags(self):
|
||||
try:
|
||||
with tempfile.TemporaryDirectory() as tempdir:
|
||||
hello_py = os.path.join(tempdir, "hello.py")
|
||||
with open(hello_py, "w") as f:
|
||||
f.write("def hello():\n print('Hello, world!')\n")
|
||||
self.get_tags(hello_py)
|
||||
except Exception:
|
||||
return False
|
||||
return True
|
||||
|
||||
|
||||
def find_py_files(directory):
|
||||
if not os.path.isdir(directory):
|
||||
|
@ -197,6 +215,7 @@ def call_map():
|
|||
"""
|
||||
|
||||
rm = RepoMap()
|
||||
|
||||
# res = rm.get_tags_map(fnames)
|
||||
# print(res)
|
||||
|
||||
|
@ -222,7 +241,7 @@ def call_map():
|
|||
# dump("ref", fname, ident)
|
||||
references[ident].append(show_fname)
|
||||
|
||||
for ident,fname in defines.items():
|
||||
for ident, fname in defines.items():
|
||||
dump(fname, ident)
|
||||
|
||||
idents = set(defines.keys()).intersection(set(references.keys()))
|
||||
|
@ -256,7 +275,9 @@ def call_map():
|
|||
ranked = nx.pagerank(G, weight="weight")
|
||||
|
||||
# drop low weight edges for plotting
|
||||
edges_to_remove = [(node1, node2) for node1, node2, data in G.edges(data=True) if data['weight'] < 1]
|
||||
edges_to_remove = [
|
||||
(node1, node2) for node1, node2, data in G.edges(data=True) if data["weight"] < 1
|
||||
]
|
||||
G.remove_edges_from(edges_to_remove)
|
||||
# Remove isolated nodes (nodes with no edges)
|
||||
dump(G.nodes())
|
||||
|
@ -272,8 +293,8 @@ def call_map():
|
|||
dot.node(fname, penwidth=str(pen))
|
||||
|
||||
max_w = max(edges.values())
|
||||
for refs,defs,data in G.edges(data=True):
|
||||
weight = data['weight']
|
||||
for refs, defs, data in G.edges(data=True):
|
||||
weight = data["weight"]
|
||||
|
||||
r = random.randint(0, 255)
|
||||
g = random.randint(0, 255)
|
||||
|
@ -286,7 +307,7 @@ def call_map():
|
|||
print()
|
||||
print(name)
|
||||
for ident in sorted(labels[name]):
|
||||
print('\t', ident)
|
||||
print("\t", ident)
|
||||
# print(f"{refs} -{weight}-> {defs}")
|
||||
|
||||
top_rank = sorted([(rank, node) for (node, rank) in ranked.items()], reverse=True)
|
||||
|
@ -296,5 +317,6 @@ def call_map():
|
|||
|
||||
dot.render("tmp", format="pdf", view=True)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
call_map()
|
||||
|
|
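The `check_for_ctags` method added above probes for a working ctags by running it on a throwaway file and treating any failure as "not available". A standalone sketch of the same idea follows. It calls `subprocess` directly instead of `self.get_tags`, and the function name and `cmd` parameter are assumptions made for illustration.

```python
import os
import subprocess
import tempfile

# Same flags as the ctags_cmd class attribute shown in the diff.
CTAGS_CMD = ["ctags", "--fields=+S", "--extras=-F", "--output-format=json"]


def ctags_available(cmd=CTAGS_CMD):
    """Return True if running `cmd` on a tiny Python file succeeds, else False."""
    try:
        with tempfile.TemporaryDirectory() as tempdir:
            hello_py = os.path.join(tempdir, "hello.py")
            with open(hello_py, "w") as f:
                f.write("def hello():\n    print('Hello, world!')\n")
            # A missing binary raises FileNotFoundError; unsupported flags
            # raise CalledProcessError. Both count as "no usable ctags".
            subprocess.check_output(cmd + [hello_py], stderr=subprocess.DEVNULL)
    except Exception:
        return False
    return True


if __name__ == "__main__":
    print("ctags usable:", ctags_available())
```

Running the real command, rather than only checking that a `ctags` binary exists, also rejects older ctags builds that do not understand `--output-format=json`.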
BIN
assets/robot-flowchart.png
Normal file
Binary file not shown. Size: 700 KiB
|
@ -1,5 +1,7 @@
|
|||
|
||||
# Improving GPT-4's codebase understanding with ctags
|
||||
# Improving GPT-4's codebase understanding with a map
|
||||
|
||||

|
||||
|
||||
GPT-4 is extremely useful for "self-contained" coding tasks,
|
||||
like generating brand new code or modifying a pure function
|
||||
|
@ -8,7 +10,7 @@ that has no dependencies.
|
|||
But it's difficult to use GPT-4 to modify or extend
|
||||
a large, complex pre-existing codebase.
|
||||
To modify such code, GPT needs to understand the dependencies and APIs
|
||||
which interconnect all of its subsystems.
|
||||
which interconnect its subsystems.
|
||||
Somehow we need to provide this "code context" to GPT
|
||||
when we ask it to accomplish a coding task. Specifically, we need to:
|
||||
|
||||
|
@ -22,8 +24,11 @@ To address these issues, `aider` now
|
|||
sends GPT a **concise map of your whole git repository**
|
||||
that includes
|
||||
all declared variables and functions with call signatures.
|
||||
This *repo map* is built using `ctags`
|
||||
and enables GPT to better comprehend, navigate
|
||||
This *repo map* is built automatically using `ctags`, which
|
||||
extracts symbol definitions from source files. Historically,
|
||||
ctags were generated and indexed by IDEs and editors to
|
||||
help humans search and navigate large codebases.
|
||||
Instead, we're going to use ctags to help GPT better comprehend, navigate
|
||||
and edit code in larger repos.
|
||||
|
||||
To get a sense of how effective this can be, this
|
||||
|
@ -35,6 +40,12 @@ Using only the meta-data in the repo map, GPT is able to figure out how to
|
|||
call the method to be tested, as well as how to instantiate multiple
|
||||
class objects that are required to prepare for the test.
|
||||
|
||||
To code with GPT-4 using the techniques discussed here:
|
||||
|
||||
|
||||
- Install [aider](https://github.com/paul-gauthier/aider#installation).
|
||||
- Install [universal ctags](https://github.com/universal-ctags/ctags).
|
||||
- Run `aider --ctags` inside your repo.
|
||||
|
||||
## The problem: code context
|
||||
|
||||
|
@ -63,7 +74,7 @@ For the example above, you could send the file that
|
|||
contains the Foo class
|
||||
and the file that contains the BarLog logging subsystem.
|
||||
This works pretty well, and is supported by `aider` -- you
|
||||
can manually specify which files to "add to the chat".
|
||||
can manually specify which files to "add to the chat" you are having with GPT.
|
||||
|
||||
But it's not ideal to have to manually identify the right
|
||||
set of files to add to the chat.
|
||||
|
@ -139,7 +150,14 @@ map. Universal ctags can scan source code written in many
|
|||
languages, and extract data about all the symbols defined in each
|
||||
file.
|
||||
|
||||
For example, here is the `ctags --fields=+S --output-format=json` output for the `main.py` file mapped above:
|
||||
Historically, ctags were generated and indexed by IDEs or code editors
|
||||
to make it easier for a human to search and navigate a
|
||||
codebase, find the implementation of functions, etc.
|
||||
Instead, we're going to use ctags to help GPT navigate and understand the codebase.
|
||||
|
||||
Here is the type of output you get when you run ctags on source code. Specifically,
|
||||
this is the
|
||||
`ctags --fields=+S --output-format=json` output for the `main.py` file mapped above:
|
||||
|
||||
```json
|
||||
{
|
||||
|
@ -160,8 +178,8 @@ For example, here is the `ctags --fields=+S --output-format=json` output for the
|
|||
```
|
||||
|
||||
The repo map is built using this type of `ctags` data,
|
||||
formatted into the space
|
||||
efficient hierarchical tree format shown above.
|
||||
but formatted into the space
|
||||
efficient hierarchical tree format shown earlier.
|
||||
This is a format that GPT can easily understand
|
||||
and which conveys the map data using a
|
||||
minimal number of tokens.
|
||||
|
@ -198,7 +216,7 @@ Some possible approaches to reducing the amount of map data are:
|
|||
One key goal is to prefer solutions which are language agnostic or
|
||||
which can be easily deployed against most popular code languages.
|
||||
The `ctags` solution has this benefit, since it comes pre-built
|
||||
with tooling for most popular languages.
|
||||
with support for most popular languages.
|
||||
I suspect that Language Server Protocol might be an even
|
||||
better tool than `ctags` for this problem.
|
||||
But it is more cumbersome to deploy for a broad
|
||||
|
@ -212,5 +230,5 @@ To use this experimental repo map feature:
|
|||
|
||||
- Install [aider](https://github.com/paul-gauthier/aider#installation).
|
||||
- Install [universal ctags](https://github.com/universal-ctags/ctags).
|
||||
- Run `aider` with the `--ctags` option inside your repo.
|
||||
- Run `aider --ctags` inside your repo.
|
||||
|
|
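The document above describes turning `ctags --fields=+S --output-format=json` output into a compact, indented map. Here is a minimal sketch of that transformation. The sample records are hand-written for illustration (they are not captured ctags output, and the `aider/repomap.py` path is assumed), and `build_map` is a hypothetical helper, not aider's `RepoMap` code. Universal ctags emits one JSON object per line, with fields such as `name`, `path`, `kind`, `scope`, and `signature`.

```python
import json
from collections import defaultdict

# Hand-written sample records in the shape of ctags JSON output (illustrative only).
SAMPLE = [
    '{"_type": "tag", "name": "main", "path": "aider/main.py", "kind": "function",'
    ' "signature": "(args=None, input=None, output=None)"}',
    '{"_type": "tag", "name": "get_git_root", "path": "aider/main.py", "kind": "function",'
    ' "signature": "()"}',
    '{"_type": "tag", "name": "check_for_ctags", "path": "aider/repomap.py", "kind": "member",'
    ' "scope": "RepoMap", "scopeKind": "class", "signature": "(self)"}',
]


def build_map(lines):
    """Group tags by file and render them as an indented text listing."""
    by_file = defaultdict(list)
    for line in lines:
        tag = json.loads(line)
        if tag.get("_type") != "tag":
            continue  # skip any non-tag records
        parts = [tag.get("scope"), tag["kind"], tag["name"] + tag.get("signature", "")]
        by_file[tag["path"]].append(" ".join(p for p in parts if p))
    out = []
    for path in sorted(by_file):
        out.append(path + ":")
        out.extend("  " + entry for entry in sorted(by_file[path]))
    return "\n".join(out)


if __name__ == "__main__":
    print(build_map(SAMPLE))
```

Each file path is printed once and its symbols are indented beneath it, so the listing spends fewer tokens than the raw JSON while keeping the names and call signatures.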
@ -24,3 +24,5 @@ wcwidth==0.2.6
yarl==1.9.2
pytest==7.3.1
tiktoken==0.4.0
configargparse
PyYAML