Commit graph

1137 commits

Author SHA1 Message Date
Paul Gauthier
081781dcf1 copy 2023-05-21 16:00:58 -07:00
Paul Gauthier
64acbaecac copy 2023-05-21 15:58:57 -07:00
Paul Gauthier
f4f1c0cb1c added test example 2023-05-21 15:57:44 -07:00
Paul Gauthier
6df03a8dfb aider: Updated title of docs/ctags.md to "Improving GPT-4's codebase understanding with ctags".
# Context:
USER: can you suggest 10 different options for the title?
ASSISTANT: Here are 10 different title options for the `docs/ctags.md` file:

1. "Leveraging ctags for Enhanced Code Navigation in GPT-4"
2. "Improving GPT-4's Codebase Understanding with ctags"
3. "ctags: A Powerful Tool for GPT-4's Repo Navigation"
4. "Boosting GPT-4's Codebase Comprehension using ctags"
5. "Navigating Large Repositories with GPT-4 and ctags"
6. "Enhancing GPT-4's Repo Exploration Capabilities with ctags"
7. "ctags: Empowering GPT-4 to Handle Bigger Codebases"
8. "Supercharging GPT-4's Codebase Navigation with ctags"
9. "ctags: A Key to Unlocking GPT-4's Potential in Large Repos"
10. "Optimizing GPT-4's Repo Understanding through ctags Integration"

Please let me know which title you would like to use, and I will provide the necessary edit block to update the title in the `docs/ctags.md` file.
USER: 2, but normal capitalization
ASSISTANT: Here's the edit block to update the title in the `docs/ctags.md` file with the chosen option "Improving GPT-4's codebase understanding with ctags" using normal capitalization:

```markdown
docs/ctags.md
<<<<<<< ORIGINAL
# Enhancing GPT-4's understanding of repos using ctags
=======
# Improving GPT-4's codebase understanding with ctags
>>>>>>> UPDATED
```

This will replace the original title with the new one.
2023-05-21 15:52:20 -07:00
paul-gauthier
cf5fa778c8
Update ctags.md 2023-05-21 15:04:22 -07:00
paul-gauthier
9ccae2e5ef
Update ctags.md 2023-05-21 14:58:01 -07:00
paul-gauthier
d1ccd1328c
Update ctags.md 2023-05-21 14:55:32 -07:00
Paul Gauthier
cc3de6730a copy 2023-05-21 14:45:59 -07:00
Paul Gauthier
c99f242555 copy 2023-05-21 14:43:25 -07:00
Paul Gauthier
a81c94b197 copy 2023-05-21 14:42:23 -07:00
Paul Gauthier
eab65ee49d copy 2023-05-21 14:40:56 -07:00
Paul Gauthier
ff4dbb8973 copy 2023-05-21 14:38:23 -07:00
Paul Gauthier
386a262434 copy 2023-05-21 14:35:27 -07:00
Paul Gauthier
a78d8a7121 copy 2023-05-21 14:30:05 -07:00
Paul Gauthier
b081dd9e19 copy 2023-05-21 14:29:47 -07:00
Paul Gauthier
dde6577cdf aider: Enhanced GPT-4's understanding of entire repos using ctags.
# Context:
USER: improve the first para
ASSISTANT: ```markdown
docs/ctags.md
<<<<<<< ORIGINAL
# Using ctags to help GPT-4 understand an entire repo

GPT-4 is great at coding, but it's hard to work with it to make
modifications inside a large code base.
GPT can't really understand and navigate more code than can fit
in its context window.
And many
common types of code changes will need knowledge and context from
multiple files strewn about the repo.
A new experimental feature of `aider` uses `ctags` to give GPT a
**repo map**, so that it can better understand and navigate larger repos.
=======
# Enhancing GPT-4's understanding of entire repos using ctags

While GPT-4 excels at coding tasks, it struggles with making modifications within large code bases. Its limited context window prevents it from understanding and navigating code beyond its capacity. Many common code changes require knowledge and context from multiple files scattered throughout the repo. To address this issue, `aider` introduces an experimental feature that utilizes `ctags` to provide GPT with a **repo map**, enabling it to better comprehend and navigate larger repositories.
>>>>>>> UPDATED
```
2023-05-21 14:28:26 -07:00
Paul Gauthier
f35bc2f157 copy 2023-05-21 14:27:37 -07:00
Paul Gauthier
adcfbf5d85 copy 2023-05-21 14:26:46 -07:00
Paul Gauthier
8e6774babd copy 2023-05-21 14:24:26 -07:00
Paul Gauthier
7cef19e455 copy 2023-05-21 14:23:54 -07:00
Paul Gauthier
aac77d9659 aider: Improve para starting 'Of course for large repos' in ctags.md.
# Context:
USER: improve the para starting "Of course for large repos"
ASSISTANT: ```markdown
docs/ctags.md
<<<<<<< ORIGINAL
Of course, for large repos even their map might be too large for
the context window.
But this mapping approach enables
collaboration with GPT-4 on larger code bases
than was previously possible.
And it reduces the need to manually curate which files to add
to the chat, giving GPT itself more ability to identify which files
are relevant to the task at hand.
=======
Of course, for large repositories, even their map might be too large for the context window. However, this mapping approach significantly improves collaboration with GPT-4 on larger codebases compared to previous methods. It also minimizes the need for manual curation of files to add to the chat, empowering GPT to autonomously identify relevant files for the task at hand.
>>>>>>> UPDATED
```
2023-05-21 14:22:51 -07:00
Paul Gauthier
29bb9398df copy 2023-05-21 14:21:41 -07:00
Paul Gauthier
35d99ee08b copy 2023-05-21 14:19:13 -07:00
Paul Gauthier
e64353ada1 copy 2023-05-21 14:18:57 -07:00
Paul Gauthier
fe1d0b0e09 copy 2023-05-21 14:17:27 -07:00
Paul Gauthier
3f1eaf566c copy 2023-05-21 14:16:13 -07:00
Paul Gauthier
e047976ce4 copy 2023-05-21 14:15:44 -07:00
Paul Gauthier
8b28539b4e copy 2023-05-21 14:14:30 -07:00
Paul Gauthier
897e19372e copy 2023-05-21 14:12:42 -07:00
Paul Gauthier
7c8bc9453c copy 2023-05-21 14:10:32 -07:00
Paul Gauthier
cfab008989 copy 2023-05-21 12:55:54 -07:00
Paul Gauthier
ab4b379db3 copy 2023-05-21 12:51:05 -07:00
Paul Gauthier
5db6346a39 added ctags doc 2023-05-21 12:50:31 -07:00
Paul Gauthier
4f2e1d14ef sort the file list in the prompt 2023-05-20 07:28:23 -07:00
Paul Gauthier
ffe3da4df1 do not sent file lists longer than 100 files 2023-05-19 16:17:40 -07:00
paul-gauthier
7d08c346bb
Merge pull request #4 from mobyvb/prompt-compression
aider/prompts.py: reduce token size of prompts
2023-05-19 16:10:26 -07:00
Paul Gauthier
c441906a22 aider: Made -v a short option for --verbose.
# Context:
USER: make -v short for --verbose
ASSISTANT: To make `-v` a short option for `--verbose`, you can modify the `aider/main.py` file. Here's the change:

```python
aider/main.py
<<<<<<< ORIGINAL
    parser.add_argument(
        "--verbose",
        action="store_true",
        help="Enable verbose output",
        default=False,
    )
=======
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="Enable verbose output",
        default=False,
    )
>>>>>>> UPDATED
```
2023-05-19 10:57:59 -07:00
Paul Gauthier
7e6ea0660d Refactor show_messages function to accept an optional title parameter and remove unused code. 2023-05-19 10:57:24 -07:00
Paul Gauthier
10e38e1fe9 aider: Added --verbose option to enable verbose output.
# Context:
USER: Add --verbose with no env default. pass into Coder and set self.verbose
ASSISTANT: To add the `--verbose` option, we need to modify the `aider/main.py` file. Here's the step-by-step process:

1. Add the `--verbose` argument to the `argparse.ArgumentParser` in the `main` function.
2. Pass the `args.verbose` value to the `Coder` class when initializing it.
3. Update the `Coder` class in the `aider/coder.py` file to accept the `verbose` parameter and set `self.verbose`.

Here are the changes to the code:

```python
aider/main.py
<<<<<<< ORIGINAL
    parser.add_argument(
        "--yes",
        action="store_true",
        help="Always say yes to every confirmation",
        default=False,
    )
=======
    parser.add_argument(
        "--yes",
        action="store_true",
        help="Always say yes to every confirmation",
        default=False,
    )
    parser.add_argument(
        "--verbose",
        action="store_true",
        help="Enable verbose output",
        default=False,
    )
>>>>>>> UPDATED
```

```python
aider/main.py
<<<<<<< ORIGINAL
    coder = Coder(
        io,
        main_model=args.model,
        fnames=args.files,
        pretty=args.pretty,
        show_diffs=args.show_diffs,
        auto_commits=args.auto_commits,
        dry_run=args.dry_run,
        use_ctags=args.ctags,
    )
=======
    coder = Coder(
        io,
        main_model=args.model,
        fnames=args.files,
        pretty=args.pretty,
        show_diffs=args.show_diffs,
        auto_commits=args.auto_commits,
        dry_run=args.dry_run,
        use_ctags=args.ctags,
        verbose=args.verbose,
    )
>>>>>>> UPDATED
```

```python
aider/coder.py
<<<<<<< ORIGINAL
    def __init__(
        self,
        io,
        main_model="gpt-4",
        fnames=None,
        pretty=True,
        show_diffs=False,
        auto_commits=True,
        dry_run=False,
        use_ctags=False,
    ):
=======
    def __init__(
        self,
        io,
        main_model="gpt-4",
        fnames=None,
        pretty=True,
        show_diffs=False,
        auto_commits=True,
        dry_run=False,
        use_ctags=False,
        verbose=False,
    ):
        self.verbose = verbose
>>>>>>> UPDATED
```
2023-05-19 10:55:48 -07:00
Paul Gauthier
7d2c59432e copy 2023-05-19 10:52:23 -07:00
Paul Gauthier
a7d9a4790f aider: Added --ctags option to the usage section in README.md.
# Context:
USER: update the usage to include --ctags
ASSISTANT: ```markdown
README.md
<<<<<<< ORIGINAL
You can also use additional command-line options to customize the behavior of the tool. The following options are available, along with their corresponding environment variable overrides:

- `--input-history-file INPUT_HISTORY_FILE`: Specify the chat input history file (default: .aider.input.history). Override the default with the environment variable `AIDER_INPUT_HISTORY_FILE`.
- `--chat-history-file CHAT_HISTORY_FILE`: Specify the chat history file (default: .aider.chat.history.md). Override the default with the environment variable `AIDER_CHAT_HISTORY_FILE`.
- `--model MODEL`: Specify the model to use for the main chat (default: gpt-4). Override the default with the environment variable `AIDER_MODEL`.
- `-3`: Use gpt-3.5-turbo model for the main chat (not advised). No environment variable override.
=======
You can also use additional command-line options to customize the behavior of the tool. The following options are available, along with their corresponding environment variable overrides:

- `--input-history-file INPUT_HISTORY_FILE`: Specify the chat input history file (default: .aider.input.history). Override the default with the environment variable `AIDER_INPUT_HISTORY_FILE`.
- `--chat-history-file CHAT_HISTORY_FILE`: Specify the chat history file (default: .aider.chat.history.md). Override the default with the environment variable `AIDER_CHAT_HISTORY_FILE`.
- `--model MODEL`: Specify the model to use for the main chat (default: gpt-4). Override the default with the environment variable `AIDER_MODEL`.
- `-3`: Use gpt-3.5-turbo model for the main chat (not advised). No environment variable override.
- `--ctags`: Add ctags to the chat to help GPT understand the codebase (default: False, `AIDER_CTAGS`). Override the default with the environment variable `AIDER_CTAGS`.
>>>>>>> UPDATED
```
I've updated the usage section in the README.md to include the `--ctags` option.
2023-05-19 10:50:53 -07:00
Paul Gauthier
b82bb65b86 help msg for --ctags 2023-05-19 10:49:11 -07:00
Paul Gauthier
32e40a3e1c aider: Prevent gpt-3.5-turbo from generating commit messages for diffs larger than 4k tokens and catch InvalidRequestError.
# Context:
USER: `Coder.get_commit_message()` uses gpt-3.5-turbo to summarize the diffs into a sensible commit message. If the diff is large, it exceeds the 4K context window.

A simple fix would be:

1. Check if the diff is close to or larger than 4 kbytes (4k tokens * 4 bytes/token) and don't even bother sending the messages to gpt. Just return.
2. Catch the `openai.error.InvalidRequestError` and just return.

Maybe print a nice tool_error() message for each of these.

ASSISTANT: To implement the suggested changes, we will modify the `get_commit_message()` method in the `aider/coder.py` file. Here are the step-by-step changes:

1. Check if the length of `diffs` is close to or larger than 4 kbytes (4k tokens * 4 bytes/token). If it is, print an error message and return.
2. Catch the `openai.error.InvalidRequestError` exception when calling `self.send()` and print an error message, then return.

Here are the code changes:

```python
aider/coder.py
<<<<<<< ORIGINAL
    def get_commit_message(self, diffs, context):
        diffs = "# Diffs:\n" + diffs

        messages = [
            dict(role="system", content=prompts.commit_system),
            dict(role="user", content=context + diffs),
        ]

        commit_message, interrupted = self.send(
            messages,
            model="gpt-3.5-turbo",
            silent=True,
        )
=======
    def get_commit_message(self, diffs, context):
        if len(diffs) >= 4 * 1024 * 4:
            self.io.tool_error("Diff is too large for gpt-3.5-turbo to generate a commit message.")
            return

        diffs = "# Diffs:\n" + diffs

        messages = [
            dict(role="system", content=prompts.commit_system),
            dict(role="user", content=context + diffs),
        ]

        try:
            commit_message, interrupted = self.send(
                messages,
                model="gpt-3.5-turbo",
                silent=True,
            )
        except openai.error.InvalidRequestError:
            self.io.tool_error("Failed to generate commit message using gpt-3.5-turbo due to an invalid request.")
            return
>>>>>>> UPDATED
```
2023-05-19 10:41:43 -07:00
Paul Gauthier
af85c81c13 fixed missing case of just the filename 2023-05-19 10:32:55 -07:00
Paul Gauthier
046e6ed9d4 aider: Added in-memory cache to get_tags function using a tuple (filename, root_dname) as the cache key.
# Context:
USER: add an in memory cache to get_tags that only updates if the st_mtime of the filename changes
ASSISTANT: To implement an in-memory cache for the `get_tags` function, we will need to modify the `aider/ctags.py` file. We will use a dictionary to store the cache and check the file's modification time to decide whether to update the cache or not.

Here are the steps to implement the cache:

1. Create a global variable `TAGS_CACHE` to store the cache.
2. In the `get_tags` function, check if the filename is in the cache and if the modification time has not changed.
3. If the conditions in step 2 are met, return the cached value.
4. If the conditions in step 2 are not met, update the cache with the new tags and return them.

Here are the changes to the `aider/ctags.py` file:

```python
aider/ctags.py
<<<<<<< ORIGINAL
import os
import json
import sys
import subprocess

# from aider.dump import dump
=======
import os
import json
import sys
import subprocess

# Global cache for tags
TAGS_CACHE = {}

# from aider.dump import dump
>>>>>>> UPDATED
```

```python
aider/ctags.py
<<<<<<< ORIGINAL
def get_tags(filename, root_dname):
    cmd = ["ctags", "--fields=+S", "--output-format=json", filename]
    output = subprocess.check_output(cmd).decode("utf-8")
    output = output.splitlines()
=======
def get_tags(filename, root_dname):
    # Check if the file is in the cache and if the modification time has not changed
    file_mtime = os.path.getmtime(filename)
    if filename in TAGS_CACHE and TAGS_CACHE[filename]["mtime"] == file_mtime:
        return TAGS_CACHE[filename]["tags"]

    cmd = ["ctags", "--fields=+S", "--output-format=json", filename]
    output = subprocess.check_output(cmd).decode("utf-8")
    output = output.splitlines()
>>>>>>> UPDATED
```

```python
aider/ctags.py
<<<<<<< ORIGINAL
    for line in output:
        tag = json.loads(line)
        path = tag.get("path")
        scope = tag.get("scope")
        kind = tag.get("kind")
        name = tag.get("name")
        signature = tag.get("signature")

        last = name
        if signature:
            last += " " + signature

        res = split_path(path, root_dname)
        if scope:
            res.append(scope)
        res += [kind, last]

        yield res
=======
    tags = []
    for line in output:
        tag = json.loads(line)
        path = tag.get("path")
        scope = tag.get("scope")
        kind = tag.get("kind")
        name = tag.get("name")
        signature = tag.get("signature")

        last = name
        if signature:
            last += " " + signature

        res = split_path(path, root_dname)
        if scope:
            res.append(scope)
        res += [kind, last]
        tags.append(res)

    #
^C KeyboardInterrupt
USER: sorry, please do that again. but set key=(filename, root_dname) and use that as the cache key
ASSISTANT: No problem. I will modify the cache implementation to use a tuple (filename, root_dname) as the cache key. Here are the updated changes to the `aider/ctags.py` file:

```python
aider/ctags.py
<<<<<<< ORIGINAL
import os
import json
import sys
import subprocess

# from aider.dump import dump
=======
import os
import json
import sys
import subprocess

# Global cache for tags
TAGS_CACHE = {}

# from aider.dump import dump
>>>>>>> UPDATED
```

```python
aider/ctags.py
<<<<<<< ORIGINAL
def get_tags(filename, root_dname):
    cmd = ["ctags", "--fields=+S", "--output-format=json", filename]
    output = subprocess.check_output(cmd).decode("utf-8")
    output = output.splitlines()
=======
def get_tags(filename, root_dname):
    # Check if the file is in the cache and if the modification time has not changed
    file_mtime = os.path.getmtime(filename)
    cache_key = (filename, root_dname)
    if cache_key in TAGS_CACHE and TAGS_CACHE[cache_key]["mtime"] == file_mtime:
        return TAGS_CACHE[cache_key]["tags"]

    cmd = ["ctags", "--fields=+S", "--output-format=json", filename]
    output = subprocess.check_output(cmd).decode("utf-8")
    output = output.splitlines()
>>>>>>> UPDATED
```

```python
aider/ctags.py
<<<<<<< ORIGINAL
    for line in output:
        tag = json.loads(line)
        path = tag.get("path")
        scope = tag.get("scope")
        kind = tag.get("kind")
        name = tag.get("name")
        signature = tag.get("signature")

        last = name
        if signature:
            last += " " + signature

        res = split_path(path, root_dname)
        if scope:
            res.append(scope)
        res += [kind, last]

        yield res
=======
    tags = []
    for line in output:
        tag = json.loads(line)
        path = tag.get("path")
        scope = tag.get("scope")
        kind = tag.get("kind")
        name = tag.get("name")
        signature = tag.get("signature")

        last = name
        if signature:
            last += " " + signature

        res = split_path(path, root_dname)
        if scope:
            res.append(scope)
        res += [kind, last]
        tags.append(res)

    # Update the cache
    TAGS_CACHE[cache_key] = {"mtime": file_mtime, "tags": tags}

    return tags
>>>>>>> UPDATED
```
2023-05-19 10:32:02 -07:00
Paul Gauthier
9a243d7adc sssh 2023-05-19 10:19:23 -07:00
Paul Gauthier
1391e9de9b aider: Updated cmd_run method in aider/commands.py to use the io version of Confirm.ask().
# Context:
USER: in commands, switch the Confirm.ask() calls to use the io versions
ASSISTANT: I will update the `cmd_add` method in the `aider/commands.py` file to use the `io` versions of `Confirm.ask()`.

^C KeyboardInterrupt
USER: switch the Confirm.ask() calls to use the io versions
ASSISTANT: I will update the `cmd_add` method in the `aider/commands.py` file to use the `io` versions of `Confirm.ask()`.

```python
aider/commands.py
<<<<<<< ORIGINAL
                if self.coder.repo is not None:
                    create_file = self.io.confirm_ask(
                        (
                            f"No files matched
^C KeyboardInterrupt
USER: no, they already do. cmd_run
ASSISTANT: I apologize for the confusion. I will update the `cmd_run` method in the `aider/commands.py` file to use the `io` version of `Confirm.ask()`.

```python
aider/commands.py
<<<<<<< ORIGINAL
        ok = Confirm.ask("Add the output to the chat?", default="y")
=======
        ok = self.io.confirm_ask("Add the output to the chat?", default="y")
>>>>>>> UPDATED
```
2023-05-19 10:18:49 -07:00
Paul Gauthier
71f6d09c78 cleanup 2023-05-19 10:16:49 -07:00
Paul Gauthier
fd66aac159 cleanup 2023-05-19 10:12:56 -07:00
Paul Gauthier
f6aaa19299 implement --ctags 2023-05-19 10:07:52 -07:00