Commit graph

7087 commits

Author SHA1 Message Date
Paul Gauthier
d996fbe1c3 aider: Updated the Coder class in aider/coder.py to accept the map_tokens parameter and pass it to the RepoMap class instead of use_ctags.
# Context:
USER: I added these *read-write* files: aider/coder.py
ASSISTANT: Great! Now I will modify the `aider/coder.py` file to accept and use the `map_tokens` parameter instead of `use_ctags`.

```python
aider/coder.py
<<<<<<< ORIGINAL
    def __init__(
        self,
        io,
        main_model="gpt-4",
        fnames=None,
        pretty=True,
        show_diffs=False,
        auto_commits=True,
        dirty_commits=True,
        dry_run=False,
        use_ctags=False,
        verbose=False,
        openai_api_key=None,
    ):
        ...
        self.repo_map = RepoMap(use_ctags, self.root, self.main_model, rm_io)
=======
    def __init__(
        self,
        io,
        main_model="gpt-4",
        fnames=None,
        pretty=True,
        show_diffs=False,
        auto_commits=True,
        dirty_commits=True,
        dry_run=False,
        map_tokens=1024,
        verbose=False,
        openai_api_key=None,
    ):
        ...
        self.repo_map = RepoMap(map_tokens, self.root, self.main_model, rm_io)
>>>>>>> UPDATED
```

This change updates the `Coder` class in `aider/coder.py` to accept the `map_tokens` parameter and pass it to the `RepoMap` class instead of `use_ctags`.
2023-06-03 17:47:45 -07:00
Paul Gauthier
15f34e2864 aider: Changed use_ctags to map_tokens parameter in main.py to pass to Coder class.
# Context:
USER: pass it to Coder as map_tokens instead of `use_ctags`
ASSISTANT: I will modify the `aider/main.py` file to pass `map_tokens` to the `Coder` class instead of `use_ctags`.

```python
aider/main.py
<<<<<<< ORIGINAL
    coder = Coder(
        io,
        main_model=args.model,
        fnames=args.files,
        pretty=args.pretty,
        show_diffs=args.show_diffs,
        auto_commits=args.auto_commits,
        dirty_commits=args.dirty_commits,
        dry_run=args.dry_run,
        use_ctags=args.ctags,
        verbose=args.verbose,
        openai_api_key=args.openai_api_key,
    )
=======
    coder = Coder(
        io,
        main_model=args.model,
        fnames=args.files,
        pretty=args.pretty,
        show_diffs=args.show_diffs,
        auto_commits=args.auto_commits,
        dirty_commits=args.dirty_commits,
        dry_run=args.dry_run,
        map_tokens=args.map_tokens,
        verbose=args.verbose,
        openai_api_key=args.openai_api_key,
    )
>>>>>>> UPDATED
```

Now, the `map_tokens` value is passed to the `Coder` class instead of `use_ctags`. Please make sure to update the `Coder` class in `aider/coder.py` to accept and use the `map_tokens` parameter.
2023-06-03 17:47:05 -07:00
Paul Gauthier
dcdad393c8 aider: Changed the --ctags switch into --map-tokens, which is an int that defaults to 1024.
# Context:
USER: change the --ctags switch into --map-tokens, which is an int that defaults to 1024
ASSISTANT: To make this change, I need to modify the `aider/main.py` file. Please make it *read-write* and let me know when it's done.
USER: I added these *read-write* files: aider/main.py
ASSISTANT: Great! I will now modify the `aider/main.py` file to change the `--ctags` switch into `--map-tokens`, which is an int that defaults to 1024.

```python
aider/main.py
<<<<<<< ORIGINAL
    parser.add_argument(
        "--ctags",
        type=lambda x: (str(x).lower() == "true"),
        nargs="?",
        const=True,
        default=None,
        help=(
            "Add ctags to the chat to help GPT understand the codebase (default: check for ctags"
            " executable)"
        ),
    )
=======
    parser.add_argument(
        "--map-tokens",
        type=int,
        default=1024,
        help="Number of tokens to map for code understanding (default: 1024)",
    )
>>>>>>> UPDATED
```

This change replaces the `--ctags` argument with the `--map-tokens` argument, which takes an integer value and defaults to 1024.
2023-06-03 17:46:11 -07:00
Paul Gauthier
05e3d2bfdb Merge branch 'main' into call-graph 2023-06-03 17:43:45 -07:00
Paul Gauthier
c27483193b refac 2023-06-03 06:18:45 -07:00
Paul Gauthier
1882cf55db Handle ConnectionError 2023-06-03 06:16:33 -07:00
Paul Gauthier
575c6e2db6 Stop closing 2023-06-02 16:40:21 -07:00
Paul Gauthier
c5ed12e0c3 aider: Replaced shelve with diskcache for persistent dictionary-like object.
# Context:
USER: is there another persistent dictionary besides shelve?
ASSISTANT: Yes, there are several alternatives to `shelve` for creating persistent dictionaries in Python. Some popular options include:

1. `pickle`: A built-in Python module for object serialization and deserialization. You can use it to store dictionaries in binary format and load them back into memory when needed.

2. `json`: Another built-in Python module for working with JSON data. You can use it to store dictionaries in a human-readable format and load them back into memory when needed.

3. `sqlite3`: A built-in Python module for working with SQLite databases. You can use it to create a simple key-value store with SQL queries.

4. `TinyDB`: A lightweight, document-oriented database for Python
^C KeyboardInterrupt
USER: i want something which presents as a dict though
ASSISTANT: In that case, you can use the `sqlite3` module to create a custom class that behaves like a dictionary. Here's an example of how you can create a persistent dictionary using SQLite:

```python
import sqlite3

class
^C KeyboardInterrupt
USER: i don't want to build something. i want an existing package.
ASSISTANT: I apologize for the confusion. In that case, you can use the `diskcache` package, which provides a dictionary-like interface for disk-based storage. It is easy to use and has good performance.

To install `diskcache`, you can use pip:

```
pip install diskcache
```

Here's an example of how to use `diskcache`:

```python
from diskcache import Cache

cache = Cache('path/to/cache/directory')

# Set a key-value pair
cache['key'] = 'value'

# Get the value for a key
value = cache['key']

# Check if a key is in the cache
if 'key' in cache:
    print('Key exists')

# Delete a key-value pair
del cache['key']

# Close the cache when done
cache.close()
```

This will create a persistent dictionary-like object using the `diskcache` package. The data will be stored on disk in the specified directory, and you can use the standard dictionary methods to interact with it.
USER: great. use that.
ASSISTANT: To use `diskcache` in the `aider/repomap.py` file, you'll need to replace the `shelve` usage with `diskcache`. Here are the changes you need to make:

1. Install `diskcache` using pip:

```
pip install diskcache
```

2. Modify the `aider/repomap.py` file:

```python
aider/repomap.py
<<<<<<< ORIGINAL
import colorsys
import json
import os
import random
import subprocess
import sys
import tempfile
from collections import Counter, defaultdict

# import shelve
=======
import colorsys
import json
import os
import random
import subprocess
import sys
import tempfile
from collections import Counter, defaultdict
from diskcache import Cache

>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    IDENT_CACHE_FILE = ".aider.ident.cache"
    TAGS_CACHE_FILE = ".aider.tags.cache"
=======
    IDENT_CACHE_DIR = ".aider.ident.cache"
    TAGS_CACHE_DIR = ".aider.tags.cache"
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def load_tags_cache(self):
        self.TAGS_CACHE = dict()  # shelve.open(self.TAGS_CACHE_FILE)

    def save_tags_cache(self):
        # self.TAGS_CACHE.sync()
        pass

    def load_ident_cache(self):
        self.IDENT_CACHE = dict()  # shelve.open(self.IDENT_CACHE_FILE)

    def save_ident_cache(self):
        # self.IDENT_CACHE.sync()
        pass
=======
    def load_tags_cache(self):
        self.TAGS_CACHE = Cache(self.TAGS_CACHE_DIR)

    def save_tags_cache(self):
        self.TAGS_CACHE.close()

    def load_ident_cache(self):
        self.IDENT_CACHE = Cache(self.IDENT_CACHE_DIR)

    def save_ident_cache(self):
        self.IDENT_CACHE.close()
>>>>>>> UPDATED
```

These changes will replace the usage of `shelve` with `diskcache` for the `IDENT_CACHE` and `TAGS_CACHE` dictionaries. Make sure to install the `diskcache` package before running the updated code.
2023-06-02 16:36:03 -07:00
Paul Gauthier
100deccbff Merge branch 'main' into call-graph 2023-06-02 16:30:26 -07:00
Paul Gauthier
4f1e1181c4 Tweaked prompt 2023-06-02 16:30:19 -07:00
Paul Gauthier
eb38a470a3 Merge branch 'main' into call-graph 2023-06-02 13:28:42 -07:00
Paul Gauthier
f5b68a4f69 personalization 2023-06-02 13:28:40 -07:00
Paul Gauthier
68434cbaa1 cleanup 2023-06-02 13:24:29 -07:00
Paul Gauthier
47293c5ba9 removed graph 2023-06-02 13:22:25 -07:00
Paul Gauthier
36f21fcfb6 refac 2023-06-02 13:22:00 -07:00
Paul Gauthier
360335f624 small repo map 2023-06-02 13:09:52 -07:00
Paul Gauthier
bd81ef6201 added newline to prompt 2023-06-02 12:54:28 -07:00
Paul Gauthier
ac21a4a992 wip 2023-06-02 12:54:06 -07:00
Paul Gauthier
777a36d366 aider: Modified get_ranked_tags_map to use binary search.
# Context:
USER: make `get_ranked_tags_map` use binary search
ASSISTANT: To modify the `get_ranked_tags_map` function to use binary search, follow these steps:

1. Replace the for loop with a binary search implementation.
2. Update the loop to calculate the middle index and generate the tree with the middle index.
3. Check if the number of tokens is less than `max_map_tokens`. If it is, update the lower bound. Otherwise, update the upper bound.
4. Continue the binary search until the lower bound is greater than or equal to the upper bound.

Here's the updated `get_ranked_tags_map` function:

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def get_ranked_tags_map(self, fnames):
        # 1/4 of gpt-4's context window
        max_map_tokens = 2048

        ranked_tags = self.get_ranked_tags(fnames)
        num_tags = len(ranked_tags)

        for i in range(num_tags, 0, -1):
            tree = to_tree(ranked_tags[:i])
            num_tokens = self.token_count(tree)
            dump(i, num_tokens)
            if num_tokens < max_map_tokens:
                return tree
=======
    def get_ranked_tags_map(self, fnames):
        # 1/4 of gpt-4's context window
        max_map_tokens = 2048

        ranked_tags = self.get_ranked_tags(fnames)
        num_tags = len(ranked_tags)

        lower_bound = 0
        upper_bound = num_tags
        best_tree = None

        while lower_bound <= upper_bound:
            middle = (lower_bound + upper_bound) // 2
            tree = to_tree(ranked_tags[:middle])
            num_tokens = self.token_count(tree)

            if num_tokens < max_map_tokens:
                best_tree = tree
                lower_bound = middle + 1
            else:
                upper_bound = middle - 1

        return best_tree
>>>>>>> UPDATED
```
2023-06-02 12:46:03 -07:00
Paul Gauthier
1bbe8e63af find the best tags map that fits in max_tokens 2023-06-02 12:44:46 -07:00
Paul Gauthier
f92249a7ea refac 2023-06-02 12:40:40 -07:00
Paul Gauthier
e3aed997bc ranked_tags 2023-06-02 12:34:28 -07:00
Paul Gauthier
6e51675072 graph 2023-06-02 11:25:10 -07:00
Paul Gauthier
903d318a8a no weights 2023-06-02 10:55:43 -07:00
Paul Gauthier
746f4ccb56 rank flows evenly *out* from a node 2023-06-01 14:58:15 -07:00
Paul Gauthier
33f649dbbd noop 2023-06-01 14:54:04 -07:00
Paul Gauthier
594c4b600b cleanup 2023-06-01 13:24:52 -07:00
Paul Gauthier
bb0d58020c use in-edges 2023-06-01 13:12:15 -07:00
Paul Gauthier
915a9132de working ranked defs 2023-06-01 11:26:49 -07:00
Paul Gauthier
27566353a6 ssh 2023-06-01 10:57:48 -07:00
Paul Gauthier
11a7f9a80a aider: Calculated the sum of the weights of every inbound edge to each node and stored the result in the inbound_weights dictionary.
# Context:
USER: do the todo
ASSISTANT: I see a "todo" comment in the `aider/repomap.py` file. To complete the task, we need to calculate the sum of the weights of every inbound edge to each node in the graph. Here's how to do it:

```python
aider/repomap.py
<<<<<<< ORIGINAL
    for node in G.nodes:
        # todo: sum up the weight of every inbound edge to node
        pass
=======
    inbound_weights = {}
    for node in G.nodes:
        inbound_weights[node] = sum(data["weight"] for _, _, data in G.in_edges(node, data=True))
>>>>>>> UPDATED
```

This code calculates the sum of the weights of every inbound edge to each node and stores the result in the `inbound_weights` dictionary.
2023-06-01 10:56:41 -07:00
Paul Gauthier
d913d08f39 Merge branch 'main' into call-graph 2023-06-01 10:55:49 -07:00
Paul Gauthier
d1f05ca18a Handle io=none 2023-06-01 10:55:39 -07:00
Paul Gauthier
1aa8c3983d wip 2023-06-01 10:54:43 -07:00
Paul Gauthier
3c9991b38f Merge branch 'main' into call-graph 2023-06-01 09:07:26 -07:00
Paul Gauthier
7bc26e6762 Added --verbose support to repomap 2023-06-01 09:04:08 -07:00
Paul Gauthier
6472c152b4 Merge branch 'main' into call-graph 2023-06-01 08:59:00 -07:00
Paul Gauthier
1f2aa1a6c8 Auto fallback to 3.5 2023-06-01 08:39:27 -07:00
Paul Gauthier
1b046e8111 aider: Added a new method to check the availability of OpenAI models and called it in the __init__ method to ensure the main_model is available in the list of OpenAI models.
# Context:
USER: in init, call check the list of available openai models and make sure the `main_model` is in there. if not, print an err msg
ASSISTANT:
^C KeyboardInterrupt
USER: in init, call check the list of available openai models and make sure the `main_model` is in there. if not, print an err msg
ASSISTANT: To check the list of available OpenAI models and make sure the `main_model` is in there, you need to modify the `__init__` method in the `Coder` class. Here's how you can do it:

1. Add a new method `check_model_availability` to the `Coder` class that takes the `main_model` as an argument and checks if it's available in the list of OpenAI models.
2. Call this method in the `__init__` method of the `Coder` class.

Here are the changes you need to make:

```python
aider/coder.py
<<<<<<< ORIGINAL
class Coder:
    abs_fnames = None
    repo = None
    last_aider_commit_hash = None
    last_asked_for_commit_time = 0

    def __init__(
=======
class Coder:
    abs_fnames = None
    repo = None
    last_aider_commit_hash = None
    last_asked_for_commit_time = 0

    def check_model_availability(self, main_model):
        available_models = openai.Model.list()
        model_ids = [model.id for model in available_models['data']]
        if main_model not in model_ids:
            raise ValueError(f"Model {main_model} is not available. Please choose from the available models: {model_ids}")

    def __init__(
>>>>>>> UPDATED
```

```python
aider/coder.py
<<<<<<< ORIGINAL
        self.main_model = main_model
        if main_model == "gpt-3.5-turbo":
            self.io.tool_error(
                f"aider doesn't work well with {main_model}, use gpt-4 for best results."
            )

        self.set_repo(fnames)
=======
        self.check_model_availability(main_model)
        self.main_model = main_model
        if main_model == "gpt-3.5-turbo":
            self.io.tool_error(
                f"aider doesn't work well with {main_model}, use gpt-4 for best results."
            )

        self.set_repo(fnames)
>>>>>>> UPDATED
```

These changes will add a new method `check_model_availability` to the `Coder` class and call it in the `__init__` method to ensure the `main_model` is available in the list of OpenAI models. If not, it will raise a `ValueError` with an appropriate error message.
2023-06-01 08:36:48 -07:00
Paul Gauthier
6567577855 works 2023-06-01 07:16:47 -07:00
Paul Gauthier
800ad12ae4 wip 2023-06-01 06:48:39 -07:00
Paul Gauthier
c46e8daa32 use networkx exclusively to maintain the graph structure 2023-05-31 16:53:42 -07:00
Paul Gauthier
b88dc7828c noop 2023-05-31 16:43:52 -07:00
Paul Gauthier
a553d71ca2 noop 2023-05-31 15:34:17 -07:00
Paul Gauthier
c635217d6a wip 2023-05-31 05:58:56 -07:00
Paul Gauthier
a216754c9f noqa 2023-05-30 17:07:12 -07:00
Paul Gauthier
4297e75b9b fixed up shelve 2023-05-30 17:06:02 -07:00
Paul Gauthier
a9b53569ac aider: Switched both caches to use shelve instead of json.
# Context:
USER: switch both caches to use shelve instead of json
ASSISTANT: To switch both caches to use `shelve` instead of `json`, we need to modify the `aider/repomap.py` file. We will update the `load_tags_cache`, `save_tags_cache`, `load_ident_cache`, and `save_ident_cache` methods.

Here are the changes:

```python
aider/repomap.py
<<<<<<< ORIGINAL
import json
=======
import shelve
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    IDENT_CACHE_FILE = ".aider.ident.cache.json"
    TAGS_CACHE_FILE = ".aider.tags.cache.json"

    def load_tags_cache(self):
        if os.path.exists(self.TAGS_CACHE_FILE):
            with open(self.TAGS_CACHE_FILE, "r") as f:
                self.TAGS_CACHE = json.load(f)
        else:
            self.TAGS_CACHE = dict()

    def save_tags_cache(self):
        with open(self.TAGS_CACHE_FILE, "w") as f:
            json.dump(self.TAGS_CACHE, f)
=======
    IDENT_CACHE_FILE = ".aider.ident.cache"
    TAGS_CACHE_FILE = ".aider.tags.cache"

    def load_tags_cache(self):
        if os.path.exists(self.TAGS_CACHE_FILE):
            with shelve.open(self.TAGS_CACHE_FILE) as db:
                self.TAGS_CACHE = dict(db)
        else:
            self.TAGS_CACHE = dict()

    def save_tags_cache(self):
        with shelve.open(self.TAGS_CACHE_FILE) as db:
            db.update(self.TAGS_CACHE)
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def load_ident_cache(self):
        if os.path.exists(self.IDENT_CACHE_FILE):
            with open(self.IDENT_CACHE_FILE, "r") as f:
                self.IDENT_CACHE = json.load(f)
        else:
            self.IDENT_CACHE = dict()

    def save_ident_cache(self):
        with open(self.IDENT_CACHE_FILE, "w") as f:
            json.dump(self.IDENT_CACHE, f)
=======
    def load_ident_cache(self):
        if os.path.exists(self.IDENT_CACHE_FILE):
            with shelve.open(self.IDENT_CACHE_FILE) as db:
                self.IDENT_CACHE = dict(db)
        else:
            self.IDENT_CACHE = dict()

    def save_ident_cache(self):
        with shelve.open(self.IDENT_CACHE_FILE) as db:
            db.update(self.IDENT_CACHE)
>>>>>>> UPDATED
```

These changes will replace the usage of `json` with `shelve` for both caches.
2023-05-30 17:03:31 -07:00
Paul Gauthier
bd57c71898 fix up json cache 2023-05-30 17:01:30 -07:00
Paul Gauthier
d1bbb85437 aider: Added methods to load and save tags cache to .aider.tags.cache file.
# Context:
USER: persist the tags cache to .aider.tags.cache
ASSISTANT: To persist the tags cache to `.aider.tags.cache`, you need to modify the `aider/repomap.py` file. Add two new methods `load_tags_cache` and `save_tags_cache` to the `RepoMap` class, and call them in the `__init__` method and `run_ctags` method, respectively.

Here are the changes you need to make:

```python
aider/repomap.py
<<<<<<< ORIGINAL
    TAGS_CACHE_FILE = ".aider.tags.cache"
=======
    TAGS_CACHE_FILE = ".aider.tags.cache"

    def load_tags_cache(self):
        if os.path.exists(self.TAGS_CACHE_FILE):
            with open(self.TAGS_CACHE_FILE, "r") as f:
                self.TAGS_CACHE = json.load(f)
        else:
            self.TAGS_CACHE = dict()

    def save_tags_cache(self):
        with open(self.TAGS_CACHE_FILE, "w") as f:
            json.dump(self.TAGS_CACHE, f)
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def __init__(self, use_ctags=None, root=None, main_model="gpt-4"):
=======
    def __init__(self, use_ctags=None, root=None, main_model="gpt-4"):
        self.load_tags_cache()
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def run_ctags(self, filename):
        # Check if the file is in the cache and if the modification time has not changed
        file_mtime = os.path.getmtime(filename)
        cache_key = filename
        if cache_key in self.TAGS_CACHE and self.TAGS_CACHE[cache_key]["mtime"] == file_mtime:
            return self.TAGS_CACHE[cache_key]["data"]

        cmd = self.ctags_cmd + [filename]
        output = subprocess.check_output(cmd).decode("utf-8")
        output = output.splitlines()

        data = [json.loads(line) for line in output]

        # Update the cache
        self.TAGS_CACHE[cache_key] = {"mtime": file_mtime, "data": data}
=======
    def run_ctags(self, filename):
        # Check if the file is in the cache and if the modification time has not changed
        file_mtime = os.path.getmtime(filename)
        cache_key = filename
        if cache_key in self.TAGS_CACHE and self.TAGS_CACHE[cache_key]["mtime"] == file_mtime:
            return self.TAGS_CACHE[cache_key]["data"]

        cmd = self.ctags_cmd + [filename]
        output = subprocess.check_output(cmd).decode("utf-8")
        output = output.splitlines()

        data = [json.loads(line) for line in output]

        # Update the cache
        self.TAGS_CACHE[cache_key] = {"mtime": file_mtime, "data": data}
        self.save_tags_cache()
>>>>>>> UPDATED
```

These changes will load the tags cache from the `.aider.tags.cache` file when initializing the `RepoMap` class and save the updated cache to the file after running ctags.
2023-05-30 16:54:36 -07:00