Commit graph

93 commits

Author SHA1 Message Date
Paul Gauthier
fcd8a6df73 Use the Models enum instead of .value 2023-06-05 17:06:34 -07:00
Paul Gauthier
95b32a74a9 Moved all model names into an enum 2023-06-05 09:19:29 -07:00
Paul Gauthier
efb8cad881 Merge branch 'main' into gpt-35 2023-06-05 09:07:03 -07:00
Paul Gauthier
4121b9900e aider: Added cache versioning to RepoMap class.
# Context:
USER: is there a standard way to version these cache files?
ASSISTANT: Yes, a standard way to version cache files is to include a version number in the cache file name or within the cache data. This allows you to easily identify and manage different versions of the cache, and update the cache handling logic when needed.

Here's how you can implement cache versioning in your `RepoMap` class:

1. Add a constant for the cache version at the beginning of the class.
2. Modify the cache directory names to include the version number.
3. When loading the cache, check if the version matches the current version. If not, clear the cache and start fresh.

Here are the changes you need to make in `aider/repomap.py`:

```python
aider/repomap.py
<<<<<<< ORIGINAL
class RepoMap:
    ctags_cmd = ["ctags", "--fields=+S", "--extras=-F", "--output-format=json"]
    IDENT_CACHE_DIR = ".aider.ident.cache"
    TAGS_CACHE_DIR = ".aider.tags.cache"
=======
class RepoMap:
    CACHE_VERSION = 1
    ctags_cmd = ["ctags", "--fields=+S", "--extras=-F", "--output-format=json"]
    IDENT_CACHE_DIR = f".aider.ident.cache.v{CACHE_VERSION}"
    TAGS_CACHE_DIR = f".aider.tags.cache.v{CACHE_VERSION}"
>>>>>>> UPDATED
```

Now, whenever you need to change the cache format or handling logic, you can simply increment the `CACHE_VERSION` constant, and the cache will be automatically invalidated and recreated with the new version.
2023-06-05 08:30:00 -07:00
Paul Gauthier
baa08e12f8 ssh 2023-06-05 08:29:30 -07:00
Paul Gauthier
27db4c038f bugfix 2023-06-05 08:29:08 -07:00
Paul Gauthier
93ca588bab sssh 2023-06-05 08:24:02 -07:00
Paul Gauthier
0256b8e260 Merge branch 'call-graph' into gpt-35 2023-06-04 15:03:46 -07:00
Paul Gauthier
53e8c65745 Return files without any tags at all at the tail of the ranked_tags list 2023-06-04 14:53:14 -07:00
Paul Gauthier
dfcf813ad6 include individual files in the ranked tags list, following specific ranked idents 2023-06-04 14:45:57 -07:00
Paul Gauthier
b339df3f70 Merge branch 'call-graph' into gpt-35 2023-06-04 14:28:28 -07:00
Paul Gauthier
95cb482938 bugfix 2023-06-04 14:13:56 -07:00
Paul Gauthier
9088d139b2 refactor prompts 2023-06-04 11:36:50 -07:00
Paul Gauthier
12512df707 Handle the case where there is no rank flow 2023-06-04 09:53:09 -07:00
Paul Gauthier
91f48dfdf1 aider: Added a check for an empty tags list in to_tree function.
# Context:
USER:  File "/Users/gauthier/Projects/aider/aider/repomap.py", line 107, in choose_files_listing
    files_listing = self.get_ranked_tags_map(chat_files, other_files)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/repomap.py", line 318, in get_ranked_tags_map
    tree = to_tree(ranked_tags[:middle])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/repomap.py", line 26, in to_tree
    last = [None] * len(tags[0])
                        ~~~~^^^
ASSISTANT: I see the code snippets from the `aider/repomap.py` file. Please let me know what changes you would like to make, and make sure to provide me with the *read-write* access to the file.
USER: I added these *read-write* files: aider/repomap.py
ASSISTANT: Great! Now, please let me know what changes you would like to make in the `aider/repomap.py` file.
USER:   File "/Users/gauthier/Projects/aider/aider/repomap.py", line 318, in get_ranked_tags_map
    tree = to_tree(ranked_tags[:middle])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/repomap.py", line 26, in to_tree
    last = [None] * len(tags[0])
                        ~~~~^^^
IndexError: list index out of range
ASSISTANT: It seems like you are encountering an IndexError due to an empty `tags` list. To fix this issue, you can add a check to ensure that the `tags` list is not empty before accessing its elements. Here's the change you need to make in the `aider/repomap.py` file:

```python
aider/repomap.py
<<<<<<< ORIGINAL
def to_tree(tags):
    tags = sorted(tags)

    output = ""
    last = [None] * len(tags[0])
=======
def to_tree(tags):
    if not tags:
        return ""

    tags = sorted(tags)

    output = ""
    last = [None] * len(tags[0])
>>>>>>> UPDATED
```

This change adds a check for an empty `tags` list and returns an empty string if the list is empty, preventing the IndexError.
2023-06-04 06:43:30 -07:00
Paul Gauthier
b4fc6c2caa sssh 2023-06-03 18:11:45 -07:00
Paul Gauthier
ef22f44d19 final details on --map-tokens 2023-06-03 18:06:14 -07:00
Paul Gauthier
1f9e9be64d aider: Replaced use_ctags with map_tokens parameter in RepoMap class.
# Context:
USER: update `aider/repomap.py`
ASSISTANT: Please provide the specific changes you would like me to make in the `aider/repomap.py` file.
USER: to use map_tokens not `use_ctags`
ASSISTANT: I will update the `aider/repomap.py` file to use `map_tokens` instead of `use_ctags`.

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def __init__(self, use_ctags=None, root=None, main_model="gpt-4", io=None):
        self.io = io

        if not root:
            root = os.getcwd()
        self.root = root

        self.load_ident_cache()
        self.load_tags_cache()

        if use_ctags is None:
            self.use_ctags = self.check_for_ctags()
        else:
            self.use_ctags = use_ctags

        self.tokenizer = tiktoken.encoding_for_model(main_model)
=======
    def __init__(self, map_tokens=1024, root=None, main_model="gpt-4", io=None):
        self.io = io

        if not root:
            root = os.getcwd()
        self.root = root

        self.load_ident_cache()
        self.load_tags_cache()

        self.map_tokens = map_tokens

        self.tokenizer = tiktoken.encoding_for_model(main_model)
>>>>>>> UPDATED
```

This change replaces the `use_ctags` parameter with the `map_tokens` parameter in the `__init__` method of the `RepoMap` class. Make sure to update any other parts of the code that use `use_ctags` to use `map_tokens` instead.
2023-06-03 17:48:45 -07:00
Paul Gauthier
575c6e2db6 Stop closing 2023-06-02 16:40:21 -07:00
Paul Gauthier
c5ed12e0c3 aider: Replaced shelve with diskcache for persistent dictionary-like object.
# Context:
USER: is there another persistent dictionary besides shelve?
ASSISTANT: Yes, there are several alternatives to `shelve` for creating persistent dictionaries in Python. Some popular options include:

1. `pickle`: A built-in Python module for object serialization and deserialization. You can use it to store dictionaries in binary format and load them back into memory when needed.

2. `json`: Another built-in Python module for working with JSON data. You can use it to store dictionaries in a human-readable format and load them back into memory when needed.

3. `sqlite3`: A built-in Python module for working with SQLite databases. You can use it to create a simple key-value store with SQL queries.

4. `TinyDB`: A lightweight, document-oriented database for Python
^C KeyboardInterrupt
USER: i want something which presents as a dict though
ASSISTANT: In that case, you can use the `sqlite3` module to create a custom class that behaves like a dictionary. Here's an example of how you can create a persistent dictionary using SQLite:

```python
import sqlite3

class
^C KeyboardInterrupt
USER: i don't want to build something. i want an existing package.
ASSISTANT: I apologize for the confusion. In that case, you can use the `diskcache` package, which provides a dictionary-like interface for disk-based storage. It is easy to use and has good performance.

To install `diskcache`, you can use pip:

```
pip install diskcache
```

Here's an example of how to use `diskcache`:

```python
from diskcache import Cache

cache = Cache('path/to/cache/directory')

# Set a key-value pair
cache['key'] = 'value'

# Get the value for a key
value = cache['key']

# Check if a key is in the cache
if 'key' in cache:
    print('Key exists')

# Delete a key-value pair
del cache['key']

# Close the cache when done
cache.close()
```

This will create a persistent dictionary-like object using the `diskcache` package. The data will be stored on disk in the specified directory, and you can use the standard dictionary methods to interact with it.
USER: great. use that.
ASSISTANT: To use `diskcache` in the `aider/repomap.py` file, you'll need to replace the `shelve` usage with `diskcache`. Here are the changes you need to make:

1. Install `diskcache` using pip:

```
pip install diskcache
```

2. Modify the `aider/repomap.py` file:

```python
aider/repomap.py
<<<<<<< ORIGINAL
import colorsys
import json
import os
import random
import subprocess
import sys
import tempfile
from collections import Counter, defaultdict

# import shelve
=======
import colorsys
import json
import os
import random
import subprocess
import sys
import tempfile
from collections import Counter, defaultdict
from diskcache import Cache

>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    IDENT_CACHE_FILE = ".aider.ident.cache"
    TAGS_CACHE_FILE = ".aider.tags.cache"
=======
    IDENT_CACHE_DIR = ".aider.ident.cache"
    TAGS_CACHE_DIR = ".aider.tags.cache"
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def load_tags_cache(self):
        self.TAGS_CACHE = dict()  # shelve.open(self.TAGS_CACHE_FILE)

    def save_tags_cache(self):
        # self.TAGS_CACHE.sync()
        pass

    def load_ident_cache(self):
        self.IDENT_CACHE = dict()  # shelve.open(self.IDENT_CACHE_FILE)

    def save_ident_cache(self):
        # self.IDENT_CACHE.sync()
        pass
=======
    def load_tags_cache(self):
        self.TAGS_CACHE = Cache(self.TAGS_CACHE_DIR)

    def save_tags_cache(self):
        self.TAGS_CACHE.close()

    def load_ident_cache(self):
        self.IDENT_CACHE = Cache(self.IDENT_CACHE_DIR)

    def save_ident_cache(self):
        self.IDENT_CACHE.close()
>>>>>>> UPDATED
```

These changes will replace the usage of `shelve` with `diskcache` for the `IDENT_CACHE` and `TAGS_CACHE` dictionaries. Make sure to install the `diskcache` package before running the updated code.
2023-06-02 16:36:03 -07:00
Paul Gauthier
f5b68a4f69 personalization 2023-06-02 13:28:40 -07:00
Paul Gauthier
68434cbaa1 cleanup 2023-06-02 13:24:29 -07:00
Paul Gauthier
47293c5ba9 removed graph 2023-06-02 13:22:25 -07:00
Paul Gauthier
36f21fcfb6 refac 2023-06-02 13:22:00 -07:00
Paul Gauthier
360335f624 small repo map 2023-06-02 13:09:52 -07:00
Paul Gauthier
ac21a4a992 wip 2023-06-02 12:54:06 -07:00
Paul Gauthier
777a36d366 aider: Modified get_ranked_tags_map to use binary search.
# Context:
USER: make `get_ranked_tags_map` use binary search
ASSISTANT: To modify the `get_ranked_tags_map` function to use binary search, follow these steps:

1. Replace the for loop with a binary search implementation.
2. Update the loop to calculate the middle index and generate the tree with the middle index.
3. Check if the number of tokens is less than `max_map_tokens`. If it is, update the lower bound. Otherwise, update the upper bound.
4. Continue the binary search until the lower bound is greater than or equal to the upper bound.

Here's the updated `get_ranked_tags_map` function:

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def get_ranked_tags_map(self, fnames):
        # 1/4 of gpt-4's context window
        max_map_tokens = 2048

        ranked_tags = self.get_ranked_tags(fnames)
        num_tags = len(ranked_tags)

        for i in range(num_tags, 0, -1):
            tree = to_tree(ranked_tags[:i])
            num_tokens = self.token_count(tree)
            dump(i, num_tokens)
            if num_tokens < max_map_tokens:
                return tree
=======
    def get_ranked_tags_map(self, fnames):
        # 1/4 of gpt-4's context window
        max_map_tokens = 2048

        ranked_tags = self.get_ranked_tags(fnames)
        num_tags = len(ranked_tags)

        lower_bound = 0
        upper_bound = num_tags
        best_tree = None

        while lower_bound <= upper_bound:
            middle = (lower_bound + upper_bound) // 2
            tree = to_tree(ranked_tags[:middle])
            num_tokens = self.token_count(tree)

            if num_tokens < max_map_tokens:
                best_tree = tree
                lower_bound = middle + 1
            else:
                upper_bound = middle - 1

        return best_tree
>>>>>>> UPDATED
```
2023-06-02 12:46:03 -07:00
Paul Gauthier
1bbe8e63af find the best tags map that fits in max_tokens 2023-06-02 12:44:46 -07:00
Paul Gauthier
f92249a7ea refac 2023-06-02 12:40:40 -07:00
Paul Gauthier
e3aed997bc ranked_tags 2023-06-02 12:34:28 -07:00
Paul Gauthier
6e51675072 graph 2023-06-02 11:25:10 -07:00
Paul Gauthier
903d318a8a no weights 2023-06-02 10:55:43 -07:00
Paul Gauthier
746f4ccb56 rank flows evenly *out* from a node 2023-06-01 14:58:15 -07:00
Paul Gauthier
33f649dbbd noop 2023-06-01 14:54:04 -07:00
Paul Gauthier
594c4b600b cleanup 2023-06-01 13:24:52 -07:00
Paul Gauthier
bb0d58020c use in-edges 2023-06-01 13:12:15 -07:00
Paul Gauthier
915a9132de working ranked defs 2023-06-01 11:26:49 -07:00
Paul Gauthier
27566353a6 ssh 2023-06-01 10:57:48 -07:00
Paul Gauthier
11a7f9a80a aider: Calculated the sum of the weights of every inbound edge to each node and stored the result in the inbound_weights dictionary.
# Context:
USER: do the todo
ASSISTANT: I see a "todo" comment in the `aider/repomap.py` file. To complete the task, we need to calculate the sum of the weights of every inbound edge to each node in the graph. Here's how to do it:

```python
aider/repomap.py
<<<<<<< ORIGINAL
    for node in G.nodes:
        # todo: sum up the weight of every inbound edge to node
        pass
=======
    inbound_weights = {}
    for node in G.nodes:
        inbound_weights[node] = sum(data["weight"] for _, _, data in G.in_edges(node, data=True))
>>>>>>> UPDATED
```

This code calculates the sum of the weights of every inbound edge to each node and stores the result in the `inbound_weights` dictionary.
2023-06-01 10:56:41 -07:00
Paul Gauthier
d913d08f39 Merge branch 'main' into call-graph 2023-06-01 10:55:49 -07:00
Paul Gauthier
d1f05ca18a Handle io=none 2023-06-01 10:55:39 -07:00
Paul Gauthier
1aa8c3983d wip 2023-06-01 10:54:43 -07:00
Paul Gauthier
3c9991b38f Merge branch 'main' into call-graph 2023-06-01 09:07:26 -07:00
Paul Gauthier
7bc26e6762 Added --verbose support to repomap 2023-06-01 09:04:08 -07:00
Paul Gauthier
6567577855 works 2023-06-01 07:16:47 -07:00
Paul Gauthier
800ad12ae4 wip 2023-06-01 06:48:39 -07:00
Paul Gauthier
c46e8daa32 use networkx exclusively to maintain the graph structure 2023-05-31 16:53:42 -07:00
Paul Gauthier
b88dc7828c noop 2023-05-31 16:43:52 -07:00
Paul Gauthier
a553d71ca2 noop 2023-05-31 15:34:17 -07:00
Paul Gauthier
c635217d6a wip 2023-05-31 05:58:56 -07:00