Commit graph

239 commits

Author SHA1 Message Date
Paul Gauthier
baa08e12f8 ssh 2023-06-05 08:29:30 -07:00
Paul Gauthier
27db4c038f bugfix 2023-06-05 08:29:08 -07:00
Paul Gauthier
93ca588bab sssh 2023-06-05 08:24:02 -07:00
Paul Gauthier
0256b8e260 Merge branch 'call-graph' into gpt-35 2023-06-04 15:03:46 -07:00
Paul Gauthier
53e8c65745 Return files without any tags at all at the tail of the ranked_tags list 2023-06-04 14:53:14 -07:00
Paul Gauthier
dfcf813ad6 include individual files in the ranked tags list, following specific ranked idents 2023-06-04 14:45:57 -07:00
Paul Gauthier
b339df3f70 Merge branch 'call-graph' into gpt-35 2023-06-04 14:28:28 -07:00
Paul Gauthier
95cb482938 bugfix 2023-06-04 14:13:56 -07:00
Paul Gauthier
9088d139b2 refactor prompts 2023-06-04 11:36:50 -07:00
Paul Gauthier
12512df707 Handle the case where there is no rank flow 2023-06-04 09:53:09 -07:00
Paul Gauthier
91f48dfdf1 aider: Added a check for an empty tags list in to_tree function.
# Context:
USER:  File "/Users/gauthier/Projects/aider/aider/repomap.py", line 107, in choose_files_listing
    files_listing = self.get_ranked_tags_map(chat_files, other_files)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/repomap.py", line 318, in get_ranked_tags_map
    tree = to_tree(ranked_tags[:middle])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/repomap.py", line 26, in to_tree
    last = [None] * len(tags[0])
                        ~~~~^^^
ASSISTANT: I see the code snippets from the `aider/repomap.py` file. Please let me know what changes you would like to make, and make sure to provide me with *read-write* access to the file.
USER: I added these *read-write* files: aider/repomap.py
ASSISTANT: Great! Now, please let me know what changes you would like to make in the `aider/repomap.py` file.
USER:   File "/Users/gauthier/Projects/aider/aider/repomap.py", line 318, in get_ranked_tags_map
    tree = to_tree(ranked_tags[:middle])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/repomap.py", line 26, in to_tree
    last = [None] * len(tags[0])
                        ~~~~^^^
IndexError: list index out of range
ASSISTANT: It seems like you are encountering an IndexError due to an empty `tags` list. To fix this issue, you can add a check to ensure that the `tags` list is not empty before accessing its elements. Here's the change you need to make in the `aider/repomap.py` file:

```python
aider/repomap.py
<<<<<<< ORIGINAL
def to_tree(tags):
    tags = sorted(tags)

    output = ""
    last = [None] * len(tags[0])
=======
def to_tree(tags):
    if not tags:
        return ""

    tags = sorted(tags)

    output = ""
    last = [None] * len(tags[0])
>>>>>>> UPDATED
```

This change adds a check for an empty `tags` list and returns an empty string in that case, preventing the IndexError.
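As a quick sanity check (a hypothetical snippet, assuming `to_tree` is importable from `aider.repomap`), the empty case is now harmless:

```python
from aider.repomap import to_tree

# An empty ranked_tags slice now produces an empty map instead of raising IndexError.
assert to_tree([]) == ""
```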
2023-06-04 06:43:30 -07:00
Paul Gauthier
b4fc6c2caa sssh 2023-06-03 18:11:45 -07:00
Paul Gauthier
ef22f44d19 final details on --map-tokens 2023-06-03 18:06:14 -07:00
Paul Gauthier
1f9e9be64d aider: Replaced use_ctags with map_tokens parameter in RepoMap class.
# Context:
USER: update `aider/repomap.py`
ASSISTANT: Please provide the specific changes you would like me to make in the `aider/repomap.py` file.
USER: to use map_tokens not `use_ctags`
ASSISTANT: I will update the `aider/repomap.py` file to use `map_tokens` instead of `use_ctags`.

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def __init__(self, use_ctags=None, root=None, main_model="gpt-4", io=None):
        self.io = io

        if not root:
            root = os.getcwd()
        self.root = root

        self.load_ident_cache()
        self.load_tags_cache()

        if use_ctags is None:
            self.use_ctags = self.check_for_ctags()
        else:
            self.use_ctags = use_ctags

        self.tokenizer = tiktoken.encoding_for_model(main_model)
=======
    def __init__(self, map_tokens=1024, root=None, main_model="gpt-4", io=None):
        self.io = io

        if not root:
            root = os.getcwd()
        self.root = root

        self.load_ident_cache()
        self.load_tags_cache()

        self.map_tokens = map_tokens

        self.tokenizer = tiktoken.encoding_for_model(main_model)
>>>>>>> UPDATED
```

This change replaces the `use_ctags` parameter with the `map_tokens` parameter in the `__init__` method of the `RepoMap` class. Make sure to update any other parts of the code that use `use_ctags` to use `map_tokens` instead.
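A minimal sketch of how a call site might change (the caller shown here is hypothetical; only the constructor signature comes from the diff above):

```python
from aider.repomap import RepoMap

# Before: repo map behavior was toggled by use_ctags.
# repo_map = RepoMap(use_ctags=True, main_model="gpt-4")

# After: the size of the repo map is controlled by a token budget instead.
repo_map = RepoMap(map_tokens=1024, main_model="gpt-4")
```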
2023-06-03 17:48:45 -07:00
Paul Gauthier
575c6e2db6 Stop closing 2023-06-02 16:40:21 -07:00
Paul Gauthier
c5ed12e0c3 aider: Replaced shelve with diskcache for persistent dictionary-like object.
# Context:
USER: is there another persistent dictionary besides shelve?
ASSISTANT: Yes, there are several alternatives to `shelve` for creating persistent dictionaries in Python. Some popular options include:

1. `pickle`: A built-in Python module for object serialization and deserialization. You can use it to store dictionaries in binary format and load them back into memory when needed.

2. `json`: Another built-in Python module for working with JSON data. You can use it to store dictionaries in a human-readable format and load them back into memory when needed.

3. `sqlite3`: A built-in Python module for working with SQLite databases. You can use it to create a simple key-value store with SQL queries.

4. `TinyDB`: A lightweight, document-oriented database for Python
^C KeyboardInterrupt
USER: i want something which presents as a dict though
ASSISTANT: In that case, you can use the `sqlite3` module to create a custom class that behaves like a dictionary. Here's an example of how you can create a persistent dictionary using SQLite:

```python
import sqlite3

class
^C KeyboardInterrupt
USER: i don't want to build something. i want an existing package.
ASSISTANT: I apologize for the confusion. In that case, you can use the `diskcache` package, which provides a dictionary-like interface for disk-based storage. It is easy to use and has good performance.

To install `diskcache`, you can use pip:

```
pip install diskcache
```

Here's an example of how to use `diskcache`:

```python
from diskcache import Cache

cache = Cache('path/to/cache/directory')

# Set a key-value pair
cache['key'] = 'value'

# Get the value for a key
value = cache['key']

# Check if a key is in the cache
if 'key' in cache:
    print('Key exists')

# Delete a key-value pair
del cache['key']

# Close the cache when done
cache.close()
```

This will create a persistent dictionary-like object using the `diskcache` package. The data will be stored on disk in the specified directory, and you can use the standard dictionary methods to interact with it.
USER: great. use that.
ASSISTANT: To use `diskcache` in the `aider/repomap.py` file, you'll need to replace the `shelve` usage with `diskcache`. Here are the changes you need to make:

1. Install `diskcache` using pip:

```
pip install diskcache
```

2. Modify the `aider/repomap.py` file:

```python
aider/repomap.py
<<<<<<< ORIGINAL
import colorsys
import json
import os
import random
import subprocess
import sys
import tempfile
from collections import Counter, defaultdict

# import shelve
=======
import colorsys
import json
import os
import random
import subprocess
import sys
import tempfile
from collections import Counter, defaultdict
from diskcache import Cache

>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    IDENT_CACHE_FILE = ".aider.ident.cache"
    TAGS_CACHE_FILE = ".aider.tags.cache"
=======
    IDENT_CACHE_DIR = ".aider.ident.cache"
    TAGS_CACHE_DIR = ".aider.tags.cache"
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def load_tags_cache(self):
        self.TAGS_CACHE = dict()  # shelve.open(self.TAGS_CACHE_FILE)

    def save_tags_cache(self):
        # self.TAGS_CACHE.sync()
        pass

    def load_ident_cache(self):
        self.IDENT_CACHE = dict()  # shelve.open(self.IDENT_CACHE_FILE)

    def save_ident_cache(self):
        # self.IDENT_CACHE.sync()
        pass
=======
    def load_tags_cache(self):
        self.TAGS_CACHE = Cache(self.TAGS_CACHE_DIR)

    def save_tags_cache(self):
        self.TAGS_CACHE.close()

    def load_ident_cache(self):
        self.IDENT_CACHE = Cache(self.IDENT_CACHE_DIR)

    def save_ident_cache(self):
        self.IDENT_CACHE.close()
>>>>>>> UPDATED
```

These changes will replace the usage of `shelve` with `diskcache` for the `IDENT_CACHE` and `TAGS_CACHE` dictionaries. Make sure to install the `diskcache` package before running the updated code.
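For reference, a standalone sketch (keys and values are illustrative) of why this is a drop-in change: `Cache` exposes the same dict-style access the code already uses, but persists to disk across runs.

```python
from diskcache import Cache

# Dict-style access, backed by files in the given directory.
tags_cache = Cache(".aider.tags.cache")
tags_cache["aider/repomap.py"] = ["to_tree", "get_ranked_tags_map"]
tags_cache.close()

# Reopening the same directory sees the previously stored entries.
reopened = Cache(".aider.tags.cache")
assert reopened["aider/repomap.py"] == ["to_tree", "get_ranked_tags_map"]
reopened.close()
```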
2023-06-02 16:36:03 -07:00
Paul Gauthier
f5b68a4f69 personalization 2023-06-02 13:28:40 -07:00
Paul Gauthier
68434cbaa1 cleanup 2023-06-02 13:24:29 -07:00
Paul Gauthier
47293c5ba9 removed graph 2023-06-02 13:22:25 -07:00
Paul Gauthier
36f21fcfb6 refac 2023-06-02 13:22:00 -07:00
Paul Gauthier
360335f624 small repo map 2023-06-02 13:09:52 -07:00
Paul Gauthier
ac21a4a992 wip 2023-06-02 12:54:06 -07:00
Paul Gauthier
777a36d366 aider: Modified get_ranked_tags_map to use binary search.
# Context:
USER: make `get_ranked_tags_map` use binary search
ASSISTANT: To modify the `get_ranked_tags_map` function to use binary search, follow these steps:

1. Replace the for loop with a binary search implementation.
2. Update the loop to calculate the middle index and generate the tree with the middle index.
3. Check if the number of tokens is less than `max_map_tokens`. If it is, update the lower bound. Otherwise, update the upper bound.
4. Continue the binary search until the lower bound is greater than or equal to the upper bound.

Here's the updated `get_ranked_tags_map` function:

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def get_ranked_tags_map(self, fnames):
        # 1/4 of gpt-4's context window
        max_map_tokens = 2048

        ranked_tags = self.get_ranked_tags(fnames)
        num_tags = len(ranked_tags)

        for i in range(num_tags, 0, -1):
            tree = to_tree(ranked_tags[:i])
            num_tokens = self.token_count(tree)
            dump(i, num_tokens)
            if num_tokens < max_map_tokens:
                return tree
=======
    def get_ranked_tags_map(self, fnames):
        # 1/4 of gpt-4's context window
        max_map_tokens = 2048

        ranked_tags = self.get_ranked_tags(fnames)
        num_tags = len(ranked_tags)

        lower_bound = 0
        upper_bound = num_tags
        best_tree = None

        while lower_bound <= upper_bound:
            middle = (lower_bound + upper_bound) // 2
            tree = to_tree(ranked_tags[:middle])
            num_tokens = self.token_count(tree)

            if num_tokens < max_map_tokens:
                best_tree = tree
                lower_bound = middle + 1
            else:
                upper_bound = middle - 1

        return best_tree
>>>>>>> UPDATED
```
2023-06-02 12:46:03 -07:00
Paul Gauthier
1bbe8e63af find the best tags map that fits in max_tokens 2023-06-02 12:44:46 -07:00
Paul Gauthier
f92249a7ea refac 2023-06-02 12:40:40 -07:00
Paul Gauthier
e3aed997bc ranked_tags 2023-06-02 12:34:28 -07:00
Paul Gauthier
6e51675072 graph 2023-06-02 11:25:10 -07:00
Paul Gauthier
903d318a8a no weights 2023-06-02 10:55:43 -07:00
Paul Gauthier
746f4ccb56 rank flows evenly *out* from a node 2023-06-01 14:58:15 -07:00
Paul Gauthier
33f649dbbd noop 2023-06-01 14:54:04 -07:00
Paul Gauthier
594c4b600b cleanup 2023-06-01 13:24:52 -07:00
Paul Gauthier
bb0d58020c use in-edges 2023-06-01 13:12:15 -07:00
Paul Gauthier
915a9132de working ranked defs 2023-06-01 11:26:49 -07:00
Paul Gauthier
27566353a6 ssh 2023-06-01 10:57:48 -07:00
Paul Gauthier
11a7f9a80a aider: Calculated the sum of the weights of every inbound edge to each node and stored the result in the inbound_weights dictionary.
# Context:
USER: do the todo
ASSISTANT: I see a "todo" comment in the `aider/repomap.py` file. To complete the task, we need to calculate the sum of the weights of every inbound edge to each node in the graph. Here's how to do it:

```python
aider/repomap.py
<<<<<<< ORIGINAL
    for node in G.nodes:
        # todo: sum up the weight of every inbound edge to node
        pass
=======
    inbound_weights = {}
    for node in G.nodes:
        inbound_weights[node] = sum(data["weight"] for _, _, data in G.in_edges(node, data=True))
>>>>>>> UPDATED
```

This code calculates the sum of the weights of every inbound edge to each node and stores the result in the `inbound_weights` dictionary.
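A small self-contained check (the graph here is illustrative, not from the repo) showing that this sum matches networkx's weighted in-degree:

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("a", "b", weight=2.0)
G.add_edge("c", "b", weight=3.0)

# Summing the weight of every inbound edge by hand...
inbound_weights = {
    node: sum(data["weight"] for _, _, data in G.in_edges(node, data=True))
    for node in G.nodes
}

# ...is equivalent to asking networkx for the weighted in-degree.
assert inbound_weights["b"] == G.in_degree("b", weight="weight") == 5.0
```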
2023-06-01 10:56:41 -07:00
Paul Gauthier
d913d08f39 Merge branch 'main' into call-graph 2023-06-01 10:55:49 -07:00
Paul Gauthier
d1f05ca18a Handle io=none 2023-06-01 10:55:39 -07:00
Paul Gauthier
1aa8c3983d wip 2023-06-01 10:54:43 -07:00
Paul Gauthier
3c9991b38f Merge branch 'main' into call-graph 2023-06-01 09:07:26 -07:00
Paul Gauthier
7bc26e6762 Added --verbose support to repomap 2023-06-01 09:04:08 -07:00
Paul Gauthier
6567577855 works 2023-06-01 07:16:47 -07:00
Paul Gauthier
800ad12ae4 wip 2023-06-01 06:48:39 -07:00
Paul Gauthier
c46e8daa32 use networkx exclusively to maintain the graph structure 2023-05-31 16:53:42 -07:00
Paul Gauthier
b88dc7828c noop 2023-05-31 16:43:52 -07:00
Paul Gauthier
a553d71ca2 noop 2023-05-31 15:34:17 -07:00
Paul Gauthier
c635217d6a wip 2023-05-31 05:58:56 -07:00
Paul Gauthier
a216754c9f noqa 2023-05-30 17:07:12 -07:00
Paul Gauthier
4297e75b9b fixed up shelve 2023-05-30 17:06:02 -07:00
Paul Gauthier
a9b53569ac aider: Switched both caches to use shelve instead of json.
# Context:
USER: switch both caches to use shelve instead of json
ASSISTANT: To switch both caches to use `shelve` instead of `json`, we need to modify the `aider/repomap.py` file. We will update the `load_tags_cache`, `save_tags_cache`, `load_ident_cache`, and `save_ident_cache` methods.

Here are the changes:

```python
aider/repomap.py
<<<<<<< ORIGINAL
import json
=======
import shelve
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    IDENT_CACHE_FILE = ".aider.ident.cache.json"
    TAGS_CACHE_FILE = ".aider.tags.cache.json"

    def load_tags_cache(self):
        if os.path.exists(self.TAGS_CACHE_FILE):
            with open(self.TAGS_CACHE_FILE, "r") as f:
                self.TAGS_CACHE = json.load(f)
        else:
            self.TAGS_CACHE = dict()

    def save_tags_cache(self):
        with open(self.TAGS_CACHE_FILE, "w") as f:
            json.dump(self.TAGS_CACHE, f)
=======
    IDENT_CACHE_FILE = ".aider.ident.cache"
    TAGS_CACHE_FILE = ".aider.tags.cache"

    def load_tags_cache(self):
        if os.path.exists(self.TAGS_CACHE_FILE):
            with shelve.open(self.TAGS_CACHE_FILE) as db:
                self.TAGS_CACHE = dict(db)
        else:
            self.TAGS_CACHE = dict()

    def save_tags_cache(self):
        with shelve.open(self.TAGS_CACHE_FILE) as db:
            db.update(self.TAGS_CACHE)
>>>>>>> UPDATED
```

```python
aider/repomap.py
<<<<<<< ORIGINAL
    def load_ident_cache(self):
        if os.path.exists(self.IDENT_CACHE_FILE):
            with open(self.IDENT_CACHE_FILE, "r") as f:
                self.IDENT_CACHE = json.load(f)
        else:
            self.IDENT_CACHE = dict()

    def save_ident_cache(self):
        with open(self.IDENT_CACHE_FILE, "w") as f:
            json.dump(self.IDENT_CACHE, f)
=======
    def load_ident_cache(self):
        if os.path.exists(self.IDENT_CACHE_FILE):
            with shelve.open(self.IDENT_CACHE_FILE) as db:
                self.IDENT_CACHE = dict(db)
        else:
            self.IDENT_CACHE = dict()

    def save_ident_cache(self):
        with shelve.open(self.IDENT_CACHE_FILE) as db:
            db.update(self.IDENT_CACHE)
>>>>>>> UPDATED
```

These changes will replace the usage of `json` with `shelve` for both caches.
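For reference, a minimal standalone example of the `shelve` semantics relied on here (the file name is illustrative): `shelve.open` returns a persistent, dict-like object whose values are pickled to disk.

```python
import shelve

# Values are pickled to disk under the given base filename.
with shelve.open("/tmp/example.aider.cache") as db:
    db["aider/repomap.py"] = {"idents": ["to_tree"]}

# A later run can read the same entries back.
with shelve.open("/tmp/example.aider.cache") as db:
    assert db["aider/repomap.py"] == {"idents": ["to_tree"]}
```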
2023-05-30 17:03:31 -07:00
Paul Gauthier
bd57c71898 fix up json cache 2023-05-30 17:01:30 -07:00