Commit graph

2795 commits

Author SHA1 Message Date
Paul Gauthier
35ccfac3fd aider: Updated the output of /tokens to include the total cost.
# Aider chat conversation:

USER: udpate the output of /tokens to include the total $ cost

ASSISTANT: To include the total cost in the output of the `/tokens` command, we need to modify the `cmd_tokens` method in the `Commands` class in the `aider/commands.py` file.

We will add a new variable `total_cost` to keep track of the total cost. For each token count, we will calculate the cost by multiplying the token count by the model's `completion_price` (since we're considering the cost of generating these tokens). We will then add this cost to `total_cost`.

Finally, we will add a line to the output to display the total cost.

Here are the changes:

<source>python
aider/commands.py
<<<<<<< ORIGINAL
        total = 0
        for tk, msg, tip in res:
            total += tk
            msg = msg.ljust(col_width)
            self.io.tool_output(f"{fmt(tk)} {msg} {tip}")

        self.io.tool_output("=" * width)
        self.io.tool_output(f"{fmt(total)} tokens total")
=======
        total = 0
        total_cost = 0.0
        for tk, msg, tip in res:
            total += tk
            cost = tk * self.coder.main_model.completion_price
            total_cost += cost
            msg = msg.ljust(col_width)
            self.io.tool_output(f"{fmt(tk)} {msg} {tip} (cost: ${cost:.2f})")

        self.io.tool_output("=" * width)
        self.io.tool_output(f"{fmt(total)} tokens total (cost: ${total_cost:.2f})")
>>>>>>> UPDATED
</source>
2023-07-26 17:34:50 -03:00
Paul Gauthier
08ee61617f Updated discord link 2023-07-26 15:52:13 -03:00
Paul Gauthier
3e4b4d1b0d retry on APIConnectionError 2023-07-26 07:40:36 -03:00
Paul Gauthier
8ea4629e73 Merge branch 'main' into chat-history 2023-07-26 07:29:54 -03:00
paul-gauthier
86309f336c
Merge pull request #137 from paul-gauthier/refactor-repo
Refactor git repo code into a new file
2023-07-26 07:24:10 -03:00
Paul Gauthier
3e77df7587 /tokens should warn when <1k tokens left 2023-07-25 09:25:01 -03:00
Paul Gauthier
133eff8bdf Merge branch 'refactor-repo' into chat-history 2023-07-25 09:20:22 -03:00
Paul Gauthier
5bb3935f7e ssh 2023-07-25 09:19:01 -03:00
Paul Gauthier
b3746db08b Merge branch 'main' into chat-history 2023-07-25 09:13:28 -03:00
Paul Gauthier
53e332526b Merge branch 'main' into refactor-repo 2023-07-25 09:13:20 -03:00
Paul Gauthier
05c1b3bc99 cleanup 2023-07-25 09:13:07 -03:00
Paul Gauthier
9050b9bd93 threading 2023-07-25 07:57:05 -03:00
Paul Gauthier
4e348102bf wip 2023-07-25 07:46:05 -03:00
Paul Gauthier
b849d16ec7 Updated HISTORY 2023-07-25 07:22:22 -03:00
Paul Gauthier
fcdd640a09 Strong prompt to never elide code 2023-07-25 07:04:14 -03:00
Paul Gauthier
f0ec6cb36a Expand the tail to ensure head ends with assistant msg 2023-07-25 06:58:41 -03:00
Paul Gauthier
4b41eb7525 aider: Fix variable name error in summarize method in history.py.
# Aider chat conversation:

USER:   File "/Users/gauthier/Projects/aider/aider/coders/base_coder.py", line 395, in run_loop
    return self.send_new_user_message(inp)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/coders/base_coder.py", line 479, in send_new_user_message
    self.move_back_cur_messages(saved_message)
  File "/Users/gauthier/Projects/aider/aider/coders/base_coder.py", line 365, in move_back_cur_messages
    self.done_messages = self.summarizer.summarize(self.done_messages)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/history.py", line 60, in summarize
    tail_tokens = sum(tokens for tokens, msg in sized[num:])
                                                      ^^^
NameError: name 'num' is not defined. Did you mean: 'sum'?

ASSISTANT: It seems like there's a typo in the code. The variable `num` is not defined in the current scope. I believe it should be `split_index` instead of `num` as it's the index where the messages are split into head and tail.

Here's the change that needs to be made in the `aider/history.py` file:

```python
aider/history.py
<<<<<<< ORIGINAL
        tail_tokens = sum(tokens for tokens, msg in sized[num:])
=======
        tail_tokens = sum(tokens for tokens, msg in sized[split_index:])
>>>>>>> UPDATED
```

Please confirm if this is the change you want to make.
2023-07-24 19:10:58 -03:00
Paul Gauthier
3c656fe3be aider: Ensure the head ends with an assistant message.
# Aider chat conversation:

USER: but we want the head to end with an assistant msg

ASSISTANT: To ensure that the head ends with an assistant message, we need to modify the `summarize` method in the `ChatSummary` class in the `aider/history.py` file.

After splitting the messages based on the token count, we need to check if the last message in the head is from the assistant. If not, we need to move messages from the head to the tail until we find an assistant message.

Here is the *edit block* for this change:

```python
aider/history.py
<<<<<<< ORIGINAL
    def summarize(self, messages):
        if len(messages) < 2:
            return messages

        sized = self.tokenize(messages)
        total = sum(tokens for tokens, _msg in sized)
        if total <= self.max_tokens:
            return messages

        tail_tokens = 0
        split_index = len(messages)
        half_max_tokens = self.max_tokens // 2

        # Iterate over the messages in reverse order
        for i in range(len(sized)-1, -1, -1):
            tokens, _msg = sized[i]
            if tail_tokens + tokens < half_max_tokens:
                tail_tokens += tokens
                split_index = i
            else:
                break

        head = messages[:split_index]
        tail = messages[split_index:]
=======
    def summarize(self, messages):
        if len(messages) < 2:
            return messages

        sized = self.tokenize(messages)
        total = sum(tokens for tokens, _msg in sized)
        if total <= self.max_tokens:
            return messages

        tail_tokens = 0
        split_index = len(messages)
        half_max_tokens = self.max_tokens // 2

        # Iterate over the messages in reverse order
        for i in range(len(sized)-1, -1, -1):
            tokens, _msg = sized[i]
            if tail_tokens + tokens < half_max_tokens:
                tail_tokens += tokens
                split_index = i
            else:
                break

        # Ensure the head ends with an assistant message
        while messages[split_index - 1]["role"] != "assistant" and split_index < len(messages):
            split_index += 1

        head = messages[:split_index]
        tail = messages[split_index:]
>>>>>>> UPDATED
```

This change ensures that the head of the messages ends with an assistant message.
2023-07-24 19:08:51 -03:00
Paul Gauthier
270ea21c05 aider: Change the summarize method in the ChatSummary class to split the messages so that the tail holds as close as possible but less than max_tokens//2 tokens and the head holds the rest.
# Aider chat conversation:

USER: instead of splitting the history in half based on the msg number, split it in half based on the token count

ASSISTANT: To implement this change, we need to modify the `summarize` method in the `ChatSummary` class in the `aider/history.py` file.

The current implementation splits the messages in half based on the number of messages. We need to change this to split the messages based on the token count.

Here are the steps to implement this change:

1. Calculate the total token count of all messages.
2. Iterate over the messages and keep a running total of the token count.
3. Once the running total exceeds half of the total token count, split the messages at that point.

Here is the *edit block* for this change:

```python
aider/history.py
<<<<<<< ORIGINAL
    def summarize(self, messages):
        num = len(messages)
        dump(num)
        if num < 2:
            return messages

        sized = self.tokenize(messages)
        total = sum(tokens for tokens, _msg in sized)
        if total <= self.max_tokens:
            return messages

        num = num // 2

        # we want the head to end with an assistant msg
        while messages[num]["role"] == "assistant" and num < len(messages) - 1:
            num += 1

        head = messages[:num]
        tail = messages[num:]
=======
    def summarize(self, messages):

^C KeyboardInterrupt

USER: instead of splitting the history in half based on the msg number, split it so that the tail holds as close as possible but less than `max_tokens`//2 tokens and head holds the rest

ASSISTANT: To implement this change, we need to modify the `summarize` method in the `ChatSummary` class in the `aider/history.py` file.

The current implementation splits the messages in half based on the number of messages. We need to change this to split the messages so that the tail holds as close as possible but less than `max_tokens`//2 tokens and head holds the rest.

Here are the steps to implement this change:

1. Iterate over the messages in reverse order and keep a running total of the token count.
2. Once the running total exceeds `max_tokens`//2, split the messages at that point.

Here is the *edit block* for this change:

```python
aider/history.py
<<<<<<< ORIGINAL
    def summarize(self, messages):
        num = len(messages)
        dump(num)
        if num < 2:
            return messages

        sized = self.tokenize(messages)
        total = sum(tokens for tokens, _msg in sized)
        if total <= self.max_tokens:
            return messages

        num = num // 2

        # we want the head to end with an assistant msg
        while messages[num]["role"] == "assistant" and num < len(messages) - 1:
            num += 1

        head = messages[:num]
        tail = messages[num:]
=======
    def summarize(self, messages):
        if len(messages) < 2:
            return messages

        sized = self.tokenize(messages)
        total = sum(tokens for tokens, _msg in sized)
        if total <= self.max_tokens:
            return messages

        tail_tokens = 0
        split_index = len(messages)
        half_max_tokens = self.max_tokens // 2

        # Iterate over the messages in reverse order
        for i in range(len(sized)-1, -1, -1):
            tokens, _msg = sized[i]
            if tail_tokens + tokens < half_max_tokens:
                tail_tokens += tokens
                split_index = i
            else:
                break

        head = messages[:split_index]
        tail = messages[split_index:]
>>>>>>> UPDATED
```

This change ensures that the tail of the messages holds as close as possible but less than `max_tokens`//2 tokens and the head holds the rest.
2023-07-24 19:08:13 -03:00
Paul Gauthier
0db9d210d7 Advise the user when chat history is being summarized 2023-07-24 19:05:57 -03:00
Paul Gauthier
844194c268 Merge branch 'refactor-repo' into chat-history 2023-07-24 18:48:26 -03:00
Paul Gauthier
c8f1c0609d tests and bugfix 2023-07-24 17:47:53 -03:00
Paul Gauthier
957f59c011 updated help 2023-07-24 17:35:47 -03:00
Paul Gauthier
cac627ae5f aider: Add None as the third argument in all calls to GitRepo.
# Aider chat conversation:

USER: `GitRepo` now takes a 3rd arg. add None as the missing arg in all calls

ASSISTANT: To add `None` as the third argument in all calls to `GitRepo`, we need to modify the `tests/test_repo.py` file.

Here are the changes:

```python
tests/test_repo.py
<<<<<<< ORIGINAL
        repo = GitRepo(InputOutput(), None)
=======
        repo = GitRepo(InputOutput(), None, None)
>>>>>>> UPDATED
```

```python
tests/test_repo.py
<<<<<<< ORIGINAL
        repo = GitRepo(InputOutput(), None)
=======
        repo = GitRepo(InputOutput(), None, None)
>>>>>>> UPDATED
```

```python
tests/test_repo.py
<<<<<<< ORIGINAL
        repo = GitRepo(InputOutput(), None)
=======
        repo = GitRepo(InputOutput(), None, None)
>>>>>>> UPDATED
```

```python
tests/test_repo.py
<<<<<<< ORIGINAL
        tracked_files = GitRepo(InputOutput(), [tempdir]).get_tracked_files()
=======
        tracked_files = GitRepo(InputOutput(), [tempdir], None).get_tracked_files()
>>>>>>> UPDATED
```

```python
tests/test_repo.py
<<<<<<< ORIGINAL
            git_repo = GitRepo(InputOutput(), None)
=======
            git_repo = GitRepo(InputOutput(), None, None)
>>>>>>> UPDATED
```
2023-07-24 17:32:09 -03:00
Paul Gauthier
64c50bab48 Accept either a git dname or a list of fnames 2023-07-24 17:31:00 -03:00
Paul Gauthier
256dd7bbd9 Merge branch 'main' into refactor-repo 2023-07-24 15:22:20 -03:00
Paul Gauthier
d146398434 set version to 0.10.1-dev 2023-07-24 15:12:37 -03:00
Paul Gauthier
1f91735e88 version bump to 0.10.1 2023-07-24 15:12:18 -03:00
Paul Gauthier
ece45c7dc2 Bugfix versionbump 2023-07-24 15:12:15 -03:00
Paul Gauthier
62344a39cc Updated HISTORY 2023-07-24 15:06:58 -03:00
Paul Gauthier
d534dffe0a Updated HISTORY 2023-07-24 15:06:21 -03:00
Paul Gauthier
ccd5089e93 Merge branch 'main' into chat-history 2023-07-24 14:51:02 -03:00
Paul Gauthier
0fa4c8501d Merge branch 'main' into refactor-repo 2023-07-24 14:50:56 -03:00
Paul Gauthier
b006b39c6f GPT should ask users to add files to the chat 2023-07-24 14:49:12 -03:00
Paul Gauthier
7804925e42 Merge branch 'main' into chat-history 2023-07-24 14:24:40 -03:00
Paul Gauthier
94a271d522 Merge branch 'main' into refactor-repo 2023-07-24 14:24:20 -03:00
Paul Gauthier
17729280c6 Merge remote-tracking branch 'origin/main' 2023-07-24 14:05:45 -03:00
Paul Gauthier
745b3c320b Scan the git index using entries, which does not rely on a prior commit 2023-07-24 14:00:35 -03:00
Paul Gauthier
66da82094d use index.entries 2023-07-24 13:47:33 -03:00
Paul Gauthier
df096272bc aider: Refactored get_tracked_files method to use index.entries instead of index.diff to enumerate staged files.
# Aider chat conversation:

USER: is there any way to enumerate the staged files using the index besides index.diff

ASSISTANT: Yes, you can enumerate the staged files using the `index.entries` property of the `IndexFile` object. This property returns a dictionary-like object where the keys are the file paths and the values are `IndexEntry` objects.

Here's how you can modify the `get_tracked_files` method in the `GitRepo` class to use `index.entries` instead of `index.diff`:

```python
aider/repo.py
<<<<<<< ORIGINAL
    def get_tracked_files(self):
        if not self.repo:
            return []

        try:
            commit = self.repo.head.commit
        except ValueError:
            commit = None

        files = []
        if commit:
            for blob in commit.tree.traverse():
                if blob.type == "blob":  # blob is a file
                    files.append(blob.path)

        # Add staged files
        index = self.repo.index
        staged_files = [item.a_path for item in index.diff(None)]
        dump(staged_files)
        files.extend(staged_files)

        # convert to appropriate os.sep, since git always normalizes to /
        res = set(str(Path(PurePosixPath(path))) for path in files)

        return res
=======
    def get_tracked_files(self):
        if not self.repo:
            return []

        try:
            commit = self.repo.head.commit
        except ValueError:
            commit = None

        files = []
        if commit:
            for blob in commit.tree.traverse():
                if blob.type == "blob":  # blob is a file
                    files.append(blob.path)

        # Add staged files
        index = self.repo.index
        staged_files = [str(Path(PurePosixPath(path))) for path in index.entries.keys()]
        dump(staged_files)
        files.extend(staged_files)

        # convert to appropriate os.sep, since git always normalizes to /
        res = set(str(Path(PurePosixPath(path))) for path in files)

        return res
>>>>>>> UPDATED
```

This change will make the `get_tracked_files` method return all files in the index, not just the ones that have been modified. If you only want to get the files that have been modified, you should stick with using `index.diff`.
2023-07-24 13:45:10 -03:00
Paul Gauthier
f87ec8c38b Refactor getting staged files to handle ValueError. 2023-07-24 13:44:22 -03:00
Paul Gauthier
89542d2139 aider: Handle the case where the repository has never had any commits by using index.diff(None) to get the staged files.
# Aider chat conversation:

USER: index.diff("HEAD") fails when the repo has never had any commits

ASSISTANT: To handle the case where the repository has never had any commits, we can add a try-except block around the `index.diff("HEAD")` call in the `get_tracked_files` method of the `GitRepo` class in the `aider/repo.py` file. If a `git.exc.BadName` exception is raised, we can assume that there are no commits and return an empty list for the staged files.

Here is the proposed change:

```python
aider/repo.py
<<<<<<< ORIGINAL
        # Add staged files
        index = self.repo.index
        staged_files = [item.a_path for item in index.diff("HEAD")]
        files.extend(staged_files)
=======
        # Add staged files

^C KeyboardInterrupt

USER: there can be staged files when there are not commits

ASSISTANT: You're correct. If there are no commits, `index.diff("HEAD")` will fail, but there can still be staged files. We can use `index.diff(None)` to get the staged files in this case. Here's the proposed change:

```python
aider/repo.py
<<<<<<< ORIGINAL
        # Add staged files
        index = self.repo.index
        staged_files = [item.a_path for item in index.diff("HEAD")]
        files.extend(staged_files)
=======
        # Add staged files
        index = self.repo.index
        try:
            staged_files = [item.a_path for item in index.diff("HEAD")]
        except git.exc.BadName:
            staged_files = [item.a_path for item in index.diff(None)]
        files.extend(staged_files)
>>>>>>> UPDATED
```
2023-07-24 13:43:05 -03:00
Paul Gauthier
bef1d00ab1 Refactor test_get_tracked_files_with_new_staged_file to use a GitTemporaryDirectory and add new assertions. 2023-07-24 13:42:05 -03:00
Paul Gauthier
96f15c4533 Be sure to include staged files in get_tracked_files 2023-07-24 13:05:00 -03:00
Paul Gauthier
85166341d1 aider: Add staged files to the list of tracked files in the get_tracked_files method.
# Aider chat conversation:

USER: doing commit.tree.traverse() is not picking up files which have been git added but not yet committed

ASSISTANT: To include files that have been staged but not yet committed, you need to use the `git diff --cached` command. This command shows the changes between the index and your last commit; in other words, what you would be committing if you run "git commit" without "-a" option.

However, this change requires modification of the `aider/repo.py` file, which is currently read-only. Could you please make it read-write so that I can propose the necessary changes?

USER: I added these *read-write* files: aider/repo.py

ASSISTANT: Great, now that we have write access to `aider/repo.py`, we can modify the `get_tracked_files` method to include files that have been staged but not yet committed.

Here are the steps:

1. We will use the `git diff --cached --name-only` command to get a list of staged files. This command lists the file names that have been added (staged) but not yet committed.

2. We will add these staged files to the list of tracked files.

Here is the *edit block* for the changes:

```python
aider/repo.py
<<<<<<< ORIGINAL
    def get_tracked
^C KeyboardInterrupt

USER: is there a way to use traverse?

ASSISTANT: Yes, you can use the `git.IndexFile` object to traverse the staged files. The `IndexFile` object represents the git index, which is where files are staged before they are committed.

Here are the steps:

1. Get the `IndexFile` object from the `Repo` object.
2. Traverse the `IndexFile` object to get the staged files.

Here is the *edit block* for the changes:

```python
aider/repo.py
<<<<<<< ORIGINAL
    def get_tracked_files(self):
        if not self.repo:
            return []

        try:
            commit = self.repo.head.commit
        except ValueError:
            return set()

        files = []
        for blob in commit.tree.traverse():
            if blob.type == "blob":  # blob is a file
                files.append(blob.path)

        # convert to appropriate os.sep, since git always normalizes to /
        res = set(str(Path(PurePosixPath(path))) for path in files)

        return res
=======
    def get_tracked_files(self):
        if not self.repo:
            return []

        try:
            commit = self.repo.head.commit
        except ValueError:
            return set()

        files = []
        for blob in commit.tree.traverse():
            if blob.type == "blob":  # blob is a file
                files.append(blob.path)

        # Add staged files
        index = self.repo.index
        staged_files = [item.a_path for item in index.diff("HEAD")]
        files.extend(staged_files)

        # convert to appropriate os.sep, since git always normalizes to /
        res = set(str(Path(PurePosixPath(path))) for path in files)

        return res
>>>>>>> UPDATED
```

This change will ensure that the `get_tracked_files` method returns both committed and staged files.
2023-07-24 13:03:38 -03:00
paul-gauthier
8c6b50c64f
Update install.md 2023-07-23 16:18:30 -03:00
paul-gauthier
e63ae71e3e
Update install.md 2023-07-23 16:17:50 -03:00
Paul Gauthier
e52807b868 cleanup 2023-07-23 10:31:59 -03:00
Paul Gauthier
94c4aad2db Merge branch 'main' into chat-history 2023-07-23 10:23:45 -03:00
Paul Gauthier
143ca36733 Merge branch 'main' into refactor-repo 2023-07-23 10:23:38 -03:00