Paul Gauthier 2023-05-22 09:41:56 -07:00
parent 96b5938827
commit 068a1edded

@@ -30,35 +30,34 @@ class objects that are required to prepare for the test.
 GPT-4 is great at "self contained" coding tasks, like writing or
 modifying a pure function with no external dependencies. These work
-well because you can send GPT a self-contained request ("write a
-Fibonacci function") and it can create new code from whole cloth. Or
-you can send it an existing function implementation and ask for self-contained
-changes ("rewrite the loop using list
-comprehensions"). These require no context beyond the code being
-discussed.
+well because you can send GPT a self-contained request like "write a
+Fibonacci function" or "rewrite the loop using list
+comprehensions". These changes require no context beyond the code
+being discussed.

-Most real code is not pure and self-contained. And many common code
-changes require you to understand related code from many different files in a
-repo. If you want GPT to "switch all the print statements in Foo to
-use the logging system", it needs to see the code in the Foo class
-with the prints, and it also needs to understand how the project's logging
-system works.
+Most real code is not pure and self-contained, it is intertwined with
+code from many different files in a repo.
+If you ask GPT to "switch all the print statements in class Foo to
+use the BarLog logging system", it needs to see the code in the Foo class
+with the prints, and it also needs to understand how the project's BarLog
+logging system works.

 A simple solution is to **send the entire codebase** to GPT along with
 each change request. Now GPT has all the context! But even moderately
-sized repos won't fit in the 8k-token context window. An
+sized repos won't fit in the 8k-token context window.
+An
 improved approach is to be selective, and **hand pick which files to send**.
 For the example above, you could send the file that
 contains Foo and the file that contains the logging subsystem.
 This works pretty well, and is how `aider` previously worked. You
 manually specify which files to "add to the chat".

-But it's not ideal to have to manually identify and curate the right
-set of files to add to the chat. It can get complicated, as
-some changes will need context from many files. And you might still overrun
-the context window if you need to add too many files for context,
-many of which aren't going to end up being modified.
+But it's not ideal to have to manually identify the right
+set of files to add to the chat.
+Some changes may need context from many files.
+And you might still overrun
+the context window if you need to add many files for context.

 ## Using a repo map to provide context