mirror of https://github.com/Aider-AI/aider.git synced 2025-05-25 14:55:00 +00:00

Paul Gauthier 0a95badf3f Updated docs

2024-06-21 08:57:30 -07:00

3.4 KiB

Raw Blame History

parent	highlight_image	nav_order	description
More info	/assets/robot-ast.png	900	Aider uses a map of your git repository to provide code context to LLMs.

Repository map

Aider uses a concise map of your whole git repository that includes the most important classes and functions along with their types and call signatures. This helps aider understand the code it's editing and how it relates to the other parts of the codebase. The repo map also helps aider write new code that respects and utilizes existing libraries, modules and abstractions found elsewhere in the codebase.

Using a repo map to provide context

Aider sends a repo map to the LLM along with each change request from the user. The repo map contains a list of the files in the repo, along with the key symbols which are defined in each file. It shows how each of these symbols are defined, by including the critical lines of code for each definition.

Here's a part of the repo map of aider's repo, for base_coder.py and commands.py :

aider/coders/base_coder.py:
⋮...
│class Coder:
│    abs_fnames = None
⋮...
│    @classmethod
│    def create(
│        self,
│        main_model,
│        edit_format,
│        io,
│        skip_model_availabily_check=False,
│        **kwargs,
⋮...
│    def abs_root_path(self, path):
⋮...
│    def run(self, with_message=None):
⋮...

aider/commands.py:
⋮...
│class Commands:
│    voice = None
│
⋮...
│    def get_commands(self):
⋮...
│    def get_command_completions(self, cmd_name, partial):
⋮...
│    def run(self, inp):
⋮...

Mapping out the repo like this provides some key benefits:

The LLM can see classes, methods and function signatures from everywhere in the repo. This alone may give it enough context to solve many tasks. For example, it can probably figure out how to use the API exported from a module just based on the details shown in the map.
If it needs to see more code, the LLM can use the map to figure out which files it needs to look at. The LLM can ask to see these specific files, and aider will offer to add them to the chat context.

Optimizing the map

Of course, for large repositories even just the repo map might be too large for the LLM's context window. Aider solves this problem by sending just the most relevant portions of the repo map. It does this by analyzing the full repo map using a graph ranking algorithm, computed on a graph where each source file is a node and edges connect files which have dependencies. Aider optimizes the repo map by selecting the most important parts of the codebase which will fit into the token budget assigned by the user (via the --map-tokens switch, which defaults to 1k tokens).

The sample map shown above doesn't contain every class, method and function from those files. It only includes the most important identifiers, the ones which are most often referenced by other portions of the code. These are the key pieces of context that the LLM needs to know to understand the overall codebase.

More info

Please check the repo map article on aider's blog for more information on aider's repository map and how it is constructed.

3.4 KiB Raw Blame History

Repository map

Using a repo map to provide context

Optimizing the map

More info

3.4 KiB

Raw Blame History