---
parent: Troubleshooting
nav_order: 25
---

# Token limits

Every LLM has limits on how many tokens it can process for each request:

- The model's **context window** limits how many total tokens of
  *input and output* it can process.
- Each model has a limit on how many **output tokens** it can
  produce.
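
The two limits interact: input and output together must also fit inside the
context window, so a large input can leave too little room for the reply.
Here is a minimal sketch of that arithmetic in Python; the function is
illustrative only, not aider's actual accounting, and the limits are
gpt-3.5-turbo's published figures:

```python
# A rough sketch of the two token budgets; NOT aider's actual accounting.
# The limits below are gpt-3.5-turbo's: a 16,385-token context window
# and a 4,096-token output cap.
CONTEXT_WINDOW = 16_385
OUTPUT_LIMIT = 4_096

def exceeded_limits(input_tokens: int, output_tokens: int) -> list[str]:
    """Return which limit(s) a request ran into, if any."""
    problems = []
    # A reply that reaches the output cap exactly was almost certainly
    # truncated, so treat hitting the cap as exceeding it.
    if output_tokens >= OUTPUT_LIMIT:
        problems.append("output limit")
    # Input and output together must also fit in the context window.
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        problems.append("context window")
    return problems

# The numbers from the example error shown below:
print(exceeded_limits(768, 4096))  # -> ['output limit']
```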

Aider will report an error if a model's response indicates that
it has exceeded a token limit.
The error will include suggested actions to try to
avoid hitting token limits.
Here's an example error:

```
Model gpt-3.5-turbo has hit a token limit!

Input tokens: 768 of 16385
Output tokens: 4096 of 4096 -- exceeded output limit!
Total tokens: 4864 of 16385

To reduce output tokens:
- Ask for smaller changes in each request.
- Break your code into smaller source files.
- Try using a stronger model like gpt-4o or opus that can return diffs.

For more info: https://aider.chat/docs/token-limits.html
```

## Input tokens & context window size

The most common problem is trying to send too much data to a
model, overflowing its context window.
Technically you can exhaust the context window if the input alone
is too large, or if the input plus the output together are too large.

Strong models like GPT-4o and Opus have quite
large context windows, so this sort of error is
typically only an issue when working with weaker models.

The easiest solution is to try to reduce the input tokens
by removing files from the chat.
It's best to only add the files that aider will need to *edit*
to complete your request.

- Use `/tokens` to see token usage.
- Use `/drop` to remove unneeded files from the chat session.
- Use `/clear` to clear the chat history.
- Break your code into smaller source files.
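
For example, a session that keeps input tokens down might look like this.
The file names are hypothetical, the parenthetical notes are annotations
for the reader rather than part of the commands, and the commands' output
is omitted:

```
$ aider src/app.py tests/test_app.py

> /tokens                    (report what is using the context window)
> /drop tests/test_app.py    (remove a file that's no longer needed)
> /clear                     (discard the chat history before a new task)
```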

## Output token limits

Most models have quite small output limits, often as low
as 4k tokens.
If you ask aider to make a large change that affects a lot
of code, the LLM may hit output token limits
as it tries to send back all the changes.

To avoid hitting output token limits:

- Ask for smaller changes in each request.
- Break your code into smaller source files.
- Use a strong model like gpt-4o, sonnet or opus that can return diffs.
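
For the last point, switching models is just a command-line flag.
A sketch, with `myfile.py` standing in for your own file; other model
names can be passed the same way (see
[aider's model docs](https://aider.chat/docs/llms.html) for current names):

```
# Relaunch aider on a stronger model that supports diff-style edits.
$ aider --model gpt-4o myfile.py
```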

## Other causes

Sometimes token limit errors are caused by
non-compliant API proxy servers
or bugs in the API server you are using to host a local model.
Aider has been well tested when directly connecting to major
[LLM provider cloud APIs](https://aider.chat/docs/llms.html).
For serving local models,
[Ollama](https://aider.chat/docs/llms/ollama.html) is known to work well with aider.

Try using aider without an API proxy server
or directly with one of the recommended cloud APIs
and see if your token limit problems resolve.
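
For instance, if you had pointed aider at a proxy via the `OPENAI_API_BASE`
environment variable, a direct-connection test could look like the sketch
below; the variable only matters if you set it, `<your-key>` is a
placeholder, and `myfile.py` stands in for your own file:

```
# Stop routing through the proxy and talk to the provider directly.
$ unset OPENAI_API_BASE
$ export OPENAI_API_KEY=<your-key>
$ aider --model gpt-4o myfile.py
```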