improved token limit err msgs and docs #678

parent 0fc6b9beaa
commit a30e656304

3 changed files with 150 additions and 7 deletions

website/docs/troubleshooting/token-limits.md (new file, 86 lines)

@@ -0,0 +1,86 @@
---
parent: Troubleshooting
nav_order: 25
---

# Token limits

Every LLM has limits on how many tokens it can process:

- The model's **context window** limits how many tokens of
  *input and output* it can process.
- Each model has a limit on how many **output tokens** it can
  produce.
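
To make these two limits concrete, here's a minimal sketch of the
arithmetic in Python, using illustrative constants that match the
gpt-3.5-turbo numbers in the example error below (this is not
aider's actual code):

```
# Illustrative limits for gpt-3.5-turbo; other models differ.
CONTEXT_WINDOW = 16385  # max input + output tokens combined
MAX_OUTPUT = 4096       # max output tokens per response

def exceeded_limits(input_tokens: int, output_tokens: int) -> list[str]:
    """Return which limit(s) a request hits, if any."""
    problems = []
    if output_tokens >= MAX_OUTPUT:
        # Reaching the cap means the response was truncated mid-reply.
        problems.append("output limit")
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        problems.append("context window")
    return problems

# Matches the example error below: only the output limit is hit.
print(exceeded_limits(input_tokens=768, output_tokens=4096))
```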

Aider will report an error if a model responds indicating that
it has exceeded a token limit.
The error will include suggested actions to try to
avoid hitting token limits.
Here's an example error:

```
Model gpt-3.5-turbo has hit a token limit!

Input tokens: 768 of 16385
Output tokens: 4096 of 4096 -- exceeded output limit!
Total tokens: 4864 of 16385

To reduce output tokens:
- Ask for smaller changes in each request.
- Break your code into smaller source files.
- Try using a stronger model like gpt-4o or opus that can return diffs.

For more info: https://aider.chat/docs/token-limits.html
```

## Input tokens & context window size

The most common problem is trying to send too much data to a model,
overflowing its context window.
Technically you can exhaust the context window if the input alone is
too large, or if the input plus output together are too large.

Strong models like GPT-4o and Opus have quite
large context windows, so this sort of error is
typically only an issue when working with weaker models.

The easiest solution is to reduce the input tokens
by removing files from the chat.
It's best to only add the files that aider will need to *edit*
to complete your request.

- Use `/tokens` to see token usage.
- Use `/drop` to remove unneeded files from the chat session.
- Use `/clear` to clear the chat history.
- Break your code into smaller source files.
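
The `/tokens` command reports usage from inside aider.
If you want a rough token count for a file outside of aider, here's
a minimal sketch using the tiktoken package (OpenAI's tokenizer;
other model families tokenize differently, so treat the result as
an estimate):

```
# pip install tiktoken
import tiktoken

def count_tokens(path: str, model: str = "gpt-3.5-turbo") -> int:
    """Estimate how many input tokens a file will consume in the chat."""
    enc = tiktoken.encoding_for_model(model)
    with open(path) as f:
        return len(enc.encode(f.read()))

print(count_tokens("main.py"))  # "main.py" is a hypothetical file
```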

## Output token limits

Most models have quite small output limits, often as low
as 4k tokens.
If you ask aider to make a large change that affects a lot
of code, the LLM may hit output token limits
as it tries to send back all the changes.
A 1,000-line source file can easily run to 10k or more tokens,
so even a modest whole-file rewrite may not fit in a 4k output limit.

To avoid hitting output token limits:

- Ask for smaller changes in each request.
- Break your code into smaller source files.
- Try using a stronger model like gpt-4o or opus that can return
  diffs; the sketch below shows why diffs help.
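
To see why diff-style edits matter, compare the output tokens needed
to return a whole file against those needed for a unified diff of a
small change. A minimal sketch, reusing tiktoken from above and a
hypothetical one-line edit:

```
import difflib

import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

# Hypothetical before/after contents of a source file.
old = open("main.py").read()
new = old.replace("DEBUG = True", "DEBUG = False")  # one-line change

# A unified diff carries only the changed hunk plus a little context.
diff = "\n".join(
    difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
)

print("whole file:", len(enc.encode(new)), "tokens")
print("diff only: ", len(enc.encode(diff)), "tokens")
```

A model that can only rewrite whole files must spend output tokens on
every line, while a diff spends them only on the lines that changed.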

## Other causes

Sometimes token limit errors are caused by
non-compliant API proxy servers,
or by bugs in the API server you are using to host a local model.
Aider has been well tested when connecting directly to the major
[LLM provider cloud APIs](https://aider.chat/docs/llms.html).
For serving local models,
[Ollama](https://aider.chat/docs/llms/ollama.html) is known to work well with aider.

Try using aider without an API proxy server,
or directly with one of the recommended cloud APIs,
and see if your token limit problems resolve.