This commit is contained in:
Paul Gauthier 2025-02-04 12:47:33 -08:00
parent 7b78f92feb
commit 6517cb15ef


@@ -45,18 +45,10 @@ setx OLLAMA_API_KEY <api-key> # Windows, restart shell after setx
 [Ollama uses a 2k context window by default](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size),
 which is very small for working with aider.
-Unlike most other LLM servers, Ollama does not throw an error if you submit
-a request that exceeds the context window.
-Instead, it just silently truncates the request by discarding the "oldest" messages
-in the chat to make it fit within the context window.
-So if your context window is too small, you won't get an explicit error.
-The biggest symptom will be that aider says it can't see (some of) the files
-you added to the chat.
-That's because ollama is silently discarding them because they exceed the context window.
-Aider sets Ollama's context window to 8k by default.
-Larger context windows will allow you to work with larger amounts of code,
+By default, aider sets Ollama's context window
+to be large enough for each request you send plus 8k tokens for the reply.
+Larger context windows may be helpful to allow larger replies from the LLM
 but will use memory and increase latency.
 If you would like
 a larger context window
@@ -67,6 +59,6 @@ like this:
 ```
 - name: ollama/qwen2.5-coder:32b-instruct-fp16
   extra_params:
-    num_ctx: 8192
+    num_ctx: 65536
 ```
</@@-placeholder>
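The hunk above changes a fragment of aider's model settings file. For context, a minimal sketch of a complete `.aider.model.settings.yml` (which aider loads from the home directory or the git repo root), assuming the same qwen2.5-coder model shown in the diff:

```yaml
# .aider.model.settings.yml -- a sketch, not the full file from the repo.
# num_ctx overrides Ollama's 2k default context window; larger values
# permit longer requests and replies, at the cost of memory and latency.
- name: ollama/qwen2.5-coder:32b-instruct-fp16
  extra_params:
    num_ctx: 65536
```

The file is a YAML list, so settings for additional models can be appended as further `- name:` entries.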