## How can I specify the context window size?
By default, Ollama uses a context window size of 4096 tokens.
This can be overridden with the `OLLAMA_CONTEXT_LENGTH` environment variable. For example, to set the default context window to 8K, use:
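A minimal sketch of this, assuming the `ollama` CLI is installed and on your `PATH`:

```shell
# Start the server with an 8K default context window
OLLAMA_CONTEXT_LENGTH=8192 ollama serve
```

The variable only applies to models started after the server picks it up, so restart `ollama serve` after changing it.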
| Parameter | Description | Value Type | Example Usage |
| --------- | ----------- | ---------- | ------------- |
| mirostat_eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | float | mirostat_eta 0.1 |
| mirostat_tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | float | mirostat_tau 5.0 |
| num_ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | int | num_ctx 4096 |
| repeat_last_n | Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | int | repeat_last_n 64 |
| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | float | repeat_penalty 1.1 |
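The parameters above are set with `PARAMETER` directives in a Modelfile. A minimal sketch (the base model name `llama3.2` is an illustrative assumption):

```
FROM llama3.2
PARAMETER num_ctx 4096
PARAMETER repeat_last_n 64
PARAMETER repeat_penalty 1.1
```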
"OLLAMA_ORIGINS":{"OLLAMA_ORIGINS",AllowedOrigins(),"A comma separated list of allowed origins"},
"OLLAMA_SCHED_SPREAD":{"OLLAMA_SCHED_SPREAD",SchedSpread(),"Always schedule model across all GPUs"},
"OLLAMA_MULTIUSER_CACHE":{"OLLAMA_MULTIUSER_CACHE",MultiUserCache(),"Optimize prompt caching for multi-user scenarios"},
"OLLAMA_CONTEXT_LENGTH":{"OLLAMA_CONTEXT_LENGTH",ContextLength(),"Context length to use unless otherwise specified (default 4096 or 2048 with low VRAM)"},
"OLLAMA_CONTEXT_LENGTH":{"OLLAMA_CONTEXT_LENGTH",ContextLength(),"Context length to use unless otherwise specified (default: 4096)"},
"OLLAMA_NEW_ENGINE":{"OLLAMA_NEW_ENGINE",NewEngine(),"Enable the new Ollama engine"},