Alternatively, you can change the amount of time all models are loaded into memory by setting the `OLLAMA_KEEP_ALIVE` environment variable when starting the Ollama server. The `OLLAMA_KEEP_ALIVE` variable accepts the same parameter types as the `keep_alive` parameter mentioned above. Refer to the section explaining [how to configure the Ollama server](#how-do-i-configure-ollama-server) to correctly set the environment variable.
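For example, on Linux or macOS you might launch the server with the variable set inline (a minimal sketch; the exact duration value shown here is illustrative):

```shell
# Keep models loaded for one hour after their last use.
# The value accepts the same forms as keep_alive: a duration string
# such as "10m" or "24h", a number of seconds, -1 to keep models
# loaded indefinitely, or 0 to unload them immediately after a request.
OLLAMA_KEEP_ALIVE="1h" ollama serve
```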
Ollama uses unicode characters for progress indication, which may render as unknown squares in some older terminal fonts in Windows 10. If you see this, try changing your terminal font settings.
Here's a quick example showing API access from `powershell`
```powershell
(Invoke-WebRequest -Method POST -Body '{"model":"llama3.1", "prompt":"Why is the sky blue?", "stream": false}' -Uri http://localhost:11434/api/generate).Content | ConvertFrom-Json
```