- 19 Aug, 2024 1 commit
Daniel Hiltgen authored
This adds new arm64 variants specific to Jetson platforms.
- 03 Jul, 2024 1 commit
Daniel Hiltgen authored
When ollama has been running for a long time, tmp cleaners can remove the runners. This tightens up a few corner cases on arm Macs where we failed with "server cpu not listed in available servers map[]".
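A minimal sketch of the kind of guard this implies: re-check that the extracted runner payloads still exist before use, and re-extract them if a tmp cleaner has swept them away. The names runnersDir, ensureRunners, and extractRunners are hypothetical, not the actual ollama internals.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureRunners re-extracts the runner payloads if a tmp cleaner has
// removed them since startup (hypothetical helper).
func ensureRunners(runnersDir string) error {
	if _, err := os.Stat(filepath.Join(runnersDir, "cpu")); os.IsNotExist(err) {
		// Payloads are gone; extract them again instead of failing
		// later with an empty available-servers map.
		return extractRunners(runnersDir)
	}
	return nil
}

// extractRunners stands in for whatever unpacks the bundled runners.
func extractRunners(dir string) error {
	if err := os.MkdirAll(filepath.Join(dir, "cpu"), 0o755); err != nil {
		return err
	}
	fmt.Println("re-extracted runners to", dir)
	return nil
}

func main() {
	dir := filepath.Join(os.TempDir(), "ollama-runners")
	if err := ensureRunners(dir); err != nil {
		fmt.Fprintln(os.Stderr, "runner check failed:", err)
		os.Exit(1)
	}
}
```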
- 17 Jun, 2024 1 commit
Daniel Hiltgen authored
We update the PATH on Windows to get the CLI mapped, but this has an unintended side effect: other apps that use our bundled DLLs can be terminated when we upgrade.
- 14 Jun, 2024 1 commit
Daniel Hiltgen authored
- 04 Jun, 2024 1 commit
Michael Yang authored
- 23 Apr, 2024 2 commits
Daniel Hiltgen authored
Now that the llm runner is an executable and not just a DLL, more users are facing problems with security policy configurations on Windows that prevent writing to a directory and then executing binaries from that same location. This change removes the payloads from the main executable on Windows and instead packages them in the installer, where they are discovered based on the executable's location. It also adds a new zip file for people who want to "roll their own" installation model.
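A minimal sketch of location-based discovery, assuming the installer places the payloads in a "runners" directory next to the main executable; that directory name is illustrative, not the actual installer layout.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	exe, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot locate executable:", err)
		os.Exit(1)
	}
	// Resolve symlinks so the lookup also works for linked installs.
	exe, err = filepath.EvalSymlinks(exe)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot resolve executable path:", err)
		os.Exit(1)
	}
	// Look for runner payloads beside the executable rather than
	// inside it.
	runners := filepath.Join(filepath.Dir(exe), "runners")
	entries, err := os.ReadDir(runners)
	if err != nil {
		fmt.Fprintln(os.Stderr, "no runner payloads found at", runners)
		os.Exit(1)
	}
	for _, e := range entries {
		fmt.Println("available runner:", e.Name())
	}
}
```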
Daniel Hiltgen authored
This change adds support for multiple concurrent requests, as well as loading multiple models, by spawning multiple runners. The defaults are currently set at 1 concurrent request per model and only 1 loaded model at a time, but these can be adjusted by setting OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
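A minimal sketch of how these two knobs could be read, using the stated defaults of 1 parallel request per model and 1 loaded model; the envInt helper is hypothetical, not the actual scheduler code.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envInt returns the integer value of an environment variable, or a
// default when it is unset or malformed.
func envInt(name string, def int) int {
	if v := os.Getenv(name); v != "" {
		if n, err := strconv.Atoi(v); err == nil && n > 0 {
			return n
		}
	}
	return def
}

func main() {
	numParallel := envInt("OLLAMA_NUM_PARALLEL", 1)
	maxLoaded := envInt("OLLAMA_MAX_LOADED_MODELS", 1)
	fmt.Printf("parallel requests per model: %d, max loaded models: %d\n",
		numParallel, maxLoaded)
}
```

For example, starting the server with `OLLAMA_NUM_PARALLEL=4 OLLAMA_MAX_LOADED_MODELS=2` would allow four concurrent requests per model with two models resident at once.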
- 21 Apr, 2024 1 commit
Cheng authored
- 01 Apr, 2024 1 commit
Daniel Hiltgen authored
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.
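A minimal sketch of the subprocess model this describes: the runner lives in its own process, a crash no longer takes the main server down, and a supervisor can restart it; when idle, the supervisor can simply kill the process to reclaim its memory. The "ollama-runner" binary name and its flag are placeholders, not the actual runner interface.

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// superviseRunner starts the runner subprocess and restarts it up to
// `restarts` times if it exits unexpectedly.
func superviseRunner(path string, restarts int) {
	for i := 0; i <= restarts; i++ {
		cmd := exec.Command(path, "--port", "50051")
		if err := cmd.Start(); err != nil {
			fmt.Println("failed to start runner:", err)
			return
		}
		// Wait blocks until the subprocess exits; a llama.cpp crash
		// here is contained to the child process.
		err := cmd.Wait()
		fmt.Println("runner exited:", err)
		time.Sleep(time.Second) // brief backoff before restarting
	}
}

func main() {
	superviseRunner("./ollama-runner", 3)
}
```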