- 17 Apr, 2024 1 commit
ManniX-ITA authored
- 16 Apr, 2024 1 commit
Jeffrey Morgan authored
* parse wide argv characters on windows
* cleanup
* move cleanup to end of `main`
- 01 Apr, 2024 2 commits
Daniel Hiltgen authored
Daniel Hiltgen authored
This should resolve a number of memory-leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.
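A minimal Go sketch of the subprocess-isolation idea described in this commit; the runner binary name, function names, and idle timeout are assumptions for illustration, not the actual Ollama implementation.

```go
// Hypothetical supervisor: run the llama.cpp runner as a child process,
// restart it if it exits unexpectedly, and stop it after an idle timeout.
package main

import (
	"log"
	"os/exec"
	"time"
)

func supervise(runnerPath string, idle time.Duration, activity <-chan struct{}) {
	for {
		cmd := exec.Command(runnerPath) // assumed runner binary path
		if err := cmd.Start(); err != nil {
			log.Printf("failed to start runner: %v", err)
			return
		}
		exited := make(chan error, 1)
		go func() { exited <- cmd.Wait() }()

		timer := time.NewTimer(idle)
	wait:
		for {
			select {
			case <-activity: // a request arrived; reset the idle clock
				timer.Reset(idle)
			case <-timer.C: // idle long enough: shut the runner down
				cmd.Process.Kill()
				<-exited
				return
			case err := <-exited: // crashed or exited: restart it
				log.Printf("runner exited: %v; restarting", err)
				timer.Stop()
				break wait
			}
		}
	}
}

func main() {
	activity := make(chan struct{})
	// In a real server, every inference request would send on `activity`.
	supervise("./llama-runner", 30*time.Second, activity) // hypothetical binary
}
```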
- 26 Mar, 2024 1 commit
Jeffrey Morgan authored
- 23 Mar, 2024 1 commit
Daniel Hiltgen authored
The release just before the ggml-cuda.cu refactoring.
- 16 Mar, 2024 1 commit
Jeffrey Morgan authored
- 12 Mar, 2024 3 commits
Daniel Hiltgen authored
Daniel Hiltgen authored
racerole authored
Signed-off-by: racerole <jiangyifeng@outlook.com>
- 11 Mar, 2024 2 commits
Bruce MacDonald authored
Jeffrey Morgan authored
- 09 Mar, 2024 1 commit
Jeffrey Morgan authored
- 08 Mar, 2024 1 commit
Jeffrey Morgan authored
- 01 Mar, 2024 1 commit
Jeffrey Morgan authored
- 20 Feb, 2024 2 commits
Jeffrey Morgan authored
Taras Tsugrii authored
- 14 Feb, 2024 1 commit
Jeffrey Morgan authored
- 09 Feb, 2024 1 commit
Daniel Hiltgen authored
Make sure that when a shutdown signal comes, we shut down quickly instead of waiting for a potentially long exchange to wrap up.
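A hedged Go sketch of the quick-shutdown behaviour this commit describes, using signal.NotifyContext and a bounded http.Server.Shutdown; the listen address and grace period are assumptions, not values taken from the commit.

```go
// Illustrative only: stop serving promptly on SIGINT/SIGTERM, giving
// in-flight requests a short grace period rather than waiting for a
// potentially long exchange to finish.
package main

import (
	"context"
	"net/http"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: "127.0.0.1:11434"} // assumed listen address

	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop()

	go srv.ListenAndServe() // error handling omitted for brevity

	<-ctx.Done() // a shutdown signal arrived

	// Bound the shutdown so a long-running request cannot hold us up.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	srv.Shutdown(shutdownCtx)
}
```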
- 31 Jan, 2024 1 commit
Daniel Hiltgen authored
This requires an upstream change to support graceful termination, carried as a patch.
- 22 Jan, 2024 1 commit
Daniel Hiltgen authored
This wires up logging in llama.cpp to always go to stderr, and also turns up logging if OLLAMA_DEBUG is set.
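The commit wires this up inside llama.cpp itself; the Go sketch below only mirrors the same policy (stderr plus extra verbosity when OLLAMA_DEBUG is set), and the specific flag choices are assumptions.

```go
// Illustrative sketch of the logging policy described above: always log to
// stderr and become more verbose when OLLAMA_DEBUG is set.
package main

import (
	"log"
	"os"
)

func setupLogging() {
	log.SetOutput(os.Stderr) // keep stdout free for actual program output

	if os.Getenv("OLLAMA_DEBUG") != "" {
		// Assumed choice of extra detail for debug runs.
		log.SetFlags(log.LstdFlags | log.Lmicroseconds | log.Lshortfile)
		log.Println("debug logging enabled")
	}
}

func main() {
	setupLogging()
	log.Println("server starting")
}
```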
- 21 Jan, 2024 1 commit
Daniel Hiltgen authored
Detect potential error scenarios so we can fall back to CPU mode without hitting asserts.
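A hedged sketch of the fallback idea, with made-up type names and example values: check that the GPU looks usable before committing to it, otherwise choose CPU rather than asserting inside the native code.

```go
// Hypothetical pre-flight check: if the GPU cannot be queried or does not
// have enough free memory for the model, fall back to CPU instead of
// letting the native library hit an assert.
package main

import "log"

type gpuInfo struct {
	DriverOK   bool
	FreeMemory uint64 // bytes
}

func chooseBackend(gpu gpuInfo, required uint64) string {
	if !gpu.DriverOK || gpu.FreeMemory < required {
		log.Printf("GPU unusable or short on memory (%d < %d bytes); using CPU",
			gpu.FreeMemory, required)
		return "cpu"
	}
	return "gpu"
}

func main() {
	// Example values only.
	backend := chooseBackend(gpuInfo{DriverOK: true, FreeMemory: 4 << 30}, 6<<30)
	log.Println("selected backend:", backend)
}
```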
- 17 Jan, 2024 1 commit
Daniel Hiltgen authored
This also refines the build process for the ext_server build.
- 14 Jan, 2024 1 commit
Jeffrey Morgan authored
- 11 Jan, 2024 1 commit
Daniel Hiltgen authored
In some cases we may want multiple variants for a given GPU type or CPU. This adds an optional Variant that we can use to select an optimal library, and it also allows us to try multiple variants in case some fail to load. This can be useful for scenarios such as ROCm v5 vs. v6 incompatibility or differing CPU features.
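A minimal Go sketch of the variant-selection logic described here; the type names, field names, and example variants are assumptions rather than the actual code.

```go
// Hypothetical variant picker: walk the candidates in preference order and
// use the first one whose library actually loads, so a ROCm v6 build can
// fall back to ROCm v5 or a plain CPU build.
package main

import (
	"errors"
	"fmt"
	"log"
)

// libCandidate describes one build of the acceleration library (names assumed).
type libCandidate struct {
	Variant string       // e.g. "rocm_v6", "rocm_v5", "cpu_avx2"
	Load    func() error // attempts to load/initialise this variant
}

func pickVariant(candidates []libCandidate) (string, error) {
	for _, c := range candidates {
		if err := c.Load(); err != nil {
			log.Printf("variant %s failed to load: %v", c.Variant, err)
			continue // try the next, less preferred variant
		}
		return c.Variant, nil
	}
	return "", errors.New("no usable library variant found")
}

func main() {
	chosen, err := pickVariant([]libCandidate{
		{Variant: "rocm_v6", Load: func() error { return fmt.Errorf("libhipblas not found") }},
		{Variant: "cpu_avx2", Load: func() error { return nil }},
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("using variant:", chosen)
}
```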
- 10 Jan, 2024 1 commit
Jeffrey Morgan authored
* update submodule to `6efb8eb30e7025b168f3fda3ff83b9b386428ad6`
* unblock condition variable in `update_slots` when closing server
- 07 Jan, 2024 1 commit
Jeffrey Morgan authored
- 04 Jan, 2024 1 commit
Daniel Hiltgen authored