- 21 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 20 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 07 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 05 Jul, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
* Fix assert on small embedding inputs
* Update llm/patches/09-pooling.diff
-
- 03 Jul, 2024 1 commit
-
-
Daniel Hiltgen authored
On Windows, if the model directory contained Unicode characters, clip models would fail to load. This fixes the file name handling in clip.cpp to support UTF-16 on Windows.
-
- 27 Jun, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 17 Jun, 2024 1 commit
-
-
Jeffrey Morgan authored
* llm: update llama.cpp submodule to `7c26775`
* disable `LLAMA_BLAS` for now
* `-DLLAMA_OPENMP=off`
-
- 07 Jun, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 30 May, 2024 1 commit
-
-
Jeffrey Morgan authored
* update llama.cpp submodule to `5921b8f089d3b7bda86aac5a66825df6a6c10603`
* add patch
-
- 23 May, 2024 2 commits
-
-
Michael Yang authored
-
Daniel Hiltgen authored
This doesn't expose a UX yet, but it wires up the initial server portion of progress reporting during load.
-
- 16 May, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 06 May, 2024 1 commit
-
-
Jeffrey Morgan authored
* fix llava models not working after first request
* individual requests only for llava models
-
- 26 Apr, 2024 1 commit
-
-
Daniel Hiltgen authored
-
- 25 Apr, 2024 1 commit
-
-
jmorganca authored
-
- 02 Apr, 2024 1 commit
-
-
Daniel Hiltgen authored
-
- 23 Mar, 2024 1 commit
-
-
Daniel Hiltgen authored
The release just before the ggml-cuda.cu refactoring.
-
- 14 Mar, 2024 1 commit
-
-
Michael Yang authored
-
- 13 Mar, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 11 Mar, 2024 2 commits
-
-
Bruce MacDonald authored
-
Jeffrey Morgan authored
-
- 10 Mar, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 09 Mar, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 08 Mar, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 01 Mar, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 20 Feb, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 19 Feb, 2024 1 commit
-
-
Daniel Hiltgen authored
This should resolve the problem where we don't fully unload from the GPU when we go idle.
-
- 12 Feb, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 06 Feb, 2024 1 commit
-
-
Daniel Hiltgen authored
-
- 31 Jan, 2024 1 commit
-
-
Daniel Hiltgen authored
This requires an upstream change to support graceful termination, carried as a patch.
-
- 25 Jan, 2024 1 commit
-
-
Jeffrey Morgan authored
* Fix clearing KV cache between requests with the same prompt
* Fix PowerShell script
-