- 08 Jan, 2024 3 commits
-
-
Jeffrey Morgan authored
* select layers based on estimated model memory usage
* always account for scratch VRAM
* don't load +1 layers
* better estimation for graph alloc
* Update gpu/gpu_darwin.go
  Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
* Update llm/llm.go
  Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
* Update llm/llm.go
* add overhead for CUDA memory
* Update llm/llm.go
  Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
* fix build error on linux
* address comments

---------

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
Bruce MacDonald authored
-
Bruce MacDonald authored
-
- 07 Jan, 2024 6 commits
-
-
Daniel Hiltgen authored
Detect very old CUDA GPUs and fall back to CPU
-
Daniel Hiltgen authored
Accept windows paths for image processing
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
If we try to load the CUDA library on an old GPU, it panics and crashes the server. This checks the compute capability before we load the library so we can gracefully fall back to CPU mode.
-
Guilherme Baptista authored
-
- 06 Jan, 2024 4 commits
-
-
Daniel Hiltgen authored
This enhances our regex to support Windows-style paths. The regex will also match invalid path specifications, but we still validate file existence and filter out mismatches.
-
Daniel Hiltgen authored
Add windows native build instructions
-
Jeffrey Morgan authored
-
Michael Yang authored
Merge pull request #1797 from sublimator/nd-allow-extension-origins-still-needs-explicit-listing-2024-01-05
fix: allow extension origins (still needs explicit listing), fixes #1686
-
- 05 Jan, 2024 13 commits
-
-
Bruce MacDonald authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Bruce MacDonald authored
- additional information is now available in show response, use this to pull gguf before running
- make gguf updates cancellable
-
Patrick Devine authored
-
Patrick Devine authored
-
Jeffrey Morgan authored
* gpu: read memory info from all cuda devices
* add `LOOKUP_SIZE` constant
* better constant name
* address comments
-
Bruce MacDonald authored
-
Matt Williams authored
-
Michael Yang authored
update Dockerfile.build
-
Matt Williams authored
Signed-off-by: Matt Williams <m@technovangelist.com>
-
Nicholas Dudfield authored
-
Michael Yang authored
-
- 04 Jan, 2024 14 commits
-
-
Daniel Hiltgen authored
Clean up stale submodule
-
Daniel Hiltgen authored
If the tree has a stale submodule, make sure we clean it up first
-
Daniel Hiltgen authored
Revamp code layout for the llm directory and llama.cpp submodule
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Brian Murray authored
-
Daniel Hiltgen authored
Load dynamic cpu lib on windows
-
Daniel Hiltgen authored
On Linux, we link the CPU library into the Go app and fall back to it when no GPU match is found. On Windows, we do not link in the CPU library so that we can better control our dependencies for the CLI. This fixes the logic so we correctly fall back to the dynamic CPU library on Windows.
-
Bruce MacDonald authored
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
* update cmake flags for intel macOS
* remove `LLAMA_K_QUANTS`
* put back `CMAKE_OSX_DEPLOYMENT_TARGET` and disable `LLAMA_F16C`
-