- 01 Feb, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Michael Yang authored
-
- 09 Jan, 2024 1 commit
-
-
Michael Yang authored
-
- 08 Jan, 2024 1 commit
-
-
Jeffrey Morgan authored
* select layers based on estimated model memory usage
* always account for scratch vram
* don't load +1 layers
* better estimation for graph alloc
* Update gpu/gpu_darwin.go
* Update llm/llm.go
* Update llm/llm.go
* add overhead for cuda memory
* Update llm/llm.go
* fix build error on linux
* address comments
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
- 03 Jan, 2024 1 commit
-
-
Bruce MacDonald authored
-
- 02 Jan, 2024 1 commit
-
-
Daniel Hiltgen authored
Refactor where we store build outputs, and support a fully dynamic loading model on Windows, so the base executable has no special dependencies and thus doesn't require a special PATH.
-
- 20 Dec, 2023 1 commit
-
-
Daniel Hiltgen authored
This switches the default llama.cpp to be CPU based, and builds the GPU variants as dynamically loaded libraries which we can select at runtime. This also bumps the ROCm library to version 6 given 5.7 builds don't work on the latest ROCm library that just shipped.
-
- 19 Dec, 2023 4 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Run the server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.
-
Bruce MacDonald authored
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in case the automatic pull fails
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
-
- 14 Dec, 2023 1 commit
-
-
Bruce MacDonald authored
* restore model load duration on generate response
  - set model load duration on generate and chat done response
  - calculate createAt time when response created
* remove checkpoints predict opts
* Update routes.go
-
- 12 Dec, 2023 2 commits
-
-
Bruce MacDonald authored
-
Bruce MacDonald authored
- remove parallel
-
- 11 Dec, 2023 1 commit
-
-
Patrick Devine authored
Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
-
- 10 Dec, 2023 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 09 Dec, 2023 1 commit
-
-
Bruce MacDonald authored
* fix: queued request failures
  - increase parallel requests to 2 to complete queued requests; queueing is managed in ollama
* log stream errors
-
- 05 Dec, 2023 3 commits
-
-
Michael Yang authored
-
Bruce MacDonald authored
-
Jeffrey Morgan authored
This reverts commit 7a0899d6.
-
- 04 Dec, 2023 1 commit
-
-
Bruce MacDonald authored
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
-
- 24 Nov, 2023 1 commit
-
-
Jing Zhang authored
* Support CUDA build on Windows
* Enable dynamic NumGPU allocation for Windows
-
- 21 Nov, 2023 1 commit
-
-
Jeffrey Morgan authored
-
- 20 Nov, 2023 1 commit
-
-
Purinda Gunasekara authored
-
- 19 Nov, 2023 2 commits
-
-
Jeffrey Morgan authored
-
Bruce MacDonald authored
-
- 17 Nov, 2023 1 commit
-
-
Jeffrey Morgan authored
-
- 10 Nov, 2023 1 commit
-
-
Jeffrey Morgan authored
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
- 09 Nov, 2023 1 commit
-
-
Bruce MacDonald authored
-
- 04 Nov, 2023 1 commit
-
-
Jeffrey Morgan authored
-
- 03 Nov, 2023 1 commit
-
-
Jeffrey Morgan authored
-
- 02 Nov, 2023 1 commit
-
-
Jeffrey Morgan authored
-
- 31 Oct, 2023 1 commit
-
-
Michael Yang authored
-
- 27 Oct, 2023 2 commits
-
-
Bruce MacDonald authored
-
Bruce MacDonald authored
-
- 19 Oct, 2023 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
add error for falcon and starcoder vocab compatibility
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
- 18 Oct, 2023 2 commits
-
-
Arne Müller authored
-
Arne Müller authored
-