- 09 Apr, 2024 2 commits
Blake Mizerany authored
Blake Mizerany authored
This commit introduces a friendlier way to build the Ollama dependencies and the binary without abusing `go generate`, removing the unnecessary extra steps that approach brings with it. The script also gives the user clearer feedback about what is happening during the build and, at the end, prints a helpful message about what to do next (e.g. run the newly built local Ollama).
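As a rough sketch of what such a build helper can do (the actual script is not shown here; the commands, paths, and messages below are illustrative assumptions):

```go
// buildhelper.go - a loose sketch of a friendlier build flow: run each step,
// report progress as it goes, and finish with a "what to do next" hint.
// The specific commands and paths are made up for illustration.
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// run executes one build step and streams its output to the terminal.
func run(name string, args ...string) {
	fmt.Printf("==> %s %v\n", name, args)
	cmd := exec.Command(name, args...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintf(os.Stderr, "build step failed: %v\n", err)
		os.Exit(1)
	}
}

func main() {
	run("cmake", "-S", "llm", "-B", "llm/build") // configure native deps (hypothetical path)
	run("cmake", "--build", "llm/build")         // compile them
	run("go", "build", "-o", "ollama", ".")      // build the Go binary
	fmt.Println("Done. Next: run ./ollama serve to try your local build.")
}
```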
- 04 Jan, 2024 1 commit
Daniel Hiltgen authored
- 19 Dec, 2023 2 commits
Daniel Hiltgen authored
Run server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.
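For readers unfamiliar with the mechanism, here is a self-contained sketch of the cgo pattern this moves to: C/C++ code compiled into the Go binary and called in-process rather than through a subprocess. The `run_server` function is a stand-in defined inline, not the real entry point from server.cpp.

```go
package main

/*
#include <stdio.h>
#include <stdlib.h>

// Stand-in for an entry point that would come from the embedded server.cpp.
static int run_server(const char *model) {
    printf("loading %s\n", model);
    return 0;
}
*/
import "C"

import (
	"fmt"
	"unsafe"
)

func main() {
	model := C.CString("/path/to/model.gguf")
	defer C.free(unsafe.Pointer(model))

	// The C code runs inside the Go process; no child process is spawned.
	if rc := C.run_server(model); rc != 0 {
		fmt.Println("server failed to start")
	}
}
```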
Bruce MacDonald authored
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in case the automatic pull fails

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
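As a rough illustration of the detection step: a GGUF file can be recognised by its leading `GGUF` magic bytes, and anything else can be treated as an older ggml-era model that needs re-pulling. The function and file names below are illustrative, not Ollama's.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// isGGUF reports whether the file at path starts with the GGUF magic bytes.
func isGGUF(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	magic := make([]byte, 4)
	if _, err := io.ReadFull(f, magic); err != nil {
		return false, err
	}
	return bytes.Equal(magic, []byte("GGUF")), nil
}

func main() {
	ok, err := isGGUF("model.bin")
	if err != nil {
		fmt.Println("could not inspect model:", err)
		return
	}
	if !ok {
		// Older ggml-format model: pull the gguf version automatically, and
		// ask the user to update manually if that pull fails.
		fmt.Println("ggml model detected; pulling the gguf version")
	}
}
```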
- 24 Nov, 2023 1 commit
Jing Zhang authored
* Support CUDA build on Windows
* Enable dynamic NumGPU allocation for Windows
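One way to read "dynamic NumGPU allocation" is choosing how many layers to offload based on the VRAM actually free at load time rather than a fixed number. A hedged sketch of that idea, with made-up sizes and a stand-in for the driver query:

```go
// A sketch only: the headroom, per-layer size, and "reported free VRAM"
// values below are assumptions, not Ollama's actual numbers.
package main

import "fmt"

// pickNumGPU estimates how many of layerCount layers fit in freeBytes of
// VRAM, given an approximate per-layer size, while keeping some headroom.
func pickNumGPU(freeBytes, layerBytes uint64, layerCount int) int {
	const headroom = 512 * 1024 * 1024 // leave room for the KV cache etc.
	if freeBytes <= headroom || layerBytes == 0 {
		return 0 // not enough VRAM: run fully on the CPU
	}
	n := int((freeBytes - headroom) / layerBytes)
	if n > layerCount {
		n = layerCount
	}
	return n
}

func main() {
	free := uint64(8) << 30       // pretend the driver reported 8 GiB free
	perLayer := uint64(150) << 20 // assume ~150 MiB per layer
	fmt.Println("offloading", pickNumGPU(free, perLayer, 32), "layers")
}
```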
- 27 Oct, 2023 1 commit
Jeffrey Morgan authored
- 23 Oct, 2023 1 commit
Michael Yang authored
- 06 Oct, 2023 1 commit
Bruce MacDonald authored
- this makes it easier to see that the subprocess is associated with ollama
- 21 Sep, 2023 1 commit
Michael Yang authored
- 20 Sep, 2023 2 commits
Michael Yang authored
Michael Yang authored
- 12 Sep, 2023 1 commit
Bruce MacDonald authored
* Linux GPU support
* handle multiple GPUs
* add CUDA Docker image (#488)

Co-authored-by: Michael Yang <mxyng@pm.me>
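A rough illustration of "handle multiple gpus": query each device's free memory and sum it before deciding how much to offload. Shelling out to `nvidia-smi` here is an assumption made for the sketch, not necessarily how Ollama probes the hardware.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// totalFreeVRAM sums the free memory reported for every visible NVIDIA GPU.
func totalFreeVRAM() (uint64, error) {
	out, err := exec.Command("nvidia-smi",
		"--query-gpu=memory.free", "--format=csv,noheader,nounits").Output()
	if err != nil {
		return 0, err // no NVIDIA GPU or driver: fall back to CPU
	}

	var total uint64
	for _, line := range strings.Split(strings.TrimSpace(string(out)), "\n") {
		mib, err := strconv.ParseUint(strings.TrimSpace(line), 10, 64)
		if err != nil {
			continue
		}
		total += mib << 20 // nvidia-smi reports MiB
	}
	return total, nil
}

func main() {
	if free, err := totalFreeVRAM(); err == nil {
		fmt.Printf("total free VRAM across GPUs: %d MiB\n", free>>20)
	} else {
		fmt.Println("no usable GPU detected:", err)
	}
}
```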
- 07 Sep, 2023 1 commit
Bruce MacDonald authored
- 05 Sep, 2023 2 commits
Bruce MacDonald authored
Jeffrey Morgan authored
- 30 Aug, 2023 1 commit
Bruce MacDonald authored
* remove C code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
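Of these, "use request context for llama_cpp" is the most instructive: generation is tied to the incoming request's context so it stops when the client disconnects. A minimal sketch of that pattern, with a stand-in predict loop instead of the real llama.cpp bindings:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"time"
)

// predict pretends to stream tokens until it finishes or the context ends.
func predict(ctx context.Context, prompt string, out chan<- string) {
	defer close(out)
	for i := 0; i < 20; i++ {
		select {
		case <-ctx.Done():
			return // client went away: stop doing work immediately
		case <-time.After(100 * time.Millisecond):
			out <- fmt.Sprintf("token %d for %q ", i, prompt)
		}
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	tokens := make(chan string)
	go predict(r.Context(), "hello", tokens) // canceled with the request
	for t := range tokens {
		fmt.Fprint(w, t)
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
	}
}

func main() {
	http.HandleFunc("/generate", handler)
	log.Fatal(http.ListenAndServe(":11434", nil))
}
```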