"vscode:/vscode.git/clone" did not exist on "fd4792ec56965a9c8564c3d88212c29a0378583d"
- 17 Apr, 2024 2 commits

ManniX-ITA authored

ManniX-ITA authored

- 16 Apr, 2024 5 commits

Michael Yang authored

Jeffrey Morgan authored
* parse wide argv characters on windows
* cleanup
* move cleanup to end of `main`

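The commit title above does not include the code it changed, and the original change may well be in a different language. Purely as an illustration of the general technique, here is a minimal Go sketch of parsing wide (UTF-16) argv on Windows: re-read the raw command line with `GetCommandLineW` and split it with `CommandLineToArgvW` via the standard `syscall` package. The `wideArgs` helper is hypothetical, not Ollama's actual code.

```go
//go:build windows

package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// wideArgs re-reads the process command line as UTF-16 and converts each
// argument to a Go string, so non-ASCII paths and flags survive intact.
func wideArgs() ([]string, error) {
	cmdline := syscall.GetCommandLine() // *uint16, the raw UTF-16 command line
	var argc int32
	argv, err := syscall.CommandLineToArgv(cmdline, &argc)
	if err != nil {
		return nil, err
	}
	// CommandLineToArgvW allocates one block that the caller must free.
	defer syscall.LocalFree(syscall.Handle(uintptr(unsafe.Pointer(argv))))

	args := make([]string, 0, argc)
	for _, arg := range (*argv)[:argc] {
		args = append(args, syscall.UTF16ToString((*arg)[:]))
	}
	return args, nil
}

func main() {
	args, err := wideArgs()
	if err != nil {
		panic(err)
	}
	fmt.Println(args)
}
```
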
Michael Yang authored

Jeffrey Morgan authored

Michael Yang authored
TODO: update padding() to _only_ return the padding

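The TODO above gives no surrounding code, so the following is only a guess at what is meant, assuming `padding()` is the usual alignment helper that currently returns the already-padded offset. A hypothetical version that returns only the padding itself could look like:

```go
package llm

// paddingOnly is a hypothetical replacement: it returns just the number of
// bytes needed to round offset up to the next multiple of align, rather than
// the padded offset itself.
func paddingOnly(offset, align int64) int64 {
	return (align - offset%align) % align
}
```
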
- 15 Apr, 2024 2 commits

Patrick Devine authored

Jeffrey Morgan authored
* terminate subprocess if receiving `SIGINT` or `SIGTERM` signals while model is loading
* use `unload` in signal handler

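The sketch below illustrates the general pattern only, not Ollama's actual handler: cancel a context on `SIGINT`/`SIGTERM` and let `exec.CommandContext` kill the runner subprocess even while a model is still loading. The `./runner` binary and its flags are invented, and the `unload` call mentioned in the commit is not shown.

```go
package main

import (
	"context"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

func main() {
	// Cancel this context when SIGINT or SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Hypothetical runner subprocess standing in for the model loader;
	// CommandContext kills it once the context is cancelled, even mid-load.
	cmd := exec.CommandContext(ctx, "./runner", "--model", "model.gguf")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	if err := cmd.Run(); err != nil {
		// Covers both real failures and the "killed by signal" case.
		os.Exit(1)
	}
}
```
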
- 13 Apr, 2024 1 commit

Jeffrey Morgan authored

- 11 Apr, 2024 1 commit

Michael Yang authored

- 10 Apr, 2024 2 commits

Michael Yang authored

Michael Yang authored

- 09 Apr, 2024 4 commits

Daniel Hiltgen authored
During testing, we're seeing some models take over 3 minutes.

Blake Mizerany authored

Blake Mizerany authored
This commit introduces a friendlier way to build Ollama's dependencies and binary without abusing `go generate`, removing the unnecessary extra steps that approach brings with it. The script also gives the user clearer feedback about what is happening during the build and, at the end, prints a helpful message about what to do next (e.g. run the newly built local Ollama).

Jeffrey Morgan authored

- 08 Apr, 2024 1 commit

Michael Yang authored

- 07 Apr, 2024 1 commit

Jeffrey Morgan authored
update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528)

- 06 Apr, 2024 1 commit

Michael Yang authored

- 04 Apr, 2024 3 commits

Michael Yang authored

Daniel Hiltgen authored

mofanke authored

- 03 Apr, 2024 3 commits

Daniel Hiltgen authored
The subprocess change moved the build directory; arm64 builds weren't setting cross-compilation flags when building on x86.

Michael Yang authored

Jeffrey Morgan authored

- 02 Apr, 2024 4 commits

Daniel Hiltgen authored

Michael Yang authored

Michael Yang authored

Daniel Hiltgen authored

- 01 Apr, 2024 4 commits

Daniel Hiltgen authored

Daniel Hiltgen authored
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.

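To make the idea above concrete, here is a rough Go sketch of the shape such a supervisor could take: start the llama.cpp server as a child process on demand, stop it after an idle window, and forget a crashed child so the next request restarts it. The `runner` type, its method names, and the timings are illustrative assumptions, not Ollama's implementation.

```go
package llm

import (
	"os/exec"
	"sync"
	"time"
)

// runner supervises a llama.cpp server child process: started on demand,
// stopped after sitting idle, and forgotten (so it can be restarted) if it
// exits unexpectedly. All names and timings here are illustrative.
type runner struct {
	mu       sync.Mutex
	cmd      *exec.Cmd
	lastUsed time.Time
}

// ensureStarted launches the child process if it is not already running.
func (r *runner) ensureStarted(bin string, args ...string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lastUsed = time.Now()
	if r.cmd != nil {
		return nil
	}
	cmd := exec.Command(bin, args...)
	if err := cmd.Start(); err != nil {
		return err
	}
	r.cmd = cmd
	go func() {
		cmd.Wait() // when the child exits or crashes, forget it so the next call restarts it
		r.mu.Lock()
		if r.cmd == cmd {
			r.cmd = nil
		}
		r.mu.Unlock()
	}()
	return nil
}

// reapIdle periodically stops the child once nothing has used it for `idle`.
func (r *runner) reapIdle(idle time.Duration) {
	for range time.Tick(time.Minute) {
		r.mu.Lock()
		if r.cmd != nil && time.Since(r.lastUsed) > idle {
			r.cmd.Process.Kill()
			r.cmd = nil
		}
		r.mu.Unlock()
	}
}
```
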
Michael Yang authored
count each layer independently when deciding gpu offloading

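A simplified sketch of the idea in the commit message above, with all names invented: walk the per-layer memory requirements one layer at a time against the free GPU memory, instead of assuming every layer costs the same.

```go
package llm

// layersToOffload counts how many layers fit in the available GPU memory by
// charging each layer's size individually against the remaining budget.
// The parameter names and the overhead term are illustrative only.
func layersToOffload(layerSizes []uint64, freeVRAM, overhead uint64) int {
	if freeVRAM <= overhead {
		return 0
	}
	budget := freeVRAM - overhead
	count := 0
	for _, size := range layerSizes {
		if size > budget {
			break
		}
		budget -= size
		count++
	}
	return count
}
```
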
Michael Yang authored

- 29 Mar, 2024 1 commit

Patrick Devine authored
Co-authored-by: Michael Yang <mxyng@pm.me>

- 26 Mar, 2024 3 commits

Jeffrey Morgan authored

Jeffrey Morgan authored

Patrick Devine authored

- 25 Mar, 2024 2 commits

Daniel Hiltgen authored

Jeremy authored