- 23 Apr, 2024 3 commits
-
-
Daniel Hiltgen authored
This change adds support for multiple concurrent requests, as well as loading multiple models by spawning multiple runners. The default settings are currently set at 1 concurrent request per model and only 1 loaded model at a time, but these can be adjusted by setting OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
-
Michael Yang authored
-
Daniel Hiltgen authored
-
- 21 Apr, 2024 2 commits
- 18 Apr, 2024 3 commits
- 17 Apr, 2024 6 commits
-
-
Michael Yang authored
-
Jeremy authored
-
Jeremy authored
-
Jeremy authored
-
Michael Yang authored
-
Jeremy authored
-
- 16 Apr, 2024 6 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Jeffrey Morgan authored
* parse wide argv characters on windows
* cleanup
* move cleanup to end of `main`
-
Michael Yang authored
-
Jeffrey Morgan authored
-
Michael Yang authored
TODO: update padding() to _only_ return the padding
-
- 15 Apr, 2024 2 commits
-
-
Patrick Devine authored
-
Jeffrey Morgan authored
* terminate subprocess if receiving `SIGINT` or `SIGTERM` signals while model is loading
* use `unload` in signal handler
-
- 13 Apr, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 11 Apr, 2024 1 commit
-
-
Michael Yang authored
-
- 10 Apr, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 09 Apr, 2024 4 commits
-
-
Daniel Hiltgen authored
During testing, we're seeing some models take over 3 minutes.
-
Blake Mizerany authored
-
Blake Mizerany authored
This commit introduces a friendlier way to build Ollama dependencies and the binary, one that does not abuse `go generate` and removes the unnecessary extra steps it brings with it. This script also gives the user clearer feedback about what is happening during the build process. At the end, it prints a helpful message about what to do next (e.g. run the new local Ollama).
-
Jeffrey Morgan authored
-
- 08 Apr, 2024 1 commit
-
-
Michael Yang authored
-
- 07 Apr, 2024 1 commit
-
-
Jeffrey Morgan authored
update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528)
-
- 06 Apr, 2024 1 commit
-
-
Michael Yang authored
-
- 04 Apr, 2024 3 commits
-
-
Michael Yang authored
-
Daniel Hiltgen authored
-
mofanke authored
-
- 03 Apr, 2024 3 commits
-
-
Daniel Hiltgen authored
The subprocess change moved the build directory. arm64 builds weren't setting cross-compilation flags when building on x86.
-
Michael Yang authored
-
Jeffrey Morgan authored
-
- 02 Apr, 2024 1 commit
-
-
Daniel Hiltgen authored
-