- 04 May, 2024 1 commit
  - Michael Yang authored
- 01 May, 2024 5 commits
  - Mark Ward authored
  - Mark Ward authored
  - Mark Ward authored
    Log when waiting for the process to stop, to help debug cases where other tasks execute during this wait. On expiry, the timer clears its own reference because it will not be reused. Close also cleans up expireTimer if the calling code has not already done so.
  - Mark Ward authored
  - Jeffrey Morgan authored
- 30 Apr, 2024 4 commits
  - jmorganca authored
  - jmorganca authored
  - Jeffrey Morgan authored
  - Daniel Hiltgen authored
    * Bump llama.cpp to b2761
    * Adjust types for bump
- 29 Apr, 2024 1 commit
  - Jeffrey Morgan authored
- 27 Apr, 2024 3 commits
  - Hernan Martinez authored
  - Hernan Martinez authored
  - Hernan Martinez authored
- 26 Apr, 2024 9 commits
  - Daniel Hiltgen authored
    This will speed up CI, which already tries to build only static binaries for unit tests.
  - Daniel Hiltgen authored
  - Michael Yang authored
  - Jeffrey Morgan authored
  - Daniel Hiltgen authored
  - Daniel Hiltgen authored
  - Daniel Hiltgen authored
  - Daniel Hiltgen authored
  - Daniel Hiltgen authored
    This will make it simpler for CI to accumulate artifacts from prior steps.
- 25 Apr, 2024 4 commits
  - Jeffrey Morgan authored
    * llm: limit generation to 10x context size to avoid run-on generations
    * add comment
    * simplify condition statement
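The guard described in this commit can be sketched as a simple predicate. This is an illustrative version under the commit's stated rule (stop once generated tokens reach 10x the context size); the function and variable names are assumptions, not Ollama's actual identifiers.

```go
package main

import "fmt"

// exceedsGenerationLimit reports whether generation should stop:
// per the commit above, output is capped at 10x the context size
// to avoid run-on generations.
func exceedsGenerationLimit(numGenerated, numCtx int) bool {
	return numGenerated >= 10*numCtx
}

func main() {
	numCtx := 2048
	for n := 0; ; n++ { // stand-in for the token generation loop
		if exceedsGenerationLimit(n, numCtx) {
			fmt.Println("stopping after", n, "tokens") // 10 * 2048 = 20480
			break
		}
	}
}
```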
  - Michael Yang authored
  - jmorganca authored
  - Roy Yang authored
- 24 Apr, 2024 2 commits
  - Patrick Devine authored
  - Daniel Hiltgen authored
    If we get our predictions wrong, this can be used to set a lower memory limit as a workaround. The recent multi-GPU refactoring accidentally removed it, so this adds it back.
- 23 Apr, 2024 6 commits
  - Daniel Hiltgen authored
    Now that the llm runner is an executable and not just a DLL, more users on Windows are facing problems with security-policy configurations that prevent writing to a directory and then executing binaries from the same location. This change removes payloads from the main executable on Windows and instead packages them in the installer, discovering them relative to the executable's location. It also adds a new zip file for people who want to "roll their own" installation model.
  - Daniel Hiltgen authored
    Temp cleaners can nuke the file out from underneath us. This detects the missing runner and re-initializes the payloads.
  - Daniel Hiltgen authored
    This change adds support for multiple concurrent requests, as well as loading multiple models, by spawning multiple runners. The defaults are currently 1 concurrent request per model and only 1 loaded model at a time, but these can be adjusted by setting OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
  - Daniel Hiltgen authored
  - Michael Yang authored
  - Daniel Hiltgen authored
- 21 Apr, 2024 2 commits
- 18 Apr, 2024 3 commits