- 24 Apr, 2024 12 commits
-
Michael Yang authored
-
Michael Yang authored
-
Blake Mizerany authored
-
Daniel Hiltgen authored
AMD gfx patch rev is hex
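The fix above notes that the patch revision in an AMD gfx target is hexadecimal, not decimal. A minimal Go sketch of parsing under that assumption (the function name is illustrative, not the actual ollama API):

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePatch parses the patch component of an AMD gfx target (e.g. the
// trailing "a" in "gfx90a") as hexadecimal rather than decimal.
func parsePatch(s string) (uint64, error) {
	return strconv.ParseUint(s, 16, 32)
}

func main() {
	p, err := parsePatch("a")
	if err != nil {
		panic(err)
	}
	fmt.Println(p) // "a" parses as 10 in hex; as decimal it would fail
}
```

Treating the field as decimal would reject targets like `gfx90a` outright, which is why the discovery fix matters.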
-
Daniel Hiltgen authored
Report errors on server lookup instead of path lookup failure
-
Daniel Hiltgen authored
Correctly handle gfx90a discovery
-
Patrick Devine authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Blake Mizerany authored
This allows users of a valid Digest to know it has a minimum of 2 characters in the hash part for use when sharding. This is a reasonable restriction: the hash part is a SHA-256 hash, which is 64 characters long and the hash in common use, and there is no anticipation of using a hash with fewer than 2 characters. Also adds MustParseDigest, and replaces Digest.Type with Digest.Split for getting both the type and hash parts together, which is the most common case when asking for either.
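A sketch of how the guaranteed two-character minimum enables sharding. The `sha256-` separator, function names, and directory layout here are illustrative assumptions, not the actual ollama API:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// splitDigest mirrors the described Digest.Split: it returns the type and
// hash parts of a digest string like "sha256-<64 hex chars>" together.
func splitDigest(d string) (typ, hash string) {
	typ, hash, _ = strings.Cut(d, "-")
	return typ, hash
}

// shardPath relies on the guaranteed minimum of 2 characters in the hash
// part to pick a two-character shard directory.
func shardPath(root, digest string) string {
	_, hash := splitDigest(digest)
	return filepath.Join(root, hash[:2], hash)
}

func main() {
	d := "sha256-0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
	fmt.Println(shardPath("/var/lib/blobs", d))
}
```

Without the minimum-length guarantee, `hash[:2]` would need a bounds check on every call site.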
-
Daniel Hiltgen authored
Add back memory escape valve
-
Daniel Hiltgen authored
If we get our predictions wrong, this can be used to set a lower memory limit as a workaround. Recent multi-gpu refactoring accidentally removed it, so this adds it back.
-
- 23 Apr, 2024 26 commits
-
Daniel Hiltgen authored
Move nested payloads to installer and zip file on windows
-
Daniel Hiltgen authored
Give the go routine a moment to deliver the expired event
-
Daniel Hiltgen authored
Now that the llm runner is an executable and not just a DLL, more users are hitting security policy configurations on Windows that prevent writing to a directory and then executing binaries from that same location. This change removes payloads from the main executable on Windows and shifts them into the installer package, discovered relative to the executable's location. It also adds a new zip file for people who want to "roll their own" installation model.
-
Daniel Hiltgen authored
Detect and recover if runner removed
-
Michael authored
adding phi-3 mini to readme
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Tmp cleaners can nuke the file out from underneath us. This detects the missing runner and re-initializes the payloads.
-
Daniel Hiltgen authored
Adds support for customizing GPU build flags in llama.cpp
-
Michael Yang authored
fix: mixtral graph
-
Daniel Hiltgen authored
Request and model concurrency
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
This change adds support for multiple concurrent requests, as well as loading multiple models by spawning multiple runners. The default settings are currently set at 1 concurrent request per model and only 1 loaded model at a time, but these can be adjusted by setting OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Trim spaces and quotes from llm lib override
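A sketch of the trimming described above, so an override set as `OLLAMA_LLM_LIBRARY="cpu"` (quotes included by the shell or a config file) still matches. The function name is illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// cleanOverride strips surrounding whitespace and single/double quote
// characters from an override value before comparing library names.
func cleanOverride(v string) string {
	return strings.Trim(strings.TrimSpace(v), `"'`)
}

func main() {
	fmt.Printf("%q\n", cleanOverride(` "cpu" `)) // quotes and padding removed
}
```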
-
Bruce MacDonald authored
- move some popular integrations to the top of the lists
-
Bruce MacDonald authored
This reverts commit fad00a85.
-
-
Michael Yang authored
-
Hao Wu authored
* add chat (web UI) for LLM: I have used chat with llama3 locally successfully and the code is MIT licensed.
* Update README.md

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
Maple Gao authored
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
Võ Đình Đạt authored
-
Jonathan Smoley authored
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
Eric Curtin authored
The goal of podman-ollama is to make AI even more boring. Signed-off-by: Eric Curtin <ecurtin@redhat.com>
-
Daniel Hiltgen authored
-
reid41 authored
* add qa-pilot link
* format the link
* add shell-pilot
-
Christian Neff authored
-
- 22 Apr, 2024 1 commit
-
Bruce MacDonald authored
-
- 21 Apr, 2024 1 commit
-
Jeremy authored
Fixed improper env references
-