- 27 Oct, 2023 1 commit
Bruce MacDonald authored
* allow for a configurable ollama models directory
- set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored
- update docs
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com>
Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com>
Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>
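As a rough illustration of the configurable models directory described above: the server reads OLLAMA_MODELS from its environment and falls back to a default location when it is unset. This is a minimal sketch under that assumption, not the actual Ollama implementation, and the fallback path shown is illustrative.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// modelsDir returns the directory where model files are stored.
// OLLAMA_MODELS, when set in the server's environment, overrides the default.
func modelsDir() (string, error) {
	if dir := os.Getenv("OLLAMA_MODELS"); dir != "" {
		return dir, nil
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	// Illustrative default; the real default location may differ.
	return filepath.Join(home, ".ollama", "models"), nil
}

func main() {
	dir, err := modelsDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("models directory:", dir)
}
```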
- 26 Oct, 2023 1 commit
Michael Yang authored
- 20 Oct, 2023 1 commit
Michael Yang authored
- 19 Oct, 2023 1 commit
Bruce MacDonald authored
- only reload the running llm if the model has changed, or the options for loading the running model have changed (see the sketch below)
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
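A rough sketch of the reload check described in the first bullet above: start a new runner only when the requested model or its load options differ from what is already running. The Runner and Options types and their fields here are hypothetical, not the actual Ollama code.

```go
package runner

import "reflect"

// Options holds the settings used to load a model (hypothetical shape).
type Options struct {
	NumCtx  int
	NumGPU  int
	MainGPU int
	UseMMap bool
}

// Runner is the currently loaded llama.cpp process (hypothetical shape).
type Runner struct {
	ModelPath string
	Options   Options
}

// needsReload reports whether a new runner must be started for the
// requested model and load options. Generation-time options such as
// temperature are intentionally not compared here.
func needsReload(current *Runner, modelPath string, opts Options) bool {
	if current == nil {
		return true
	}
	if current.ModelPath != modelPath {
		return true
	}
	return !reflect.DeepEqual(current.Options, opts)
}
```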
- 13 Oct, 2023 2 commits
Michael Yang authored
Bruce MacDonald authored
- remove new lines from llama.cpp error messages relayed to client
- check api option types and return error on wrong type (see the sketch below)
- change num layers from 95% VRAM to 92% VRAM
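For the type-checking bullet, a minimal sketch of validating options supplied as a generic map and returning an error on a mismatch. The set of options and the validate function are illustrative, not the actual Ollama API surface.

```go
package options

import "fmt"

// validateOptions checks that each known option was supplied with the
// expected JSON type and returns an error naming the offending field.
func validateOptions(opts map[string]any) error {
	for key, val := range opts {
		switch key {
		case "temperature", "top_p": // expect numbers
			if _, ok := val.(float64); !ok {
				return fmt.Errorf("option %q must be a number, got %T", key, val)
			}
		case "num_ctx", "seed": // JSON numbers decode as float64
			if _, ok := val.(float64); !ok {
				return fmt.Errorf("option %q must be an integer, got %T", key, val)
			}
		case "stop": // expect an array of strings
			if _, ok := val.([]any); !ok {
				return fmt.Errorf("option %q must be an array of strings, got %T", key, val)
			}
		default:
			return fmt.Errorf("unknown option %q", key)
		}
	}
	return nil
}
```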
- 12 Oct, 2023 1 commit
Bruce MacDonald authored
- 11 Oct, 2023 2 commits
Michael Yang authored
Bruce MacDonald authored
* update streaming request accept header
* add optional stream param to request bodies
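A sketch of the optional stream parameter described above: a pointer field lets the server distinguish "not set" from an explicit false, and the client can advertise the streaming format in its Accept header. The trimmed-down request body and the application/x-ndjson value are assumptions for illustration, not a statement of the exact wire format.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// GenerateRequest is a trimmed-down request body with an optional stream flag.
// A *bool distinguishes "omitted" (nil) from an explicit false.
type GenerateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream *bool  `json:"stream,omitempty"`
}

func main() {
	stream := false
	body, _ := json.Marshal(GenerateRequest{
		Model:  "llama2",
		Prompt: "why is the sky blue?",
		Stream: &stream, // explicitly request a single, non-streamed response
	})

	req, err := http.NewRequest(http.MethodPost, "http://127.0.0.1:11434/api/generate", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	// Streamed responses arrive as newline-delimited JSON objects.
	req.Header.Set("Accept", "application/x-ndjson")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```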
- 09 Oct, 2023 1 commit
Michael Yang authored
- 05 Oct, 2023 1 commit
Bruce MacDonald authored
- 04 Oct, 2023 1 commit
Bruce MacDonald authored
- 02 Oct, 2023 1 commit
Bruce MacDonald authored
* include seed in params for llama.cpp server and remove empty filter for temp (see the sketch below)
* relay default predict options to llama.cpp - reorganize options to match predict request for readability
* omit empty stop
Co-authored-by: hallh <hallh@users.noreply.github.com>
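The seed and stop handling above can be illustrated with a small options struct: a stop list tagged omitempty disappears from the JSON relayed to the llama.cpp server when it is empty, while the seed is always included. The struct and field names are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// predictOptions mirrors the idea of relaying generation options to the
// llama.cpp server; an empty stop list is omitted from the payload.
type predictOptions struct {
	Seed        int      `json:"seed"`
	Temperature float64  `json:"temperature"`
	Stop        []string `json:"stop,omitempty"`
}

func main() {
	withStop, _ := json.Marshal(predictOptions{Seed: 42, Temperature: 0.8, Stop: []string{"</s>"}})
	withoutStop, _ := json.Marshal(predictOptions{Seed: 42, Temperature: 0.8})

	fmt.Println(string(withStop))    // {"seed":42,"temperature":0.8,"stop":["</s>"]}
	fmt.Println(string(withoutStop)) // {"seed":42,"temperature":0.8}
}
```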
- 29 Sep, 2023 1 commit
Bruce MacDonald authored
- 28 Sep, 2023 1 commit
Michael Yang authored
- 14 Sep, 2023 1 commit
Patrick Devine authored
- 12 Sep, 2023 1 commit
Bruce MacDonald authored
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)
Co-authored-by: Michael Yang <mxyng@pm.me>
- 06 Sep, 2023 1 commit
Patrick Devine authored
- 31 Aug, 2023 1 commit
Michael Yang authored
- 30 Aug, 2023 1 commit
Bruce MacDonald authored
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
- 29 Aug, 2023 1 commit
Patrick Devine authored
- 28 Aug, 2023 1 commit
Michael Yang authored
- 26 Aug, 2023 1 commit
Jeffrey Morgan authored
- 22 Aug, 2023 1 commit
Michael Yang authored
- 17 Aug, 2023 2 commits
Michael Yang authored
Jeffrey Morgan authored
- 16 Aug, 2023 2 commits
Jeffrey Morgan authored
Blake Mizerany authored
* cmd: support OLLAMA_HOST environment variable
This commit adds support for the OLLAMA_HOST environment variable, which can be used to specify the host to which the client should connect. This is useful when the client is running somewhere other than the host where the server is running. The new api.FromEnv function is used to configure clients from the environment; clients wishing to stay consistent with the Ollama CLI can use this new function.
* Update api/client.go
* Update api/client.go
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
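A rough sketch of what the commit describes: the client reads OLLAMA_HOST to decide where to connect and falls back to the usual local address when it is unset. The commit introduces api.FromEnv for this; the helper below only illustrates the idea and is not that function's actual implementation.

```go
package main

import (
	"fmt"
	"os"
)

// hostFromEnv returns the server address a client should connect to.
// OLLAMA_HOST overrides the default local server address.
// Illustrative only; the real logic lives in the api package (api.FromEnv).
func hostFromEnv() string {
	if host := os.Getenv("OLLAMA_HOST"); host != "" {
		return host
	}
	return "http://127.0.0.1:11434"
}

func main() {
	fmt.Println("connecting to:", hostFromEnv())
}
```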
- 10 Aug, 2023 4 commits
Michael Yang authored
Michael Yang authored
Patrick Devine authored
Bruce MacDonald authored
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
- 08 Aug, 2023 2 commits
Bruce MacDonald authored
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
Jeffrey Morgan authored
Fixes #297 Fixes #296
- 07 Aug, 2023 1 commit
Michael Yang authored
num_keep defines how many tokens to keep in the context when truncating inputs. If left at its default value of -1, the server calculates num_keep so that the system instructions are retained.
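A small sketch of what that truncation policy implies: when the prompt exceeds the context window, the first num_keep tokens survive and truncation happens after them. Tokens are simplified to integer IDs here; this illustrates the parameter, not the server's actual truncation code.

```go
package main

import "fmt"

// truncate keeps the first numKeep tokens and then as many of the most
// recent tokens as still fit in a context of size numCtx.
func truncate(tokens []int, numCtx, numKeep int) []int {
	if len(tokens) <= numCtx {
		return tokens
	}
	if numKeep < 0 {
		numKeep = 0 // the -1 sentinel is resolved by the server before truncation
	}
	if numKeep > numCtx {
		numKeep = numCtx
	}
	kept := append([]int{}, tokens[:numKeep]...)
	// Fill the remaining space with the newest tokens.
	remaining := numCtx - numKeep
	return append(kept, tokens[len(tokens)-remaining:]...)
}

func main() {
	tokens := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	// Keep the first 3 tokens (e.g. the system instructions) in a context of 6.
	fmt.Println(truncate(tokens, 6, 3)) // [1 2 3 8 9 10]
}
```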
- 04 Aug, 2023 1 commit
Michael Yang authored
- 01 Aug, 2023 3 commits
Bruce MacDonald authored
Bruce MacDonald authored
- read runner options from a map to see which options were specified explicitly, so that explicitly supplied zero values still overwrite the defaults
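The motivation for reading options from a map: after decoding into a struct, an unset field and an explicitly supplied zero value look identical, but key presence in the raw map disambiguates them. A minimal sketch of that technique, with hypothetical option names:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// options holds runner settings with non-zero defaults.
type options struct {
	NumCtx      int
	Temperature float64
}

// applyExplicit overwrites defaults with any option that appears in the raw
// request map, even when the supplied value is a zero value such as 0.
func applyExplicit(defaults options, raw map[string]any) options {
	if v, ok := raw["num_ctx"]; ok {
		if f, ok := v.(float64); ok { // JSON numbers decode as float64
			defaults.NumCtx = int(f)
		}
	}
	if v, ok := raw["temperature"]; ok {
		if f, ok := v.(float64); ok {
			defaults.Temperature = f
		}
	}
	return defaults
}

func main() {
	defaults := options{NumCtx: 2048, Temperature: 0.8}

	var raw map[string]any
	// The caller explicitly asked for temperature 0, which must not be
	// mistaken for "unset" and silently replaced by the default.
	json.Unmarshal([]byte(`{"temperature": 0}`), &raw)

	fmt.Printf("%+v\n", applyExplicit(defaults, raw)) // {NumCtx:2048 Temperature:0}
}
```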
Jeffrey Morgan authored
- 31 Jul, 2023 1 commit
Bruce MacDonald authored