- 30 Sep, 2023 1 commit
Jay Nakrani authored
Document response stream chunk delimiter.
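The delimiter in question: each streamed chunk is a complete JSON object terminated by a newline. A minimal Go sketch of a client splitting such a stream on newlines; the `/api/generate` endpoint and the `response`/`done` fields reflect the API of this period, but treat the details as illustrative:

```
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// chunk mirrors the shape of one streamed response object; these
// fields are a subset chosen for illustration.
type chunk struct {
	Response string `json:"response"`
	Done     bool   `json:"done"`
}

func main() {
	body := strings.NewReader(`{"model":"llama2","prompt":"hi"}`)
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", body)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Each chunk is one JSON object followed by "\n", so a line
	// scanner is enough to frame the stream.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var c chunk
		if err := json.Unmarshal(scanner.Bytes(), &c); err != nil {
			panic(err)
		}
		fmt.Print(c.Response)
		if c.Done {
			break
		}
	}
}
```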
- 29 Sep, 2023 1 commit
Bruce MacDonald authored
- 27 Sep, 2023 1 commit
Michael Yang authored
- 23 Sep, 2023 1 commit
Jeffrey Morgan authored
- 22 Sep, 2023 1 commit
Bruce MacDonald authored
- 21 Sep, 2023 4 commits
Michael Yang authored
Michael Yang authored
HEAD requests should respond like their GET counterparts, except without a response body.
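A minimal `net/http` sketch of the idea, not the project's actual router code: HEAD gets the same status and headers as GET, minus the body (the `/api/tags` path is only an example):

```
package main

import (
	"encoding/json"
	"net/http"
)

func main() {
	http.HandleFunc("/api/tags", func(w http.ResponseWriter, r *http.Request) {
		// Identical headers for GET and HEAD...
		w.Header().Set("Content-Type", "application/json")
		if r.Method == http.MethodHead {
			return // ...but HEAD sends no response body
		}
		json.NewEncoder(w).Encode(map[string]any{"models": []any{}})
	})
	http.ListenAndServe(":11434", nil)
}
```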
Bruce MacDonald authored
* remove tmp directories created by previous servers
* clean up on server stop
* Update routes.go
* Update server/routes.go (Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>)
* create top-level temp ollama dir
* check file exists before creating

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
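A sketch of that temp-dir lifecycle under the assumptions stated in the message (a top-level `ollama` directory beneath the system temp path, removal on shutdown); paths and names are illustrative:

```
package main

import (
	"os"
	"os/signal"
	"path/filepath"
)

func main() {
	// A fixed top-level parent makes directories left behind by
	// previous servers easy to find and remove.
	parent := filepath.Join(os.TempDir(), "ollama")

	// Check whether it exists before (re)creating it, clearing
	// leftovers from earlier runs first.
	if _, err := os.Stat(parent); err == nil {
		os.RemoveAll(parent)
	}
	if err := os.MkdirAll(parent, 0o755); err != nil {
		panic(err)
	}

	workdir, err := os.MkdirTemp(parent, "runner-")
	if err != nil {
		panic(err)
	}

	// Clean up on server stop.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, os.Interrupt)
	<-stop
	os.RemoveAll(workdir)
}
```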
Michael Yang authored
This should be less error-prone.
- 20 Sep, 2023 1 commit
Bruce MacDonald authored
- 18 Sep, 2023 1 commit
Patrick Devine authored
- 12 Sep, 2023 1 commit
Bruce MacDonald authored
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)

Co-authored-by: Michael Yang <mxyng@pm.me>
- 11 Sep, 2023 1 commit
Patrick Devine authored
- 06 Sep, 2023 1 commit
Patrick Devine authored
- 03 Sep, 2023 1 commit
Michael Yang authored
- 31 Aug, 2023 2 commits
Michael Yang authored
Michael Yang authored
- 30 Aug, 2023 1 commit
Bruce MacDonald authored
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
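On "use request context for llama_cpp": one plausible shape of the pattern is a generation loop that watches the HTTP request's context and stops as soon as the client disconnects. A hedged sketch, with `predict` standing in for the real llama.cpp runner:

```
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// predict is a stand-in for the token loop; the cancellation
// pattern, not the generation itself, is the point here.
func predict(ctx context.Context, w http.ResponseWriter) {
	for i := 0; i < 100; i++ {
		select {
		case <-ctx.Done():
			return // client went away; stop generating
		default:
		}
		fmt.Fprintf(w, "token %d\n", i)
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
		time.Sleep(50 * time.Millisecond)
	}
}

func main() {
	http.HandleFunc("/api/generate", func(w http.ResponseWriter, r *http.Request) {
		// The request context is canceled when the connection drops.
		predict(r.Context(), w)
	})
	http.ListenAndServe(":11434", nil)
}
```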
- 29 Aug, 2023 1 commit
Patrick Devine authored
- 22 Aug, 2023 3 commits
Michael Yang authored
Jeffrey Morgan authored
Ryan Baker authored
- 15 Aug, 2023 1 commit
Bruce MacDonald authored
- 11 Aug, 2023 1 commit
Patrick Devine authored
- 10 Aug, 2023 4 commits
Jeffrey Morgan authored
Michael Yang authored
Michael Yang authored
Bruce MacDonald authored
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
- 09 Aug, 2023 3 commits
Bruce MacDonald authored
Bruce MacDonald authored
Jeffrey Morgan authored
- 08 Aug, 2023 3 commits
Jeffrey Morgan authored
Fixes #282
Bruce MacDonald authored
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
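A request-level sketch of exercising embeddings once they are on by default, assuming the `/api/embeddings` shape of this period (`model` and `prompt` in, an `embedding` vector out):

```
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	payload, _ := json.Marshal(map[string]string{
		"model":  "llama2",
		"prompt": "Here is an article about llamas...",
	})
	resp, err := http.Post("http://localhost:11434/api/embeddings",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Embedding []float64 `json:"embedding"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println("dimensions:", len(out.Embedding))
}
```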
Bruce MacDonald authored
- 07 Aug, 2023 2 commits
Michael Yang authored
num_keep defines how many tokens to keep in the context when truncating inputs. If left at its default value of -1, the server will calculate num_keep so that the system instructions are retained.
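For illustration, a hypothetical request that pins `num_keep` explicitly instead of relying on the `-1` default; the model name and values are made up:

```
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

func main() {
	payload, _ := json.Marshal(map[string]any{
		"model":  "llama2",
		"prompt": "Why is the sky blue?",
		"options": map[string]any{
			// Keep the first 24 tokens when inputs are truncated;
			// -1 (the default) lets the server derive this from
			// the system instructions instead.
			"num_keep": 24,
		},
	})
	resp, err := http.Post("http://localhost:11434/api/generate",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	resp.Body.Close()
}
```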
cmiller01 authored
* resolves: https://github.com/jmorganca/ollama/issues/300 and https://github.com/jmorganca/ollama/issues/282
* example usage:

```
ollama serve --port 9999 --allowed-origins "http://foo.example.com,http://192.0.0.1"
```
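A hedged sketch of how such an origin allow-list can be enforced, using plain `net/http` middleware rather than the server's actual implementation:

```
package main

import (
	"net/http"
	"strings"
)

// allowOrigins reflects the Origin header back only when it appears
// in the comma-separated allow-list, mirroring the flag's format.
func allowOrigins(list string, next http.Handler) http.Handler {
	allowed := map[string]bool{}
	for _, o := range strings.Split(list, ",") {
		allowed[strings.TrimSpace(o)] = true
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if origin := r.Header.Get("Origin"); allowed[origin] {
			w.Header().Set("Access-Control-Allow-Origin", origin)
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	http.ListenAndServe(":9999",
		allowOrigins("http://foo.example.com,http://192.0.0.1", mux))
}
```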
- 03 Aug, 2023 1 commit
Jeffrey Morgan authored
- 02 Aug, 2023 1 commit
Jeffrey Morgan authored
- 01 Aug, 2023 2 commits
Bruce MacDonald authored
Bruce MacDonald authored
- read runner options from a map to see what was specified explicitly, and overwrite zero values
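Decoding into a map first preserves which keys the client actually sent, so an explicit zero can be told apart from an omitted field. A minimal sketch with illustrative option names:

```
package main

import (
	"encoding/json"
	"fmt"
)

type Options struct {
	NumCtx int     `json:"num_ctx"`
	TopP   float64 `json:"top_p"`
}

func main() {
	defaults := Options{NumCtx: 2048, TopP: 0.9}
	raw := []byte(`{"top_p": 0}`) // an explicit zero from the client

	// Key presence in the map records what was specified, even
	// when the value happens to be a zero value.
	var specified map[string]json.RawMessage
	if err := json.Unmarshal(raw, &specified); err != nil {
		panic(err)
	}

	opts := defaults
	if v, ok := specified["num_ctx"]; ok {
		json.Unmarshal(v, &opts.NumCtx)
	}
	if v, ok := specified["top_p"]; ok {
		json.Unmarshal(v, &opts.TopP)
	}
	fmt.Printf("%+v\n", opts) // {NumCtx:2048 TopP:0}
}
```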