- 30 Sep, 2023 (1 commit)
  - Jay Nakrani authored: Document response stream chunk delimiter. (A stream-consumption sketch follows below.)
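Assuming the documented delimiter is the newline between JSON chunks (Ollama's streaming responses are newline-delimited JSON objects), a minimal Go sketch of consuming such a stream could look like this; the endpoint, model name, and field names are illustrative of the public `/api/generate` API rather than quoted from the commit:

```go
// Minimal sketch: consume a newline-delimited JSON response stream.
// Assumes the server emits one JSON object per line, terminated by '\n'
// (the delimiter the commit above documents).
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

type chunk struct {
	Response string `json:"response"`
	Done     bool   `json:"done"`
}

func main() {
	body := bytes.NewBufferString(`{"model":"llama2","prompt":"Why is the sky blue?"}`)
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", body)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body) // splits on '\n' by default
	for scanner.Scan() {
		var c chunk
		if err := json.Unmarshal(scanner.Bytes(), &c); err != nil {
			log.Fatal(err)
		}
		fmt.Print(c.Response)
		if c.Done {
			break
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}
```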
- 28 Sep, 2023 (1 commit)
  - Aaron Coffey authored
- 27 Sep, 2023 (3 commits)
  - Jeffrey Morgan authored
  - Bruce MacDonald authored
  - James Braza authored
- 25 Sep, 2023 (5 commits)
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
- 20 Sep, 2023 (4 commits)
  - Michael Yang authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
- 14 Sep, 2023 (2 commits)
  - Bruce MacDonald authored:
    * enable packaging multiple cuda versions
    * use nvcc cuda version if available (see the detection sketch below)
    Co-authored-by: Michael Yang <mxyng@pm.me>
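The second bullet suggests probing the local toolchain for its CUDA version. The commit's actual code is not shown in this log; as a minimal Go sketch of the idea, assuming `nvcc` is on `PATH` and prints its usual `release X.Y` banner:

```go
// Minimal sketch: prefer the CUDA version reported by nvcc when it is
// installed. Illustrates the idea in the commit above, not its code.
package main

import (
	"fmt"
	"os/exec"
	"regexp"
)

// cudaVersion (hypothetical helper) returns the CUDA version parsed from
// `nvcc --version`, or ok=false when nvcc is missing or unrecognized.
func cudaVersion() (version string, ok bool) {
	out, err := exec.Command("nvcc", "--version").Output()
	if err != nil {
		return "", false
	}
	// nvcc prints a line like: "Cuda compilation tools, release 12.2, V12.2.91"
	m := regexp.MustCompile(`release (\d+\.\d+)`).FindSubmatch(out)
	if m == nil {
		return "", false
	}
	return string(m[1]), true
}

func main() {
	if v, ok := cudaVersion(); ok {
		fmt.Println("using nvcc CUDA version:", v)
	} else {
		fmt.Println("nvcc not available; falling back to a packaged CUDA version")
	}
}
```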
  - Matt Williams authored:
    * Update API docs
    * strange TOC was getting auto generated
    * Update docs/api.md
    * Update docs/api.md
    * Update docs/api.md
    * Update api.md
    Signed-off-by: Matt Williams <m@technovangelist.com>
    Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
    Co-authored-by: Michael Chiang <mchiang0610@users.noreply.github.com>
- 12 Sep, 2023 (1 commit)
  - Bruce MacDonald authored:
    * linux gpu support
    * handle multiple gpus
    * add cuda docker image (#488)
    Co-authored-by: Michael Yang <mxyng@pm.me>
- 06 Sep, 2023 (1 commit)
  - Ackermann Yuriy authored
- 30 Aug, 2023 (2 commits)
  - Bruce MacDonald authored:
    * remove c code
    * pack llama.cpp
    * use request context for llama_cpp (see the cancellation sketch below)
    * let llama_cpp decide the number of threads to use
    * stop llama runner when app stops
    * remove sample count and duration metrics
    * use go generate to get libraries
    * tmp dir for running llm
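The "use request context" and "stop llama runner when app stops" bullets describe Go's standard context-cancellation pattern. A minimal sketch of that pattern, with `nextToken` as a hypothetical stand-in for a llama.cpp prediction step (this is not the repository's runner code):

```go
// Minimal sketch of the cancellation idea: a generation loop that checks
// the HTTP request's context so it stops when the client disconnects or
// the app shuts down.
package main

import (
	"context"
	"fmt"
	"net/http"
)

func nextToken() string { return "token " } // placeholder for llama.cpp inference

func generate(ctx context.Context, emit func(string)) error {
	for i := 0; i < 256; i++ {
		select {
		case <-ctx.Done():
			return ctx.Err() // request cancelled or server stopping
		default:
		}
		emit(nextToken())
	}
	return nil
}

func main() {
	http.HandleFunc("/api/generate", func(w http.ResponseWriter, r *http.Request) {
		// r.Context() is cancelled when the client goes away.
		if err := generate(r.Context(), func(tok string) { fmt.Fprint(w, tok) }); err != nil {
			fmt.Println("generation stopped:", err)
		}
	})
	http.ListenAndServe(":11434", nil)
}
```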
  - Quinn Slack authored:
    The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop": ["\n"]`, generation should stop on any token containing `\n` (and trim `\n` from the output), not only on a token that exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, callers of the generate API would have to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295. (A sketch of this substring check follows below.)
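A minimal Go sketch of the substring semantics described above: generation halts once the accumulated output contains any stop sequence, even one spanning token boundaries, and the output is trimmed at the match. `checkStop` and the sample tokens are illustrative, not the repository's implementation:

```go
// Minimal sketch of substring-based stop sequences: generation halts when
// the accumulated output contains any stop sequence, and the output is
// trimmed at the earliest match, even if it spans token boundaries.
package main

import (
	"fmt"
	"strings"
)

// checkStop (hypothetical helper) reports whether output contains any stop
// sequence, returning the output trimmed at the earliest match.
func checkStop(output string, stop []string) (string, bool) {
	cut := -1
	for _, s := range stop {
		if i := strings.Index(output, s); i >= 0 && (cut == -1 || i < cut) {
			cut = i
		}
	}
	if cut == -1 {
		return output, false
	}
	return output[:cut], true
}

func main() {
	// Note the third token contains "\n" without being exactly "\n".
	tokens := []string{"Hello", ", wor", "ld!\nSec", "ond line"}
	stop := []string{"\n"}

	var out strings.Builder
	for _, tok := range tokens {
		out.WriteString(tok)
		if trimmed, done := checkStop(out.String(), stop); done {
			fmt.Printf("%q\n", trimmed) // "Hello, world!"
			return
		}
	}
	fmt.Printf("%q\n", out.String())
}
```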
- 27 Aug, 2023 (1 commit)
  - Jeffrey Morgan authored
- 25 Aug, 2023 (1 commit)
  - Michael Yang authored
- 15 Aug, 2023 (1 commit)
  - Bruce MacDonald authored
- 14 Aug, 2023 (5 commits)
  - Bruce MacDonald authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
  - Güvenç Usanmaz authored: Corrected the base_url value used when creating the Ollama object.
- 11 Aug, 2023 (6 commits)
  - Matt Williams authored. Signed-off-by: Matt Williams <m@technovangelist.com>
  - Matt Williams authored. Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
  - Matt Williams authored. Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
  - Matt Williams authored. Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
  - Matt Williams authored. Signed-off-by: Matt Williams <m@technovangelist.com>
  - Arturas Smorgun authored. Co-authored-by: Michael Yang <mxyng@pm.me>
- 10 Aug, 2023 (4 commits)
  - Arturas Smorgun authored: This needs to be adjustable for some models; see https://github.com/jmorganca/ollama/issues/320 for more context.
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
  - Michael Yang authored
- 09 Aug, 2023 (2 commits)
  - Bruce MacDonald authored
  - Bruce MacDonald authored