- 30 Aug, 2023 5 commits

Jeffrey Morgan authored

Bruce MacDonald authored

Bruce MacDonald authored
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm

Quinn Slack authored
The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295.

Michael Yang authored
update upload chunks
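The stop-sequence behavior described in Quinn Slack's commit message above can be sketched roughly as follows. This is a minimal illustration, not ollama's actual implementation; `truncateAtStop` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// truncateAtStop scans the accumulated output text (not individual LLM
// tokens) for any of the stop sequences. If one is found, it returns the
// output truncated before that sequence, and true. Hypothetical sketch of
// the behavior described in the commit message above.
func truncateAtStop(output string, stops []string) (string, bool) {
	for _, stop := range stops {
		if i := strings.Index(output, stop); i >= 0 {
			return output[:i], true
		}
	}
	return output, false
}

func main() {
	// A generated token like "world.\nIgnored" contains "\n", so generation
	// stops and the "\n" (plus anything after it) is trimmed, even though
	// no single token was exactly "\n".
	out, stopped := truncateAtStop("Hello world.\nIgnored", []string{"\n"})
	fmt.Printf("%q %v\n", out, stopped) // prints: "Hello world." true
}
```

Matching against the accumulated text rather than token-by-token is what lets callers pass `"stop":["\n"]` without knowing the model's tokenizer.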
- 29 Aug, 2023 2 commits

Michael Yang authored
allow F16 to use metal

Patrick Devine authored
- 28 Aug, 2023 4 commits

Michael Yang authored

Michael Yang authored

Michael Yang authored

Michael Yang authored
- 27 Aug, 2023 2 commits

Jeffrey Morgan authored

Michael Yang authored
update README.md
- 26 Aug, 2023 9 commits

Michael Yang authored
add 34b to mem check

Michael Yang authored
set default template

Michael Yang authored

Jeffrey Morgan authored

Michael Yang authored
warning: F16 uses significantly more memory than a quantized model, so the standard requirements don't apply.
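A rough illustration of why the warning above applies, using assumed sizes rather than measured numbers: F16 stores 2 bytes per weight, while 4-bit quantization formats such as q4_0 use roughly 4.5 bits per weight once block scales are included, so weights alone for a 7B-parameter model take several times more memory in F16:

```go
package main

import "fmt"

func main() {
	const params = 7e9 // 7B parameters (illustrative model size)

	f16GB := params * 2 / 1e9      // F16: 2 bytes per weight
	q4GB := params * 4.5 / 8 / 1e9 // ~4.5 bits per weight for 4-bit quantization (approximate)

	fmt.Printf("F16: %.1f GB, 4-bit: %.1f GB\n", f16GB, q4GB)
}
```

These figures cover weights only; activations and KV cache add more on top, which is why the memory check for quantized models can't simply be reused for F16.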
Michael Yang authored

Quinn Slack authored
Previously, `ollama rm model1 model2 modelN` would only delete `model1`. The other model command-line arguments would be silently ignored. Now, all models mentioned are deleted.
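The fix described above amounts to iterating over every model name on the command line instead of handling only the first. A minimal sketch; the `deleteModel` callback is a hypothetical stand-in, not ollama's actual API:

```go
package main

import "fmt"

// removeModels deletes every named model, not just the first argument.
// deleteModel is a hypothetical callback standing in for the real delete call.
func removeModels(names []string, deleteModel func(string) error) error {
	for _, name := range names {
		if err := deleteModel(name); err != nil {
			return fmt.Errorf("deleting %s: %w", name, err)
		}
		fmt.Println("deleted", name)
	}
	return nil
}

func main() {
	var deleted []string
	err := removeModels([]string{"model1", "model2", "modelN"}, func(name string) error {
		deleted = append(deleted, name) // record each delete instead of calling a server
		return nil
	})
	fmt.Println(err, len(deleted)) // prints: <nil> 3
}
```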
Jeffrey Morgan authored

Jeffrey Morgan authored
- 25 Aug, 2023 3 commits

Michael Yang authored
patch llama.cpp for 34B

Michael Yang authored

Michael Yang authored
- 24 Aug, 2023 2 commits

Michael Yang authored
add 34b model type

Michael Yang authored
- 22 Aug, 2023 11 commits

Michael Yang authored
Mxyng/cleanup

Michael Yang authored
use url.URL

Michael Yang authored

Michael Yang authored

Michael Yang authored

Michael Yang authored
build release mode

Michael Yang authored

Michael Yang authored
add version

Michael Yang authored

Jeffrey Morgan authored

Ryan Baker authored
- 21 Aug, 2023 1 commit

Jeffrey Morgan authored
- 18 Aug, 2023 1 commit

Michael Yang authored
retry on unauthorized chunk push