- 30 Aug, 2023 1 commit
- Quinn Slack authored
  The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop": ["\n"]`, generation should stop on any token containing `\n` (and trim `\n` from the output), not only when a token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, callers of the generate API would need to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295.
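The stop-sequence behavior described above can be sketched as a small post-processing step over the accumulated output. This is a minimal illustration, not Ollama's actual implementation; `apply_stop_sequences` is a hypothetical helper name:

```python
def apply_stop_sequences(text: str, stop: list[str]) -> tuple[str, bool]:
    """Return (trimmed_text, stopped).

    Scans the accumulated output for any stop sequence. A match anywhere
    in the text triggers a stop, even if the sequence arrived inside a
    larger LLM token, and the sequence (plus anything after it) is trimmed.
    """
    earliest = None
    for seq in stop:
        idx = text.find(seq)
        if idx != -1 and (earliest is None or idx < earliest):
            earliest = idx
    if earliest is not None:
        return text[:earliest], True
    return text, False


# A caller sending {"stop": ["\n"]} stops at the first newline, regardless
# of how the tokenizer split the surrounding text:
print(apply_stop_sequences("Hello\nWorld", ["\n"]))  # → ('Hello', True)
```

Matching on the generated text rather than on token identity is what frees API callers from having to know the model's tokenizer.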
- 26 Aug, 2023 3 commits
- Michael Yang authored
  Warn that F16 uses significantly more memory than a quantized model, so the standard memory requirements don't apply.
- Michael Yang authored
- Jeffrey Morgan authored
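As a rough illustration of why F16 is treated specially: F16 stores weights at 2 bytes per parameter, roughly 2-4x a typical 4- or 8-bit quantization, so a memory check tuned for quantized models would underestimate an F16 model's footprint. The `estimate_weight_bytes` helper and the bytes-per-parameter figures below are illustrative assumptions, not values from the codebase:

```python
# Illustrative, approximate bytes-per-parameter for common file types
# (assumed figures; quantized formats carry some per-block overhead).
BYTES_PER_PARAM = {
    "F16": 2.0,    # 16-bit floats
    "Q8_0": 1.06,  # ~8.5 bits per weight
    "Q4_0": 0.56,  # ~4.5 bits per weight
}

def estimate_weight_bytes(n_params: int, file_type: str) -> int:
    """Hypothetical helper: rough weight-memory estimate for a model."""
    return int(n_params * BYTES_PER_PARAM[file_type])

# A 7B-parameter model: ~14 GB at F16 vs ~3.9 GB at Q4_0, which is why
# the standard (quantized) memory requirements don't apply to F16.
f16_bytes = estimate_weight_bytes(7_000_000_000, "F16")
q4_bytes = estimate_weight_bytes(7_000_000_000, "Q4_0")
print(f"{f16_bytes / 2**30:.1f} GiB vs {q4_bytes / 2**30:.1f} GiB")
```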
- 25 Aug, 2023 1 commit
- Michael Yang authored
- 24 Aug, 2023 1 commit
- Michael Yang authored
- 18 Aug, 2023 1 commit
- Michael Yang authored
- 17 Aug, 2023 1 commit
- Michael Yang authored
- 14 Aug, 2023 4 commits
- Michael Yang authored
- Michael Yang authored
- Bruce MacDonald authored
- Bruce MacDonald authored
- 13 Aug, 2023 1 commit
- Jeffrey Morgan authored
- 11 Aug, 2023 1 commit
- Michael Yang authored
  Remove unused Unknown FileType.
- 10 Aug, 2023 4 commits
- Michael Yang authored
- Michael Yang authored
- Michael Yang authored
- Michael Yang authored