- 30 Aug, 2023 1 commit
- Quinn Slack authored
  The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop": ["\n"]`, generation should stop on any token containing `\n` (and trim `\n` from the output), not only when a token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, callers of the generate API would need to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295.
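The stop-sequence behavior described above can be sketched as a small post-processing step over the accumulated output. This is a minimal illustration, not Ollama's actual implementation; `apply_stop_sequences` is a hypothetical helper name:

```python
def apply_stop_sequences(text: str, stop: list[str]) -> tuple[str, bool]:
    """Return (trimmed_text, stopped).

    Scans the accumulated output for any stop sequence. A match anywhere
    in the text triggers a stop, even if the sequence arrived inside a
    larger LLM token, and the sequence (plus anything after it) is trimmed.
    """
    earliest = None
    for seq in stop:
        idx = text.find(seq)
        if idx != -1 and (earliest is None or idx < earliest):
            earliest = idx
    if earliest is not None:
        return text[:earliest], True
    return text, False


# A caller sending {"stop": ["\n"]} stops at the first newline, regardless
# of how the tokenizer split the surrounding text:
print(apply_stop_sequences("Hello\nWorld", ["\n"]))  # → ('Hello', True)
```

Matching on the generated text rather than on token identity is what frees API callers from having to know the model's tokenizer.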
- 26 Aug, 2023 3 commits
- Michael Yang authored
  Warn that F16 uses significantly more memory than a quantized model, so the standard memory requirements don't apply.
- Michael Yang authored
- Jeffrey Morgan authored
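As a rough illustration of why F16 is treated specially: F16 stores weights at 2 bytes per parameter, roughly 2-4x a typical 4- or 8-bit quantization, so a memory check tuned for quantized models would underestimate an F16 model's footprint. The `estimate_weight_bytes` helper and the bytes-per-parameter figures below are illustrative assumptions, not values from the codebase:

```python
# Illustrative, approximate bytes-per-parameter for common file types
# (assumed figures; quantized formats carry some per-block overhead).
BYTES_PER_PARAM = {
    "F16": 2.0,    # 16-bit floats
    "Q8_0": 1.06,  # ~8.5 bits per weight
    "Q4_0": 0.56,  # ~4.5 bits per weight
}

def estimate_weight_bytes(n_params: int, file_type: str) -> int:
    """Hypothetical helper: rough weight-memory estimate for a model."""
    return int(n_params * BYTES_PER_PARAM[file_type])

# A 7B-parameter model: ~14 GB at F16 vs ~3.9 GB at Q4_0, which is why
# the standard (quantized) memory requirements don't apply to F16.
f16_bytes = estimate_weight_bytes(7_000_000_000, "F16")
q4_bytes = estimate_weight_bytes(7_000_000_000, "Q4_0")
print(f"{f16_bytes / 2**30:.1f} GiB vs {q4_bytes / 2**30:.1f} GiB")
```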
- 25 Aug, 2023 1 commit
- Michael Yang authored
- 24 Aug, 2023 1 commit
- Michael Yang authored
- 18 Aug, 2023 1 commit
- Michael Yang authored
- 17 Aug, 2023 1 commit
- Michael Yang authored
- 14 Aug, 2023 4 commits
- Michael Yang authored
- Michael Yang authored
- Bruce MacDonald authored
- Bruce MacDonald authored
- 13 Aug, 2023 1 commit
- Jeffrey Morgan authored
- 11 Aug, 2023 1 commit
- Michael Yang authored
  Remove unused Unknown FileType.
- 10 Aug, 2023 4 commits
- Michael Yang authored
- Michael Yang authored
- Michael Yang authored
- Michael Yang authored