- 30 Aug, 2023 5 commits

Jeffrey Morgan authored

Bruce MacDonald authored

Bruce MacDonald authored
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm

Quinn Slack authored
The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295.

Michael Yang authored
update upload chunks
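The stop-sequence behavior described in Quinn Slack's commit message above can be sketched roughly as follows. This is a minimal illustration, not ollama's actual implementation; `truncateAtStop` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// truncateAtStop scans the accumulated output text (not individual LLM
// tokens) for any of the stop sequences. If one is found, it returns the
// output truncated before that sequence, and true. Hypothetical sketch of
// the behavior described in the commit message above.
func truncateAtStop(output string, stops []string) (string, bool) {
	for _, stop := range stops {
		if i := strings.Index(output, stop); i >= 0 {
			return output[:i], true
		}
	}
	return output, false
}

func main() {
	// A generated token like "world.\nIgnored" contains "\n", so generation
	// stops and the "\n" (plus anything after it) is trimmed, even though
	// no single token was exactly "\n".
	out, stopped := truncateAtStop("Hello world.\nIgnored", []string{"\n"})
	fmt.Printf("%q %v\n", out, stopped) // prints: "Hello world." true
}
```

Matching against the accumulated text rather than token-by-token is what lets callers pass `"stop":["\n"]` without knowing the model's tokenizer.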
- 29 Aug, 2023 2 commits

Michael Yang authored
allow F16 to use metal

Patrick Devine authored
- 28 Aug, 2023 4 commits

Michael Yang authored

Michael Yang authored

Michael Yang authored

Michael Yang authored
- 27 Aug, 2023 2 commits

Jeffrey Morgan authored

Michael Yang authored
update README.md
- 26 Aug, 2023 9 commits

Michael Yang authored
add 34b to mem check

Michael Yang authored
set default template

Michael Yang authored

Jeffrey Morgan authored

Michael Yang authored
warning: F16 uses significantly more memory than a quantized model, so the standard requirements don't apply.
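A rough illustration of why the warning above applies, using assumed sizes rather than measured numbers: F16 stores 2 bytes per weight, while 4-bit quantization formats such as q4_0 use roughly 4.5 bits per weight once block scales are included, so weights alone for a 7B-parameter model take several times more memory in F16:

```go
package main

import "fmt"

func main() {
	const params = 7e9 // 7B parameters (illustrative model size)

	f16GB := params * 2 / 1e9      // F16: 2 bytes per weight
	q4GB := params * 4.5 / 8 / 1e9 // ~4.5 bits per weight for 4-bit quantization (approximate)

	fmt.Printf("F16: %.1f GB, 4-bit: %.1f GB\n", f16GB, q4GB)
}
```

These figures cover weights only; activations and KV cache add more on top, which is why the memory check for quantized models can't simply be reused for F16.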
Michael Yang authored

Quinn Slack authored
Previously, `ollama rm model1 model2 modelN` would only delete `model1`. The other model command-line arguments would be silently ignored. Now, all models mentioned are deleted.
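The fix described above amounts to iterating over every model name on the command line instead of handling only the first. A minimal sketch; the `deleteModel` callback is a hypothetical stand-in, not ollama's actual API:

```go
package main

import "fmt"

// removeModels deletes every named model, not just the first argument.
// deleteModel is a hypothetical callback standing in for the real delete call.
func removeModels(names []string, deleteModel func(string) error) error {
	for _, name := range names {
		if err := deleteModel(name); err != nil {
			return fmt.Errorf("deleting %s: %w", name, err)
		}
		fmt.Println("deleted", name)
	}
	return nil
}

func main() {
	var deleted []string
	err := removeModels([]string{"model1", "model2", "modelN"}, func(name string) error {
		deleted = append(deleted, name) // record each delete instead of calling a server
		return nil
	})
	fmt.Println(err, len(deleted)) // prints: <nil> 3
}
```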
Jeffrey Morgan authored

Jeffrey Morgan authored
- 25 Aug, 2023 3 commits

Michael Yang authored
patch llama.cpp for 34B

Michael Yang authored

Michael Yang authored
- 24 Aug, 2023 2 commits

Michael Yang authored
add 34b model type

Michael Yang authored
- 22 Aug, 2023 11 commits

Michael Yang authored
Mxyng/cleanup

Michael Yang authored
use url.URL

Michael Yang authored

Michael Yang authored

Michael Yang authored

Michael Yang authored
build release mode

Michael Yang authored

Michael Yang authored
add version

Michael Yang authored

Jeffrey Morgan authored

Ryan Baker authored
- 21 Aug, 2023 1 commit

Jeffrey Morgan authored
- 18 Aug, 2023 1 commit

Michael Yang authored
retry on unauthorized chunk push