Commits · e9f7f3602961d2b0beaff27144ec89301c2173ca · OpenDAS / ollama

14 Jul, 2024 2 commits

- Support image input for OpenAI chat compatibility (#5208) · e9f7f360
  royjhan authored Jul 13, 2024

  * OpenAI v1 models

  * Refactor Writers

  * Add Test

  Co-Authored-By: Attila Kerekes

  * Credit Co-Author
  Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

  * Empty List Testing

  * Use Namespace for Ownedby

  * Update Test

  * Add back envconfig

  * v1/models docs

  * Use ModelName Parser

  * Test Names

  * Remove Docs

  * Clean Up

  * Test name
  Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

  * Add Middleware for Chat and List

  * Testing Cleanup

  * Test with Fatal

  * Add functionality to chat test

  * Support image input for OpenAI chat

  * Decoding

  * Fix message processing logic

  * openai vision test

  * type errors

  * clean up

  * redundant check

  * merge conflicts

  * merge conflicts

  * merge conflicts

  * flattening and smaller image

  * add test

  * support python and js SDKs and mandate prefixing

  * clean up

  ---------
  Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
  Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

  e9f7f360
- remove template (#5655) · 057d3186
  Patrick Devine authored Jul 13, 2024
  
  057d3186
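
As an illustration of the image-input commit (#5208) listed above, here is a minimal Go sketch of a chat request that carries an image. It assumes the OpenAI-style payload in which an image is sent as an `image_url` content part holding a base64 data URI. The helper `imagePart`, the model name `llava`, and the reading of "mandate prefixing" as requiring the `data:<mime>;base64,` prefix are assumptions for illustration, not details taken from the commit.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Types mirroring the OpenAI-style chat payload (only the fields used here).
type imageURL struct {
	URL string `json:"url"`
}

type contentPart struct {
	Type     string    `json:"type"`
	Text     string    `json:"text,omitempty"`
	ImageURL *imageURL `json:"image_url,omitempty"`
}

type message struct {
	Role    string        `json:"role"`
	Content []contentPart `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// imagePart is a hypothetical helper: it wraps raw image bytes in a
// base64 data URI and returns them as an image_url content part.
func imagePart(mime string, data []byte) contentPart {
	uri := "data:" + mime + ";base64," + base64.StdEncoding.EncodeToString(data)
	return contentPart{Type: "image_url", ImageURL: &imageURL{URL: uri}}
}

func main() {
	req := chatRequest{
		Model: "llava", // assumed model name, for illustration only
		Messages: []message{{
			Role: "user",
			Content: []contentPart{
				{Type: "text", Text: "What is in this image?"},
				imagePart("image/png", []byte("placeholder bytes, not a real image")),
			},
		}},
	}

	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The printed JSON is the body a client would send to an OpenAI-compatible chat endpoint; the endpoint path and any server-side validation are outside the scope of this sketch.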

13 Jul, 2024 5 commits
- server: prepend system message in chat handler · f7ee0123
  jmorganca authored Jul 13, 2024
  
  f7ee0123
- server: fix `context`, `load_duration` and `total_duration` fields (#5676) · 1ed0aa8f
  Jeffrey Morgan authored Jul 13, 2024
```
* server: fix `context`, `load_duration` and `total_duration` fields

* Update server/routes.go
```
  1ed0aa8f
- llm: looser checks for minimum memory (#5677) · ef98803d
  Jeffrey Morgan authored Jul 13, 2024
  
  ef98803d
- Add Kerlig AI, an app for macOS (#5675) · 02fea420
  Jarek authored Jul 13, 2024
  
  02fea420
- fix system prompt (#5662) · 22c5451f
  Michael Yang authored Jul 12, 2024
```
* fix system prompt

* execute template when hitting previous roles

* fix tests

---------
Co-authored-by: jmorganca <jmorganca@gmail.com>
```
  22c5451f
12 Jul, 2024 9 commits
- Revert "remove template from tests" · 23ebbaa4
  Patrick Devine authored Jul 12, 2024
```
This reverts commit 9ac0a7a5.
```
  23ebbaa4
- remove template from tests · 9ac0a7a5
  Patrick Devine authored Jul 12, 2024
  
  9ac0a7a5
- Merge pull request #5653 from ollama/mxyng/collect-system · e5c65a85
  Michael Yang authored Jul 12, 2024
```
template: preprocess message and collect system
```
  e5c65a85
- app: also clean up tempdir runners on install (#5646) · 33627331
  Jeffrey Morgan authored Jul 12, 2024
  
  33627331
- template: preprocess message and collect system · 36c87c43
  Michael Yang authored Jul 12, 2024
  
  36c87c43
- Clean up old files when installing on Windows (#5645) · 179737fe
  Jeffrey Morgan authored Jul 11, 2024
```
* app: always clean up install dir; force close applications

* remove wildcard

* revert `CloseApplications`

* whitespace

* update `LOCALAPPDATA` var
```
  179737fe
- Merge pull request #5639 from ollama/mxyng/unaggregated-system · 47353f5e
  Michael Yang authored Jul 11, 2024
  
  47353f5e
- fix: quant err message (#5616) · 10e76882
  Josh authored Jul 11, 2024
  
  10e76882
- rename aggregate to contents · 5056bb9c
  Michael Yang authored Jul 11, 2024
  
  5056bb9c
11 Jul, 2024 8 commits
- llm: avoid loading model if system memory is too small (#5637) · c4cf8ad5
  Jeffrey Morgan authored Jul 11, 2024
```
* llm: avoid loading model if system memory is too small

* update log

* Instrument swap free space

On linux and windows, expose how much swap space is available
so we can take that into consideration when scheduling models

* use `systemSwapFreeMemory` in check

---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
```
  c4cf8ad5
- revert embedded templates to use prompt/response · 57ec6901
  Michael Yang authored Jul 11, 2024
```
This reverts commit 19753c18.

for compat. messages will be added at a later date
```
  57ec6901
- do not automatically aggregate system messages · e64f9ebb
  Michael Yang authored Jul 11, 2024
  
  e64f9ebb
- sched: only error when over-allocating system memory (#5626) · 791650dd
  Jeffrey Morgan authored Jul 11, 2024
  
  791650dd
- llm: dont link cuda with compat libs (#5621) · efbf41ed
  Jeffrey Morgan authored Jul 10, 2024
  
  efbf41ed
- Merge pull request #5620 from ollama/mxyng/templates · cf155898
  Michael Yang authored Jul 10, 2024
```
update embedded templates
```
  cf155898
- update embedded templates · 19753c18
  Michael Yang authored Jul 10, 2024
  
  19753c18
- add system prompt to first legacy template · 41be2809
  Michael Yang authored Jul 10, 2024
  
  41be2809
10 Jul, 2024 10 commits
- Merge pull request #5612 from ollama/mxyng/mem · 37a570f9
  Michael Yang authored Jul 10, 2024
```
chatglm graph
```
  37a570f9
- chatglm graph · 5a739ff4
  Michael Yang authored Jul 10, 2024
  
  5a739ff4
- remove `GGML_CUDA_FORCE_MMQ=on` from build (#5588) · 4e262eb2
  Jeffrey Morgan authored Jul 10, 2024
  
  4e262eb2
- Merge pull request #5124 from dhiltgen/amd_windows · 4cfcbc32
  Daniel Hiltgen authored Jul 10, 2024
```
Wire up windows AMD driver reporting
```
  4cfcbc32
- Merge pull request #5555 from dhiltgen/msvc_deps · 79292ff3
  Daniel Hiltgen authored Jul 10, 2024
```
Bundle missing CRT libraries
```
  79292ff3
- Merge pull request #5580 from dhiltgen/cuda_overhead · 8ea50044
  Daniel Hiltgen authored Jul 10, 2024
```
Detect CUDA OS overhead
```
  8ea50044
- Merge pull request #5607 from dhiltgen/win_rocm_v6 · b50c8186
  Daniel Hiltgen authored Jul 10, 2024
```
Bump ROCm on windows to 6.1.2
```
  b50c8186
- Merge pull request #5605 from dhiltgen/merge_glitch · b99e750b
  Daniel Hiltgen authored Jul 10, 2024
```
Remove duplicate merge glitch
```
  b99e750b
- Bump ROCm on windows to 6.1.2 · 1f50356e
  Daniel Hiltgen authored Jul 10, 2024
```
This also adjusts our algorithm to favor our bundled ROCm.
I've confirmed VRAM reporting still doesn't work properly so we
can't yet enable concurrency by default.
```
  1f50356e
- Remove duplicate merge glitch · 22c81f62
  Daniel Hiltgen authored Jul 10, 2024
  
  22c81f62
09 Jul, 2024 6 commits
- Merge pull request #5503 from dhiltgen/dual_rocm · 2d1e3c32
  Daniel Hiltgen authored Jul 09, 2024
```
Workaround broken ROCm p2p copy
```
  2d1e3c32
- OpenAI v1/completions: allow stop token list (#5551) · 4918fae5
  royjhan authored Jul 09, 2024
```
* stop token parsing fix

* add stop test
```
  4918fae5
- separate request tests (#5578) · 0aff6787
  royjhan authored Jul 09, 2024
  
  0aff6787
- Detect CUDA OS Overhead · f6f759fc
  Daniel Hiltgen authored Jul 09, 2024
```
This adds logic to detect skew between the driver and
management library which can be attributed to OS overhead
and records that so we can adjust subsequent management
library free VRAM updates and avoid OOM scenarios.
```
  f6f759fc
- Merge pull request #5579 from dhiltgen/win_static_deps · 9544a57e
  Daniel Hiltgen authored Jul 09, 2024
```
Statically link c++ and thread lib on windows
```
  9544a57e
- Statically link c++ and thread lib · b51e3b63
  Daniel Hiltgen authored Jul 09, 2024
```
This makes sure we statically link the c++ and thread library on windows
to avoid unnecessary runtime dependencies on non-standard DLLs
```
  b51e3b63