Commits · ee02f548c82f8a81dd023fa0d15ccdf840389ef9 · OpenDAS / ollama

06 May, 2024 19 commits
- Update linux.md (#3847) · ee02f548
  Mohamed A. Fouad authored May 06, 2024
```
Add -e to viewing logs in order to show end of ollama logs
```
- Merge pull request #4188 from dhiltgen/use_our_lib · b08870af
  Daniel Hiltgen authored May 06, 2024
```
Use our bundled libraries (cuda) instead of the host library
```
- Update api.md (#3945) · 3ecae420
  Darinka authored May 07, 2024
```
* Update api.md

Changed the calculation of tps (token/s) in the documentation

* Update docs/api.md

---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
```
- Merge pull request #4090 from dhiltgen/rocm_paths · 4cbbf0e1
  Daniel Hiltgen authored May 06, 2024
```
Support Fedora's standard ROCm location
```
- Use our libraries first · 380378cc
  Daniel Hiltgen authored May 05, 2024
```
Trying to live off the land for CUDA libraries was not the right strategy. We need to use the version we compiled against to ensure things work properly.
```
- Merge pull request #4208 from dhiltgen/fix_sched_test · 0963c650
  Daniel Hiltgen authored May 06, 2024
```
Fix stale test logic
```
- Fix `no slots available` error with concurrent requests (#4160) · ed740a25
  Jeffrey Morgan authored May 06, 2024
- Skip scheduling cancelled requests, always reload unloaded runners (#4189) · c9f98622
  Jeffrey Morgan authored May 06, 2024
- Fix stale test logic · 0a954e50
  Daniel Hiltgen authored May 06, 2024
```
The model processing was recently changed to be deferred but
this test scenario hadn't been adjusted for that change in behavior.
```
- docs: pbcopy on mac (#3129) · aa93423f
  Adrien Brault authored May 06, 2024
- Add BrainSoup to compatible clients list (#3473) · 01c93862
  Nurgo authored May 06, 2024
- Merge pull request #4135 from dhiltgen/no_physx · af9eb36f
  Daniel Hiltgen authored May 06, 2024
```
Skip PhysX cudart library
```
- Merge pull request #4067 from dhiltgen/cudart · 06093fd3
  Daniel Hiltgen authored May 06, 2024
```
Add CUDA Driver API for GPU discovery
```
- Update README.md with StreamDeploy (#3621) · 86b7fcac
  Tony Loehr authored May 06, 2024
```
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
```
- chore: delete `HEAD` (#4194) · fb8ddc56
  Hyden Liu authored May 07, 2024
- 👌 IMPROVE: add portkey library for production tools (#4119) · 242efe66
  Saif authored May 06, 2024
- Fix llava models not working after first request (#4164) · 1b0e6c9c
  Jeffrey Morgan authored May 05, 2024
```
* fix llava models not working after first request

* individual requests only for llava models
```
- unload in critical section (#4187) · dfa2f32c
  Jeffrey Morgan authored May 05, 2024
- Merge pull request #4154 from dhiltgen/central_config · 840424a2
  Daniel Hiltgen authored May 05, 2024
```
Centralize server config handling
```
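Commit 3ecae420 above changed how api.md describes the tokens-per-second (token/s) calculation. This log does not show the new wording, so the following is only a minimal sketch of the usual calculation. It assumes the `eval_count` and `eval_duration` response fields described in the Ollama API documentation, with `eval_duration` reported in nanoseconds. The function name `tokens_per_second` is hypothetical.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Return generation speed in tokens per second.

    Assumption: eval_count is the number of generated tokens and
    eval_duration_ns is the generation time in nanoseconds.
    """
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    # Scale by 1e9 to convert nanoseconds to seconds.
    return eval_count * 1e9 / eval_duration_ns


if __name__ == "__main__":
    # Example values chosen for illustration only.
    print(f"{tokens_per_second(100, 2_000_000_000):.2f} token/s")
```

Multiplying before dividing keeps the intermediate value large, which avoids a small rounding error for round inputs.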
05 May, 2024 9 commits
- Centralize server config handling · f56aa200
  Daniel Hiltgen authored May 04, 2024
```
This moves all the env var reading into one central module
and logs the loaded config once at startup, which should
help in troubleshooting user server logs.
```
- chore: format go code (#4149) · 6707768e
  alwqx authored May 06, 2024
- update libraries for langchain_community + llama3 changed from llama2 (#4174) · c78bb76a
  Lord Basil - Automate EVERYTHING authored May 05, 2024
- allocate a large enough kv cache for all parallel requests (#4162) · 942c9792
  Jeffrey Morgan authored May 05, 2024
- Update README.md (#4111) · 06164911
  Bernardo de Oliveira Bruning authored May 05, 2024
```
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
```
- validate the format of the digest when getting the model path (#4175) · 2a21363b
  Patrick Devine authored May 05, 2024
- Merge pull request #4144 from dhiltgen/max_queue · 02686991
  Daniel Hiltgen authored May 05, 2024
```
Make maximum pending request configurable
```
- Add integration test to push max queue limits · 45d61aaa
  Daniel Hiltgen authored May 05, 2024
- Make maximum pending request configurable · 20f6c065
  Daniel Hiltgen authored May 03, 2024
```
This also bumps up the default to be 50 queued requests
instead of 10.
```
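Commits f56aa200 and 20f6c065 above describe reading all environment variables in one central module, logging the loaded config once at startup, and making the pending-request limit configurable with a default of 50. The sketch below illustrates that pattern only. It is not the project's implementation, which is written in Go. The variable names `OLLAMA_MAX_QUEUE` and `OLLAMA_DEBUG`, the `ServerConfig` type, and the `load_config` function are assumptions that do not appear in this log.

```python
import logging
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_MAX_QUEUE = 50  # default stated in commit 20f6c065


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical container for all server settings."""
    max_queue: int = DEFAULT_MAX_QUEUE
    debug: bool = False


def load_config(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Read every setting in one place and log the result once."""
    env = os.environ if env is None else env

    # Assumed variable name; fall back to the default on bad input.
    max_queue = DEFAULT_MAX_QUEUE
    raw = env.get("OLLAMA_MAX_QUEUE", "").strip()
    if raw:
        try:
            parsed = int(raw)
        except ValueError:
            parsed = 0
        if parsed > 0:
            max_queue = parsed
        else:
            logging.warning("invalid max queue %r, using %d", raw, max_queue)

    # Assumed variable name; any value other than empty/0/false enables debug.
    debug = env.get("OLLAMA_DEBUG", "").strip().lower() not in ("", "0", "false")

    config = ServerConfig(max_queue=max_queue, debug=debug)
    logging.info("server config: %s", config)  # logged once at startup
    return config


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    load_config()
```

Passing the environment as a mapping keeps the loader easy to test, since callers can supply a plain dictionary instead of modifying the process environment.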
04 May, 2024 4 commits
- Merge pull request #4141 from dhiltgen/win_docs · 371f5e52
  Daniel Hiltgen authored May 04, 2024
```
Explain the 2 different windows download options
```
- Explain the 2 different windows download options · e006480e
  Daniel Hiltgen authored May 03, 2024
- Merge pull request #4143 from ollama/mxyng/final-response · aed54587
  Michael Yang authored May 03, 2024
```
omit prompt and generate settings from final response
```
- omit prompt and generate settings from final response · 44869c59
  Michael Yang authored May 03, 2024
03 May, 2024 8 commits
- Merge pull request #4145 from dhiltgen/fix_lint · 52663284
  Daniel Hiltgen authored May 03, 2024
```
Fix lint warnings
```
- Fix lint warnings · 42fa9d7f
  Daniel Hiltgen authored May 03, 2024
- Merge pull request #4059 from ollama/mxyng/parser-2 · b7a87a22
  Michael Yang authored May 03, 2024
```
rename parser to model/file
```
- Update 'llama2' -> 'llama3' in most places (#4116) · e8aaea03
  Dr Nic Williams authored May 04, 2024
```
* Update 'llama2' -> 'llama3' in most places

---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
```
- Skip PhysX cudart library · b1ad3a43
  Daniel Hiltgen authored May 03, 2024
```
For some reason this library gives incorrect GPU information, so skip it
```
- Merge pull request #4129 from dhiltgen/unit_tests · 267e25a7
  Daniel Hiltgen authored May 03, 2024
```
Soften timeouts on sched unit tests
```
- Soften timeouts on sched unit tests · 9a32c514
  Daniel Hiltgen authored May 03, 2024
```
This gives us more headroom on the scheduler tests to tamp
down some flakes.
```
- Merge pull request #3892 from ollama/mxyng/parser · e9ae607e
  Michael Yang authored May 02, 2024
```
refactor modelfile parser
```