Commits · 4d4f75a8a8a349e73dfd85ec0737ad42f5171eb0 · OpenDAS / ollama

07 May, 2024 7 commits
- Revert "fix golangci workflow missing gofmt and goimports (#4190)" · 4d4f75a8
  Michael Yang authored May 07, 2024
```
This reverts commit 04f971c8.
```
- Correct the kubernetes terminology (#3843) · 3f71ba40
  Mélony QIN authored May 07, 2024
```
* add details on kubernetes deployment and separate the testing process

* Update examples/kubernetes/README.md

Thanks for suggesting this change. I agree with you, and let's make this project better together!
Co-authored-by: JonZeolla <Zeolla@gmail.com>

---------
Co-authored-by: QIN Mélony <MQN1@dsone.3ds.com>
Co-authored-by: JonZeolla <Zeolla@gmail.com>
```
- Update README.md to include ollama-r library (#4012) · 88a67127
  Hause Lin authored May 07, 2024
```
* Update README.md

Add Ollama for R - ollama-r library

* Update README.md

---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
```
- Update .gitattributes · f7dc7dcc
  Jeffrey Morgan authored May 07, 2024
  
- fix golangci workflow missing gofmt and goimports (#4190) · 04f971c8
  alwqx authored May 08, 2024
  
- Merge pull request #4215 from ollama/mxyng/mem · 70edb9bc
  Michael Yang authored May 07, 2024
```
llm: add minimum based on layer size
```
- llm: add minimum based on layer size · 4736391b
  Michael Yang authored May 06, 2024
  
06 May, 2024 23 commits
- note on naming restrictions (#2625) · 7c533041
  CrispStrobe authored May 07, 2024
```
* note on naming restrictions

Otherwise the push fails with a cryptic error:
retrieving manifest
Error: file does not exist
==> maybe change that in the code too

* Update docs/import.md

---------
Co-authored-by: C-4-5-3 <154636388+C-4-5-3@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
```
- close server on receiving signal (#4213) · 39d9d22c
  Jeffrey Morgan authored May 06, 2024
  
- Add MarshalJSON to Duration (#3284) · af47413d
  Jackie Li authored May 06, 2024
```
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
```
- Windows automatically recognizes username (#3214) · d091fe3c
  Jeffrey Chen authored May 07, 2024
  
- Update linux.md (#3847) · ee02f548
  Mohamed A. Fouad authored May 06, 2024
```
Add -e to the log-viewing command to show the end of the ollama logs
```
- Merge pull request #4188 from dhiltgen/use_our_lib · b08870af
  Daniel Hiltgen authored May 06, 2024
```
Use our bundled libraries (cuda) instead of the host library
```
- Update api.md (#3945) · 3ecae420
  Darinka authored May 07, 2024
```
* Update api.md

Changed the calculation of tps (token/s) in the documentation

* Update docs/api.md

---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
```
- Merge pull request #4090 from dhiltgen/rocm_paths · 4cbbf0e1
  Daniel Hiltgen authored May 06, 2024
```
Support Fedora's standard ROCm location
```
- Use our libraries first · 380378cc
  Daniel Hiltgen authored May 05, 2024
```
Trying to live off the land for CUDA libraries was not the right strategy. We need to use the version we compiled against to ensure things work properly.
```
- Merge pull request #4208 from dhiltgen/fix_sched_test · 0963c650
  Daniel Hiltgen authored May 06, 2024
```
Fix stale test logic
```
- Fix `no slots available` error with concurrent requests (#4160) · ed740a25
  Jeffrey Morgan authored May 06, 2024
  
- Skip scheduling cancelled requests, always reload unloaded runners (#4189) · c9f98622
  Jeffrey Morgan authored May 06, 2024
  
- Fix stale test logic · 0a954e50
  Daniel Hiltgen authored May 06, 2024
```
The model processing was recently changed to be deferred but
this test scenario hadn't been adjusted for that change in behavior.
```
- docs: pbcopy on mac (#3129) · aa93423f
  Adrien Brault authored May 06, 2024
  
- Add BrainSoup to compatible clients list (#3473) · 01c93862
  Nurgo authored May 06, 2024
  
- Merge pull request #4135 from dhiltgen/no_physx · af9eb36f
  Daniel Hiltgen authored May 06, 2024
```
Skip PhysX cudart library
```
- Merge pull request #4067 from dhiltgen/cudart · 06093fd3
  Daniel Hiltgen authored May 06, 2024
```
Add CUDA Driver API for GPU discovery
```
- Update README.md with StreamDeploy (#3621) · 86b7fcac
  Tony Loehr authored May 06, 2024
```
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
```
- chore: delete `HEAD` (#4194) · fb8ddc56
  Hyden Liu authored May 07, 2024
  
- 👌 IMPROVE: add portkey library for production tools (#4119) · 242efe66
  Saif authored May 06, 2024
  
- Fix llava models not working after first request (#4164) · 1b0e6c9c
  Jeffrey Morgan authored May 05, 2024
```
* fix llava models not working after first request

* individual requests only for llava models
```
- unload in critical section (#4187) · dfa2f32c
  Jeffrey Morgan authored May 05, 2024
  
- Merge pull request #4154 from dhiltgen/central_config · 840424a2
  Daniel Hiltgen authored May 05, 2024
```
Centralize server config handling
```
05 May, 2024 9 commits
- Centralize server config handling · f56aa200
  Daniel Hiltgen authored May 04, 2024
```
This moves all the env var reading into one central module
and logs the loaded config once at startup, which should
help in troubleshooting user server logs.
```
- chore: format go code (#4149) · 6707768e
  alwqx authored May 06, 2024
  
- update libraries for langchain_community + llama3 changed from llama2 (#4174) · c78bb76a
  Lord Basil - Automate EVERYTHING authored May 05, 2024
  
- allocate a large enough kv cache for all parallel requests (#4162) · 942c9792
  Jeffrey Morgan authored May 05, 2024
  
- Update README.md (#4111) · 06164911
  Bernardo de Oliveira Bruning authored May 05, 2024
```
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
```
- validate the format of the digest when getting the model path (#4175) · 2a21363b
  Patrick Devine authored May 05, 2024
  
- Merge pull request #4144 from dhiltgen/max_queue · 02686991
  Daniel Hiltgen authored May 05, 2024
```
Make maximum pending request configurable
```
- Add integration test to push max queue limits · 45d61aaa
  Daniel Hiltgen authored May 05, 2024
  
- Make maximum pending request configurable · 20f6c065
  Daniel Hiltgen authored May 03, 2024
```
This also bumps up the default to be 50 queued requests
instead of 10.
```
04 May, 2024 1 commit
- Merge pull request #4141 from dhiltgen/win_docs · 371f5e52
  Daniel Hiltgen authored May 04, 2024
```
Explain the 2 different Windows download options
```