- 07 May, 2024 7 commits
-
-
Michael Yang authored
This reverts commit 04f971c8.
-
Mélony QIN authored
* add details on kubernetes deployment and separate the testing process
* Update examples/kubernetes/README.md: thanks for suggesting this change, I agree with you and let's make this project better together! Co-authored-by: JonZeolla <Zeolla@gmail.com>
---------
Co-authored-by: QIN Mélony <MQN1@dsone.3ds.com>
Co-authored-by: JonZeolla <Zeolla@gmail.com>
-
Hause Lin authored
* Update README.md: Add Ollama for R - ollama-r library
* Update README.md
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Jeffrey Morgan authored
-
alwqx authored
-
Michael Yang authored
llm: add minimum based on layer size
-
Michael Yang authored
-
- 06 May, 2024 23 commits
-
-
CrispStrobe authored
* Add a note on naming restrictions; otherwise push fails with the cryptic error "retrieving manifest Error: file does not exist" (maybe change that in code too)
* Update docs/import.md
---------
Co-authored-by: C-4-5-3 <154636388+C-4-5-3@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Jeffrey Morgan authored
-
Jackie Li authored
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
-
Jeffrey Chen authored
-
Mohamed A. Fouad authored
Add -e when viewing logs to show the end of the ollama logs
-
Daniel Hiltgen authored
Use our bundled libraries (cuda) instead of the host library
-
Darinka authored
* Update api.md: Changed the calculation of tps (token/s) in the documentation
* Update docs/api.md
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Daniel Hiltgen authored
Support Fedora's standard ROCm location
-
Daniel Hiltgen authored
Trying to live off the land for cuda libraries was not the right strategy. We need to use the version we compiled against to ensure things work properly.
-
Daniel Hiltgen authored
Fix stale test logic
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
The model processing was recently changed to be deferred but this test scenario hadn't been adjusted for that change in behavior.
-
Adrien Brault authored
-
Nurgo authored
-
Daniel Hiltgen authored
Skip PhysX cudart library
-
Daniel Hiltgen authored
Add CUDA Driver API for GPU discovery
-
Tony Loehr authored
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
Hyden Liu authored
-
Saif authored
-
Jeffrey Morgan authored
* fix llava models not working after first request
* individual requests only for llava models
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
Centralize server config handling
-
- 05 May, 2024 9 commits
-
-
Daniel Hiltgen authored
This moves all the env var reading into one central module and logs the loaded config once at startup, which should help when troubleshooting user server logs.
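As an illustration of the pattern this commit message describes (not the project's actual code), here is a minimal Python sketch of a central config module that reads the environment in one place and logs the result once. The names `ServerConfig`, `load_config`, `EXAMPLE_DEBUG` and `EXAMPLE_MAX_QUEUE` are hypothetical and chosen only for this example.

```python
import logging
import os
from dataclasses import asdict, dataclass

logger = logging.getLogger("server.config")


@dataclass(frozen=True)
class ServerConfig:
    """All settings live in one immutable object (hypothetical fields)."""
    debug: bool
    max_queue: int


def load_config(environ=None) -> ServerConfig:
    """Read every setting from the environment in one place and log it once."""
    env = os.environ if environ is None else environ
    cfg = ServerConfig(
        # Hypothetical variable names, used only for illustration.
        debug=env.get("EXAMPLE_DEBUG", "").lower() in ("1", "true"),
        max_queue=int(env.get("EXAMPLE_MAX_QUEUE", "50")),
    )
    # A single log line at startup shows exactly which values were loaded.
    logger.info("loaded server config: %s", asdict(cfg))
    return cfg
```

Calling `load_config()` once at startup and passing the resulting object around means the rest of the code never reads environment variables directly, so the startup log line reflects everything the server was configured with.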
-
alwqx authored
-
Lord Basil - Automate EVERYTHING authored
-
Jeffrey Morgan authored
-
Bernardo de Oliveira Bruning authored
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
-
Patrick Devine authored
-
Daniel Hiltgen authored
Make maximum pending request configurable
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
This also bumps up the default to be 50 queued requests instead of 10.
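The idea of a configurable cap on pending requests can be sketched as a bounded queue. This is a minimal Python illustration, not the project's implementation: the class `PendingRequests` and its methods are hypothetical, and rejecting a request when the queue is full is an assumption, since the commit message does not say how the server behaves at the limit. Only the default of 50 comes from the message above.

```python
import queue

# Default taken from the commit message (previously 10).
DEFAULT_MAX_PENDING = 50


class PendingRequests:
    """Bounded holding area for requests waiting to be served (hypothetical)."""

    def __init__(self, max_pending: int = DEFAULT_MAX_PENDING):
        self._queue = queue.Queue(maxsize=max_pending)

    def submit(self, request) -> bool:
        """Enqueue a request; return False instead of blocking when full."""
        try:
            self._queue.put_nowait(request)
            return True
        except queue.Full:
            # Assumption: the caller turns this into a "server busy" response.
            return False

    def __len__(self) -> int:
        return self._queue.qsize()
```

For example, `PendingRequests(max_pending=2)` accepts two submissions and returns `False` for a third until one of the queued requests has been taken off the queue.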
-
- 04 May, 2024 1 commit
-
-
Daniel Hiltgen authored
Explain the 2 different Windows download options
-