- 20 May, 2024 2 commits
-
-
Michael Yang authored
particularly useful for zipfiles and f16s
-
Patrick Devine authored
-
- 16 May, 2024 1 commit
-
-
Daniel Hiltgen authored
-
- 15 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 14 May, 2024 6 commits
-
-
Patrick Devine authored
-
Michael Yang authored
-
Ryo Machida authored
* Fixed the API endpoint /api/tags to return {models: []} instead of {models: null} when the model list is empty.
* Update server/routes.go
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
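The null-versus-empty behavior described in this commit follows from how Go's `encoding/json` encodes a nil slice. The sketch below illustrates the general pattern only; `listResponse` and `encodeModels` are hypothetical names, not the types or functions used in server/routes.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listResponse is a hypothetical stand-in for the /api/tags response body.
type listResponse struct {
	Models []string `json:"models"`
}

// encodeModels marshals the list, replacing a nil slice with an empty one so
// the JSON output contains [] rather than null.
func encodeModels(models []string) string {
	if models == nil {
		models = []string{}
	}
	b, err := json.Marshal(listResponse{Models: models})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	var none []string // nil slice, as when no models are found

	raw, _ := json.Marshal(listResponse{Models: none})
	fmt.Println(string(raw))        // {"models":null}
	fmt.Println(encodeModels(none)) // {"models":[]}
}
```

Clients that iterate over `models` without a null check are the ones that benefit from the empty-array form.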
-
Daniel Hiltgen authored
The APIs we query are optimistic on free space, and Windows pages VRAM, so we don't have to wait to see reported usage recover on unload.
-
Patrick Devine authored
-
Patrick Devine authored
-
- 12 May, 2024 2 commits
-
-
Michael Yang authored
- 10 May, 2024 5 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Make sure the first GPU has the most free space
-
Jeffrey Morgan authored
* rename `--quantization` to `--quantize`
* backwards
* Update api/types.go Co-authored-by: Michael Yang <mxyng@pm.me>
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
-
Jeffrey Morgan authored
* don't clamp ctx size in `PredictServerFit`
* minimum 4 context
* remove context warning
-
Michael Yang authored
-
- 09 May, 2024 7 commits
-
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
Ensure the runners are terminated
-
Daniel Hiltgen authored
The GPU drivers take a while to update their free memory reporting, so we need to wait until the values converge with what we're expecting before proceeding to start another runner in order to get an accurate picture.
-
Daniel Hiltgen authored
This cleans up the logging for GPU discovery a bit, and can serve as a foundation to report GPU information in a future UX.
-
Bruce MacDonald authored
-
Michael Yang authored
-
Jeffrey Morgan authored
-
- 08 May, 2024 5 commits
-
-
Bruce MacDonald authored
* Add preflight OPTIONS handling and update CORS config
  - Implement early return with HTTP 204 (No Content) for OPTIONS requests in allowedHostsMiddleware to optimize preflight handling.
  - Extend CORS configuration to explicitly allow 'Authorization' headers and 'OPTIONS' method when OLLAMA_ORIGINS environment variable is set.
* allow auth, content-type, and user-agent headers
* Update routes.go
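The early return for preflight requests can be sketched with the standard library alone. This is a minimal illustration assuming plain `net/http`; it is not the project's `allowedHostsMiddleware`, and `preflight` and `statusFor` are hypothetical names. Origin checks are omitted for brevity.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// preflight is a hypothetical middleware: it answers OPTIONS requests with
// 204 No Content and passes every other request through to next.
func preflight(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodOptions {
			w.Header().Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
			w.Header().Set("Access-Control-Allow-Headers", "Authorization, Content-Type, User-Agent")
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request through the middleware and returns the status code.
func statusFor(method string) int {
	h := preflight(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, "/api/tags", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(http.MethodOptions), statusFor(http.MethodGet)) // 204 200
}
```

Returning before the wrapped handler runs means a preflight never reaches route logic, which is the optimization the commit describes.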
-
Michael Yang authored
-
Bruce MacDonald authored
-
Michael Yang authored
-
Bruce MacDonald authored
-
- 07 May, 2024 1 commit
-
-
Michael Yang authored
-
- 06 May, 2024 10 commits
-
-
Jeffrey Morgan authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
-
Jeffrey Morgan authored
-