- 23 Apr, 2024 1 commit
Daniel Hiltgen authored
This change adds support for multiple concurrent requests and for loading multiple models by spawning multiple runners. The defaults are currently 1 concurrent request per model and only 1 loaded model at a time, but these can be adjusted by setting OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
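
As a rough illustration of how limits like these might be read, the sketch below parses the two environment variables named in the commit message and falls back to the stated default of 1. It is not taken from the Ollama source: the helper `read_positive_int`, the `scheduler_limits` function, and the choice to fall back to the default on empty, non-numeric, or non-positive values are all assumptions made for this example.

```python
import os


def read_positive_int(name, default=1, env=None):
    """Return a positive integer from an environment mapping (hypothetical helper).

    Falls back to `default` when the variable is unset, empty, not an
    integer, or not positive. That fallback policy is an assumption of
    this sketch, not documented behaviour.
    """
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def scheduler_limits(env=None):
    """Collect both limits named in the commit message (hypothetical helper)."""
    return {
        # Concurrent requests per loaded model; default 1 per the commit message.
        "num_parallel": read_positive_int("OLLAMA_NUM_PARALLEL", 1, env),
        # Models kept loaded at the same time; default 1 per the commit message.
        "max_loaded_models": read_positive_int("OLLAMA_MAX_LOADED_MODELS", 1, env),
    }


if __name__ == "__main__":
    print(scheduler_limits())
```

Passing a plain dictionary as `env` keeps the helper easy to exercise without touching the real process environment.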
-
- 10 Apr, 2024 1 commit
Michael Yang authored
-
- 01 Apr, 2024 1 commit
Michael Yang authored
count each layer independently when deciding gpu offloading
-
- 28 Nov, 2023 1 commit
Michael Yang authored
-
- 20 Nov, 2023 1 commit
Jeffrey Morgan authored
-
- 17 Nov, 2023 1 commit
Michael Yang authored
-
- 14 Nov, 2023 1 commit
Michael Yang authored
-
- 13 Oct, 2023 1 commit
Michael Yang authored
-
- 11 Oct, 2023 1 commit
Michael Yang authored
-