- 25 Nov, 2024 2 commits
-
Bruce MacDonald authored
After a user pushes their model, it is not clear what to do next. Add a link to the output of `ollama push` that tells the user where their model can now be found.
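A minimal sketch of the idea, assuming the link is printed once the push completes; the URL format and helper name are illustrative, not the exact ollama implementation:

```go
package main

import (
	"fmt"
	"os"
)

// printPushSuccess tells the user where their pushed model now lives.
// The URL format here is an assumption for illustration.
func printPushSuccess(modelName string) {
	fmt.Fprintf(os.Stderr, "\nYou can find your model at:\n\n\thttps://ollama.com/%s\n", modelName)
}

func main() {
	printPushSuccess("library/my-model")
}
```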
-
Simon Schampijer authored
- Better formatting of the input prompt
- Use `invoke` instead of `predict`
-
- 24 Nov, 2024 4 commits
-
reid41 authored
-
frob authored
-
Adarsh Mishra authored
-
Patcher authored
-
- 23 Nov, 2024 5 commits
-
Meng Zhuo authored
-
josc146 authored
-
oza6ut0ne authored
-
Rodrigo Ribeiro Gomes authored
-
Jesse Gross authored
If there are no available slots for new sequences, a request will not be added to the processing queue but will continue on to wait for a response that never comes. Besides never giving a response to the request, this prevents the model from being unloaded due to the outstanding request.

To prevent this, there are semaphores that stop more requests from being processed than there are slots: one in the Ollama server and one in the runner.
- The Ollama server semaphore works, but it is not designed to protect the runner's internal data structures, and the runner can return a final response before clearing those structures.
- The runner's internal semaphore has similar behavior: it can release the semaphore when it issues a response. This is wrong; it should only release the semaphore after it has cleared the data structure.

In addition, we should return an error if a slot is not found, rather than deadlocking in the event we ever get to this spot.

Fixes #7779
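A minimal Go sketch of the corrected ordering, assuming a weighted semaphore guarding a fixed slot table; all names here are illustrative, not the actual runner code:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"

	"golang.org/x/sync/semaphore"
)

const numSlots = 2

// A toy slot table; the real runner tracks much more per slot.
var (
	mu       sync.Mutex
	slots    [numSlots]bool
	seqSlots = semaphore.NewWeighted(numSlots)
)

func findFreeSlot() (int, bool) {
	mu.Lock()
	defer mu.Unlock()
	for i, busy := range slots {
		if !busy {
			slots[i] = true
			return i, true
		}
	}
	return 0, false
}

func clearSlot(i int) {
	mu.Lock()
	defer mu.Unlock()
	slots[i] = false
}

func handleRequest(ctx context.Context, prompt string) error {
	// The semaphore bounds in-flight requests to the number of slots.
	if err := seqSlots.Acquire(ctx, 1); err != nil {
		return err // typically a client that gave up
	}

	slot, ok := findFreeSlot()
	if !ok {
		seqSlots.Release(1)
		// Error out rather than deadlock if no slot is found.
		return errors.New("no available sequence slot")
	}

	fmt.Printf("processing %q in slot %d\n", prompt, slot)

	// The ordering fix: clear the slot's data structures *before*
	// releasing the semaphore, so the next admitted request never
	// races a half-cleared slot.
	clearSlot(slot)
	seqSlots.Release(1)
	return nil
}

func main() {
	_ = handleRequest(context.Background(), "hello")
}
```

The essential point is that `seqSlots.Release(1)` happens only after `clearSlot`, so a newly admitted request can never observe a slot that is still being torn down.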
-
- 22 Nov, 2024 8 commits
-
Bruce MacDonald authored
In the past, the ollama.com server would return a JWT that contained information about the user being authenticated. This was used to return different error messages to the user. This is no longer possible, since the token used to authenticate no longer contains information about the user. Remove this code, which no longer works. Follow-up changes will improve the error messages returned here, but it is good to clean up first.
-
Daniel Hiltgen authored
This had fallen out of sync with the envconfig behavior, where the default max queue size is not zero.
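A sketch of the pattern being synced against, assuming an override via `OLLAMA_MAX_QUEUE`; the default of 512 here is an assumption for illustration, not necessarily ollama's actual value:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// maxQueue mirrors the envconfig pattern described above: a nonzero
// default that OLLAMA_MAX_QUEUE can override.
func maxQueue() int {
	if s := os.Getenv("OLLAMA_MAX_QUEUE"); s != "" {
		if n, err := strconv.Atoi(s); err == nil && n > 0 {
			return n
		}
	}
	return 512 // assumed default for illustration
}

func main() {
	fmt.Println("max queue:", maxQueue())
}
```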
-
Daniel Hiltgen authored
Users get confused by "Failed to acquire semaphore" error="context canceled" messages in the logs, which are actually caused by clients giving up. While there could be a legitimate hang bug in the system, sometimes this is just short client timeouts on an overloaded system, so this should help users understand what is going on.
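A hedged sketch of the distinction, using `golang.org/x/sync/semaphore`; the log wording is illustrative, not ollama's exact message:

```go
package main

import (
	"context"
	"errors"
	"log/slog"
	"time"

	"golang.org/x/sync/semaphore"
)

var sem = semaphore.NewWeighted(1)

// acquire distinguishes a real failure from a client that simply hung up
// while waiting for a slot to open.
func acquire(ctx context.Context) error {
	if err := sem.Acquire(ctx, 1); err != nil {
		if errors.Is(err, context.Canceled) {
			slog.Info("aborting request: client closed the connection while waiting for a slot")
		} else {
			slog.Error("failed to acquire semaphore", "error", err)
		}
		return err
	}
	return nil
}

func main() {
	_ = sem.Acquire(context.Background(), 1) // simulate a fully loaded system

	ctx, cancel := context.WithCancel(context.Background())
	go func() { time.Sleep(10 * time.Millisecond); cancel() }() // client gives up
	_ = acquire(ctx) // logged as a client disconnect, not a hard failure
}
```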
-
Daniel Hiltgen authored
This avoids emitting the progress indicators to stderr and the interactive prompts to the output file or pipe. Running `ollama run model > out.txt` now exits immediately, and `echo hello | ollama run model > out.txt` produces zero stderr output and a typical response in out.txt.
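A rough sketch of the TTY checks involved, assuming `golang.org/x/term` for terminal detection; the real cmd logic differs in detail:

```go
package main

import (
	"fmt"
	"os"

	"golang.org/x/term"
)

func main() {
	stdinTTY := term.IsTerminal(int(os.Stdin.Fd()))
	stdoutTTY := term.IsTerminal(int(os.Stdout.Fd()))

	// Suppress progress indicators entirely when stdout is redirected,
	// so "echo hello | ollama run model > out.txt" stays silent on stderr.
	if stdoutTTY {
		fmt.Fprintln(os.Stderr, "loading model...")
	}

	// Only show the interactive prompt when a human is attached.
	if stdinTTY && stdoutTTY {
		fmt.Print(">>> ")
	}

	fmt.Println("response text goes to stdout")
}
```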
-
Leon Sander authored
-
Mikel Olasagasti Uranga authored
Update `uuid.New().String()` to `uuid.NewString()`.
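For context, assuming this refers to the `github.com/google/uuid` package, the change is a one-call simplification:

```go
package main

import (
	"fmt"

	"github.com/google/uuid"
)

func main() {
	// Before: construct a UUID, then convert it to a string.
	before := uuid.New().String()

	// After: uuid.NewString() does both in one call.
	after := uuid.NewString()

	fmt.Println(before, after)
}
```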
-
Dustin authored
-
Edwin.JH.Lee authored
-
- 21 Nov, 2024 21 commits
-
Elias authored
OrionChat is a free web-based chat interface that simplifies interactions with multiple AI model providers, giving a unified platform for chatting with and exploring large language models (LLMs).
-
湛露先生 authored
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
-
Jeffrey Morgan authored
-
R0CKSTAR authored
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
-
Paul Robello authored
-
毛巳煜 authored
-
xuyangbocn authored
-
emrgnt-cmplxty authored
-
Cyril Blaecke authored
-
Christian Tzolov authored
-
Philippe Charrière authored
Parakeet is a Go SDK for Ollama.

Co-authored-by: Parth Sareen <parth.sareen@ollama.com>
-
Marcin Szczygliński authored
-
Michael authored
-
Jakub Burkiewicz authored
-
Dezoito authored
-
Franco Lombardo authored
-
Aarushi authored
-
Kevin Brake authored
-
chyok authored
-
Nico authored
-
Laurent Eschenauer authored
-