    runner.go: Enforce NUM_PARALLEL directly in the runner · 17b386a8
    Jesse Gross authored
    NUM_PARALLEL is currently enforced by the Ollama server process - it
    will only issue requests to the runner if the maximum number of
    concurrent requests has not been exceeded. Although this should
    be sufficient, it is good for the runner to protect its own data
    structures. Currently, if too many requests get through to the
    runner, they will just get stuck and never return.
    
    This may help with reports of Ollama hanging, though it is unclear
    how it would actually occur.
    
    Bug #7573
runner.go 23.9 KB