llm/server.go · 172b5924af1f08277a7d6133d9bbfd4bd7438f01 · OpenDAS / ollama

llm: Avoid integer underflow on llama engine memory layout · 172b5924

Jesse Gross authored Dec 19, 2025

On the llama engine, when we compute the memory layout, we reserve
a buffer to leave some headroom for inaccurate estimates. This
buffer is subtracted from GPU free memory, and on GPUs with limited
memory the subtraction may underflow.

Fixes #13494
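
The failure mode described above can be illustrated with a minimal sketch. It assumes free memory is tracked as an unsigned 64-bit byte count; the names `reserveBytes` and `usableMemory` and the 512 MiB figure are hypothetical and are not taken from server.go. The actual patch may guard the subtraction differently.

```go
package main

import "fmt"

// reserveBytes is a hypothetical safety margin used only for this
// illustration; it is not the value used in server.go.
const reserveBytes uint64 = 512 << 20 // 512 MiB

// usableMemory returns free minus reserve, clamped at zero so that a
// reserve larger than the free memory cannot wrap around.
func usableMemory(free, reserve uint64) uint64 {
	if reserve >= free {
		return 0
	}
	return free - reserve
}

func main() {
	free := uint64(256 << 20) // a GPU reporting only 256 MiB free

	// Unguarded unsigned subtraction wraps to a very large value,
	// which would make the GPU appear to have abundant memory.
	fmt.Println(free - reserveBytes)

	// The guarded version reports that nothing is usable.
	fmt.Println(usableMemory(free, reserveBytes)) // prints 0
}
```

Clamping at zero is one way to keep the layout logic from treating a nearly full GPU as if it had almost unlimited memory.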


server.go 55.1 KB
