Commits · efaee8c2d658f7f40a2f44b411ebfb25fcc198b0 · OpenDAS / ollama

30 Sep, 2025 1 commit

ggml: Backport scale kernel fixes · efaee8c2

Jesse Gross authored Sep 23, 2025

The GGML scale kernel uses signed 32-bit ints to represent
the number of elements in the tensor. For large images, the
element count in mistral-small3.2 overflows this type, and the
resulting negative arguments trigger CUDA errors.
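
As a rough illustration of the overflow (not the actual kernel code),
the Python sketch below narrows a large element count to a signed
32-bit int. The tensor shape, the block size of 256, and the helper
names as_int32 and num_blocks are assumptions chosen for the example;
they do not come from GGML.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(n: int) -> int:
    """Return n as a C signed 32-bit int would store it.

    ctypes performs no overflow checking, so out-of-range values wrap
    around in two's complement, like a narrowing conversion on typical
    platforms.
    """
    return ctypes.c_int32(n).value


def num_blocks(n_elements: int, block_size: int = 256) -> int:
    """Ceil-divide the element count by a (hypothetical) block size."""
    return (n_elements + block_size - 1) // block_size


# Hypothetical shape, picked only so the product exceeds INT32_MAX.
shape = (3000, 1000, 1000)
n_elements = shape[0] * shape[1] * shape[2]

narrow = as_int32(n_elements)

print("elements:", n_elements)
print("exceeds INT32_MAX:", n_elements > INT32_MAX)
print("as signed 32-bit:", narrow)  # negative after wraparound
print("blocks with a 64-bit count:", num_blocks(n_elements))
```

Carrying the count in a 64-bit integer keeps it positive, so any launch
arguments derived from it stay valid.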

Currently, this can happen when the user passes a large image
to mistral-small3.2. However, with upcoming changes to reserve
CUDA memory, it happens every time mistral-small is loaded, as
we reserve using a worst-case batch.

This patch is part of an upstream GGML commit and should be removed
after GGML is updated past 0a1b398 "ggml: add ops for WAN video model
(cuda && cpu) (#15669)".

Fixes #10388
