1. 30 Sep, 2025 1 commit
    • ggml: Backport scale kernel fixes · efaee8c2
      Jesse Gross authored
      The GGML scale kernel uses signed 32-bit ints to represent
      the number of elements in the tensor. For large images,
      mistral-small3.2 overflows this, triggering CUDA errors due
      to negative arguments.
      
      Currently, this can happen when the user passes a large image
      to mistral-small3.2. However, with upcoming changes to reserve
      CUDA memory, it will happen every time mistral-small3.2 is loaded,
      because we reserve using a worst-case batch.
      
      This patch is part of an upstream GGML commit and should be removed
      after GGML is updated past 0a1b398 "ggml: add ops for WAN video model
      (cuda && cpu) (#15669)".
      
      Fixes #10388
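      The overflow described in the commit message can be illustrated with a small host-side sketch. This is not the GGML scale kernel or the upstream patch; the helper names and the tensor dimensions used in the usage note are hypothetical and were chosen only so that the element count exceeds the signed 32-bit range. The sketch assumes the general remedy is to carry element counts in 64-bit integers.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: element count of a 4-D tensor, accumulated in 64 bits.
int64_t element_count(int64_t ne0, int64_t ne1, int64_t ne2, int64_t ne3) {
    return ne0 * ne1 * ne2 * ne3;
}

// True when the count can be stored in a signed 32-bit int without wrapping.
bool fits_in_int32(int64_t n) {
    return n >= 0 && n <= std::numeric_limits<int32_t>::max();
}

// Value a two's-complement 32-bit int would hold after truncating n (n >= 0).
// Computed entirely in 64-bit arithmetic, so the sketch does not depend on
// implementation-defined narrowing conversions.
int64_t wrapped_to_int32(int64_t n) {
    const int64_t two32 = int64_t{1} << 32;
    const int64_t two31 = int64_t{1} << 31;
    const int64_t low = n % two32;
    return low >= two31 ? low - two32 : low;
}

// Number of thread blocks needed to cover n elements, kept in 64 bits.
int64_t num_blocks(int64_t n, int64_t block_size) {
    assert(block_size > 0);
    return (n + block_size - 1) / block_size;
}
```

      With hypothetical dimensions 65536 x 49152 x 1 x 1, `element_count` returns 3221225472, which `fits_in_int32` rejects. Truncated to 32 bits, that count would read as -1073741824, the kind of negative argument the commit message describes. Computed in 64 bits with a block size of 256, `num_blocks` gives 12582912.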