1. 20 Oct, 2025 1 commit
  2. 26 Aug, 2025 1 commit
    • convert: fix tensor sorting (#12015) · 86834a27
      Michael Yang authored
      there are two bugs here.
      
      1. the check for a layer id is incorrect and should be >= 0, since layer
         0 is valid
      2. if both tensors have a layer identifier, the sort compares only the
         layer id, which returns 0 when the tensors are in the same layer.
         instead it should fall back to comparing the full tensor name
  3. 16 Jun, 2025 1 commit
  4. 19 May, 2025 1 commit
    • ggml: Separate tensor load from backend creation · 94ab428e
      Jesse Gross authored
      Currently, when the backend is created, the tensors are loaded at the
      same time, which is a slow operation. This separates them into two
      steps:
       - Create the backend, including enumerating tensors and allocating memory
       - Load the tensor data
      
      This allows more flexibility in managing model loading.
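The two-step split described in this commit can be sketched roughly as follows. Every name here (`tensorInfo`, `backend`, `newBackend`, `Load`) is hypothetical and chosen for illustration; the sketch does not reproduce the project's actual backend API.

```go
package main

import (
	"errors"
	"fmt"
)

// tensorInfo is hypothetical metadata known before any data is read.
type tensorInfo struct {
	name string
	size int
}

// backend is a hypothetical stand-in for a compute backend.
type backend struct {
	tensors map[string][]byte // allocated at creation, filled by Load
	loaded  bool
}

// newBackend is step one: enumerate the tensors and allocate memory for
// them without reading any tensor data.
func newBackend(infos []tensorInfo) *backend {
	b := &backend{tensors: make(map[string][]byte, len(infos))}
	for _, ti := range infos {
		b.tensors[ti.name] = make([]byte, ti.size)
	}
	return b
}

// Load is step two: fill the previously allocated buffers. The read
// callback stands in for whatever source supplies the tensor bytes.
func (b *backend) Load(read func(name string, dst []byte) error) error {
	if b.loaded {
		return errors.New("tensor data already loaded")
	}
	for name, buf := range b.tensors {
		if err := read(name, buf); err != nil {
			return fmt.Errorf("load %s: %w", name, err)
		}
	}
	b.loaded = true
	return nil
}

func main() {
	b := newBackend([]tensorInfo{{name: "blk.0.attn_q.weight", size: 4}})
	fmt.Println("backend created, loaded:", b.loaded)

	// The slow step can now be deferred or scheduled by the caller.
	err := b.Load(func(name string, dst []byte) error {
		for i := range dst {
			dst[i] = 1 // placeholder data for the sketch
		}
		return nil
	})
	fmt.Println("load error:", err, "loaded:", b.loaded)
}
```

With the steps separated, a caller can inspect the allocated layout after creation and decide when, or whether, to pay the cost of loading.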
  5. 06 May, 2025 1 commit
    • Move quantization to new backend (#10363) · 42481045
      Daniel Hiltgen authored
      * Move quantization logic to GGML via new backend
      
      This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.
      
      * Remove "add model quantizations"
      
      This is no longer needed now that quantization is implemented in Go+GGML code directly.
  6. 01 May, 2025 1 commit