1. 16 Dec, 2025 1 commit
    • types: ConfigV2 and RootFS (#13504) · 45c47393
      Bruce MacDonald authored
      Refactored the ConfigV2 and RootFS types out of server/images.go into a new types/model/config.go file under the model package, and updated all references to use model.ConfigV2 and model.RootFS. This lets other projects use these types without compiling the C code in the llama package.
  2. 27 Oct, 2025 1 commit
    • create: inherit FROM model's renderer/parser · 1bdd8169
      Devon Rifkin authored
      On main, the `RENDERER` and `PARSER` fields from the `Modelfile` don't
      get propagated to a new model created with a `req.From` parameter. This
      is easily triggered via `ollama run qwen3-coder`, then running some save
      command like `/save qwen3-coder-custom`.
      
      Added a regression test for this, and changed create to open the config
      for the "from" model and use its renderer/parser as defaults for the
      new model. This fixes both CLI- and API-based creates.
      
      Fixes: https://github.com/ollama/ollama/issues/12792
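The fix described above amounts to treating the FROM model's config as a source of defaults for fields the new Modelfile leaves unset. A hypothetical sketch of that defaulting step (the struct and function names are illustrative, not ollama's actual API):

```go
package main

import "fmt"

// modelConfig is a stand-in for the relevant slice of a model's config.
type modelConfig struct {
	Renderer string
	Parser   string
}

// inheritDefaults fills renderer/parser from the FROM model's config
// whenever the new Modelfile did not override them.
func inheritDefaults(newCfg, fromCfg modelConfig) modelConfig {
	if newCfg.Renderer == "" {
		newCfg.Renderer = fromCfg.Renderer
	}
	if newCfg.Parser == "" {
		newCfg.Parser = fromCfg.Parser
	}
	return newCfg
}

func main() {
	// e.g. /save qwen3-coder-custom after running qwen3-coder:
	// the Modelfile has no RENDERER/PARSER, so both come from the base.
	base := modelConfig{Renderer: "qwen3coder", Parser: "qwen3coder"}
	derived := inheritDefaults(modelConfig{}, base)
	fmt.Println(derived.Renderer, derived.Parser) // prints: qwen3coder qwen3coder
}
```

Defaulting only empty fields preserves explicit `RENDERER`/`PARSER` lines in a Modelfile while fixing the `/save` case from the bug report.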
  3. 20 Oct, 2025 1 commit
  4. 17 Sep, 2025 1 commit
  5. 06 May, 2025 1 commit
    • Move quantization to new backend (#10363) · 42481045
      Daniel Hiltgen authored
      * Move quantization logic to GGML via new backend
      
      This moves the model-aware logic into Go code and calls GGML's quantization code during model creation.
      
      * Remove "add model quantizations"
      
      This is no longer needed now that quantization is implemented in Go+GGML code directly.
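"Model-aware logic in Go" can be pictured as a per-tensor decision that runs before the buffers are handed to GGML's C quantizer. The sketch below is an assumption about the shape of that logic, not ollama's actual code; `targetType` and the tensor-name rules are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// targetType picks a quantization type for one tensor. Some tensors
// (e.g. output and token-embedding weights) are commonly kept at a
// higher-precision quantization than the one the user requested;
// the specific rules here are illustrative only.
func targetType(tensorName, requested string) string {
	if strings.HasPrefix(tensorName, "output.") ||
		strings.Contains(tensorName, "token_embd") {
		return "Q6_K"
	}
	return requested
}

func main() {
	for _, name := range []string{"blk.0.attn_q.weight", "output.weight"} {
		// In the real flow, the chosen type would be passed to GGML's
		// C quantization routine for this tensor's data.
		fmt.Printf("%s -> %s\n", name, targetType(name, "Q4_K_M"))
	}
}
```

Keeping the policy in Go while delegating the arithmetic to GGML is what makes per-architecture quantization tables unnecessary, which is what the second bullet removes.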
  6. 14 Feb, 2025 1 commit
    • next ollama runner (#7913) · 58245413
      Michael Yang authored
      feat: add new Ollama engine using ggml through cgo
      
      This change introduces a new way to run pretrained models. It introduces 3 high-level interfaces and a number of smaller helper interfaces to facilitate this.
      
      - `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, implement the model's forward propagation in the `Forward` method, which is called to generate completions. This interface can be found in `model/model.go`.
      - `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc.) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`.
      - `ml.Tensor` defines the interface for a tensor and tensor operations.
      
      This is the first implementation of the new engine. Follow up PRs will implement more features:
      
      - non-greedy sampling (#8410)
      - integration with Ollama and KV caching (#8301)
      - more model support (#9080) with more coming soon
      Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
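The three interfaces described in that commit can be sketched as below. The signatures are deliberately simplified and hypothetical; the real definitions live in `model/model.go` and `ml/backend.go` in the ollama source:

```go
package main

import "fmt"

// Tensor is a loaded or computed tensor (sketch of ml.Tensor).
type Tensor interface {
	Shape() []int
}

// Backend loads a pretrained model onto hardware and exposes its
// tensors to models (sketch of ml.Backend).
type Backend interface {
	Get(name string) Tensor
}

// Model is a model architecture; Forward runs one forward pass and is
// called to generate completions (sketch of model.Model).
type Model interface {
	Forward(b Backend, inputs []int32) (Tensor, error)
}

// --- toy implementations so the sketch compiles and runs ---

type fakeTensor struct{ dims []int }

func (t fakeTensor) Shape() []int { return t.dims }

type fakeBackend map[string]Tensor

func (b fakeBackend) Get(name string) Tensor { return b[name] }

type echoModel struct{}

func (echoModel) Forward(b Backend, inputs []int32) (Tensor, error) {
	// A real model would look up weights via b.Get and run its
	// attention/FFN blocks; here we just report an output shape.
	return fakeTensor{dims: []int{len(inputs), 4}}, nil
}

func main() {
	var m Model = echoModel{}
	out, err := m.Forward(fakeBackend{}, []int32{1, 2, 3})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Shape()) // prints: [3 4]
}
```

The split mirrors the commit's layering: architectures (`model.Model`) are written against an abstract tensor library (`ml.Backend`/`ml.Tensor`), so `ggml` is one backend rather than a hard dependency of every model.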
  7. 15 Jan, 2025 1 commit
  8. 01 Jan, 2025 1 commit
  9. 27 Aug, 2024 2 commits
  10. 02 Aug, 2024 2 commits
  11. 31 Jul, 2024 2 commits
  12. 22 Jul, 2024 1 commit
  13. 19 Jul, 2024 1 commit
  14. 16 Jul, 2024 1 commit
  15. 12 Jul, 2024 1 commit
  16. 11 Jul, 2024 2 commits
  17. 05 Jul, 2024 1 commit
  18. 13 Jun, 2024 1 commit
  19. 12 Jun, 2024 1 commit
  20. 07 Jun, 2024 1 commit
  21. 04 Jun, 2024 1 commit
  22. 14 May, 2024 1 commit