"vscode:/vscode.git/clone" did not exist on "f56dcc58c12e140c9841a65987feb2a041cd773e"
  1. 18 Jun, 2025 5 commits
  2. 17 Jun, 2025 1 commit
  3. 16 Jun, 2025 3 commits
  4. 14 Jun, 2025 1 commit
  5. 12 Jun, 2025 2 commits
  6. 11 Jun, 2025 3 commits
  7. 10 Jun, 2025 3 commits
  8. 09 Jun, 2025 1 commit
  9. 08 Jun, 2025 1 commit
  10. 07 Jun, 2025 2 commits
  11. 06 Jun, 2025 4 commits
  12. 05 Jun, 2025 2 commits
  13. 04 Jun, 2025 1 commit
  14. 31 May, 2025 1 commit
  15. 30 May, 2025 1 commit
  16. 29 May, 2025 3 commits
    • ggml: Export GPU UUIDs · aaa78180
      Jesse Gross authored
      This enables matching up devices and information reported by the backend
      with system management libraries such as NVML, to get accurate
      free-memory reporting.
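The matching idea described above can be sketched as follows. This is an illustrative Python sketch, not ollama's actual Go implementation; all structure and field names here (`uuid`, `free_memory`, the device dicts) are assumptions made for the example.

```python
# Hypothetical sketch: pair devices reported by a backend with entries
# from a system management library (e.g. NVML) by comparing GPU UUIDs.
def match_devices_by_uuid(backend_devices, nvml_devices):
    """Return backend device records augmented with free-memory info."""
    # UUID casing can differ between sources; normalize before comparing.
    by_uuid = {d["uuid"].lower(): d for d in nvml_devices}
    matched = []
    for dev in backend_devices:
        info = by_uuid.get(dev["uuid"].lower())
        if info is not None:
            matched.append({**dev, "free_memory": info["free_memory"]})
    return matched

backend = [{"name": "CUDA0", "uuid": "GPU-0A1B"}]
nvml = [{"uuid": "gpu-0a1b", "free_memory": 8_000_000_000}]
print(match_devices_by_uuid(backend, nvml))
```

Keying on the UUID rather than on device index or name is what makes the pairing robust: enumeration order can differ between the backend and the management library, but the UUID identifies the same physical device in both.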
    • llm: Make "POST predict" error message more informative · f15ffc43
      Jesse Gross authored
      "POST predict" essentially means that the runner has crashed, which
      can happen for many reasons. However, many people assume it is a
      specific error and either report only this message or group unrelated
      bugs together. This replaces it with a friendlier, more helpful
      message.
    • add thinking support to the api and cli (#10584) · 5f57b0ef
      Devon Rifkin authored
      - Both `/api/generate` and `/api/chat` now accept a `"think"`
        option that allows specifying whether thinking mode should be on or
        not
      - Templates get passed this new option so, e.g., qwen3's template can
        put `/think` or `/no_think` in the system prompt depending on the
        value of the setting
      - Models' thinking support is inferred by inspecting model templates.
        The prefix and suffix the parser uses to identify thinking content
        are also automatically inferred from templates
      - Thinking control & parsing are opt-in via the API to prevent breaking
        existing API consumers. If the `"think"` option is not specified, the
        behavior is unchanged from previous versions of ollama
      - Add parsing for thinking blocks in both streaming and non-streaming
        modes, for both `/generate` and `/chat`
      - Update the CLI to make use of these changes. Users can pass `--think`
        or `--think=false` to control thinking, or during an interactive
        session they can use the commands `/set think` or `/set nothink`
      - A `--hidethinking` option has also been added to the CLI. This makes
        it easy to use thinking in scripting scenarios like
        `ollama run qwen3 --think --hidethinking "my question here"` where you
        just want to see the answer but still want the benefits of thinking
        models
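A request body using the `"think"` option described above might look like the following. This is a hedged sketch of the wire format only: the endpoint and the `model`/`messages`/`think` fields come from the commit description, while the model name and prompt are placeholders.

```python
import json

# Sketch of a request body for ollama's /api/chat with the new "think"
# option. Omitting "think" entirely preserves the pre-existing behavior,
# which is what keeps the change backward compatible.
payload = {
    "model": "qwen3",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "think": True,  # opt in to thinking mode and thinking-block parsing
}
body = json.dumps(payload)
print(body)
```

The same option applies to `/api/generate`; in both cases the template receives the setting, so a template like qwen3's can emit `/think` or `/no_think` in the system prompt accordingly.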
  17. 27 May, 2025 5 commits
  18. 26 May, 2025 1 commit