Commit 3d99d977 authored by Parth Sareen, committed by GitHub

docs: add docs for docs.ollama.com (#12805)
---
title: VS Code
---
## Install
Install [VS Code](https://code.visualstudio.com/download).
## Usage with Ollama
1. Open the Copilot sidebar, found in the top right of the window
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/vscode-sidebar.png"
alt="VSCode chat Sidebar"
width="75%"
/>
</div>
2. Select the model dropdown > **Manage models**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/vscode-models.png"
alt="VSCode model picker"
width="75%"
/>
</div>
3. Select **Ollama** under the provider dropdown and select the desired models (e.g. `qwen3`, `qwen3-coder:480b-cloud`)
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/vscode-model-options.png"
alt="VSCode model options dropdown"
width="75%"
/>
</div>
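For a model to appear in the picker, it must already be available to the local Ollama instance. A quick way to pull and verify one from a terminal (assuming Ollama is installed and running):

```shell
# Download a model so it shows up in the model picker
ollama pull qwen3

# List the models available locally
ollama ls
```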
---
title: Xcode
---
## Install
Install [Xcode](https://developer.apple.com/xcode/).
## Usage with Ollama
<Note> Ensure Apple Intelligence is set up and that Xcode is version 26.0 or later </Note>
1. Click **Xcode** in the top left corner > **Settings**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/xcode-intelligence-window.png"
alt="Xcode Intelligence window"
width="50%"
/>
</div>
2. Select **Locally Hosted**, enter port **11434** and click **Add**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/xcode-locally-hosted.png"
alt="Xcode settings"
width="50%"
/>
</div>
3. Select the **star icon** in the top left corner and click the **dropdown**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/xcode-chat-icon.png"
alt="Xcode settings"
width="50%"
/>
</div>
4. Click **My Account** and select your desired model
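If Xcode cannot connect, it can help to first confirm the server is reachable on the default port (a quick check, assuming a local Ollama install):

```shell
# Should return the running server's version, e.g. {"version":"0.12.3"}
curl http://localhost:11434/api/version
```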
## Connecting to ollama.com directly
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com
2. Select **Internet Hosted** and enter URL as `https://ollama.com`
3. Enter your **Ollama API Key** and click **Add**
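To sanity-check the key outside of Xcode, a request can be sent to ollama.com directly (a sketch; the model name is illustrative, and `OLLAMA_API_KEY` is assumed to hold the key created above):

```shell
curl https://ollama.com/api/chat \
  -H "Authorization: Bearer $OLLAMA_API_KEY" \
  -d '{
    "model": "qwen3-coder:480b",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": false
  }'
```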
---
title: Zed
---
## Install
Install [Zed](https://zed.dev/download).
## Usage with Ollama
1. In Zed, click the **star icon** in the bottom-right corner, then select **Configure**.
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/zed-settings.png"
alt="Zed star icon in bottom right corner"
width="50%"
/>
</div>
2. Under **LLM Providers**, choose **Ollama**
3. Confirm the **Host URL** is `http://localhost:11434`, then click **Connect**
4. Once connected, select a model under **Ollama**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/zed-ollama-dropdown.png"
alt="Zed star icon in bottom right corner"
width="50%"
/>
</div>
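If **Connect** fails, the server may not be running. On a standard install it can be started manually (the desktop app usually keeps it running in the background):

```shell
ollama serve
```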
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on **ollama.com**
2. In Zed, open the **star icon** → **Configure**
3. Under **LLM Providers**, select **Ollama**
4. Set the **API URL** to `https://ollama.com`
---
title: Linux
---

## Install
@@ -10,15 +12,16 @@ curl -fsSL https://ollama.com/install.sh | sh
## Manual install

<Note>
If you are upgrading from a prior version, you should remove the old libraries
with `sudo rm -rf /usr/lib/ollama` first.
</Note>

Download and extract the package:

```shell
curl -fsSL https://ollama.com/download/ollama-linux-amd64.tgz \
  | sudo tar zx -C /usr
```

Start Ollama:
@@ -35,15 +38,11 @@ ollama -v
### AMD GPU install

If you have an AMD GPU, also download and extract the additional ROCm package:

```shell
curl -fsSL https://ollama.com/download/ollama-linux-amd64-rocm.tgz \
  | sudo tar zx -C /usr
```
### ARM64 install

@@ -51,8 +50,8 @@

Download and extract the ARM64-specific package:

```shell
curl -fsSL https://ollama.com/download/ollama-linux-arm64.tgz \
  | sudo tar zx -C /usr
```
### Adding Ollama as a startup service (recommended)

@@ -113,12 +112,13 @@ sudo systemctl start ollama

```shell
sudo systemctl status ollama
```
<Note>
While AMD has contributed the `amdgpu` driver upstream to the official Linux
kernel source, the version is older and may not support all ROCm features. We
recommend you install the latest driver from
https://www.amd.com/en/support/linux-drivers for best support of your Radeon
GPU.
</Note>

## Customizing
@@ -146,8 +146,8 @@ curl -fsSL https://ollama.com/install.sh | sh
Or by re-downloading Ollama:

```shell
curl -fsSL https://ollama.com/download/ollama-linux-amd64.tgz \
  | sudo tar zx -C /usr
```

## Installing specific versions
@@ -178,6 +178,12 @@ sudo systemctl disable ollama
```shell
sudo rm /etc/systemd/system/ollama.service
```
Remove ollama libraries from your lib directory (either `/usr/local/lib`, `/usr/lib`, or `/lib`):
```shell
# `which ollama` returns a path such as /usr/local/bin/ollama; the `tr`
# translation rewrites "bin" to "lib", targeting the matching lib directory
sudo rm -r $(which ollama | tr 'bin' 'lib')
```
Remove the ollama binary from your bin directory (either `/usr/local/bin`, `/usr/bin`, or `/bin`):

```shell
sudo rm $(which ollama)
```

Remove the downloaded models and Ollama service user and group:

```shell
sudo userdel ollama
sudo groupdel ollama
sudo rm -r /usr/share/ollama
```
<svg width="28" height="28" viewBox="0 0 28 28" fill="none" xmlns="http://www.w3.org/2000/svg">
<path fill-rule="evenodd" clip-rule="evenodd" d="M7.25558 0.114339C7.61134 0.222519 7.93252 0.400698 8.22405 0.636149C8.70993 1.0256 9.12005 1.58303 9.433 2.24356C9.74758 2.90792 9.95182 3.64354 10.0292 4.38171C11.0662 3.9284 12.2171 3.65235 13.4041 3.57227L13.4881 3.56718C14.921 3.47809 16.3375 3.6779 17.5728 4.17044C17.7391 4.2379 17.9022 4.31044 18.062 4.3868C18.1443 3.66263 18.3453 2.94355 18.6549 2.29447C18.9678 1.63266 19.378 1.07651 19.8622 0.685785C20.1328 0.459579 20.4638 0.281532 20.8323 0.163974C21.2556 0.0367035 21.7053 0.0137947 22.1434 0.110521C22.8039 0.255609 23.3704 0.578877 23.8168 1.04851C24.2253 1.47739 24.5316 2.0272 24.7408 2.68646C25.1196 3.87517 25.1855 5.43933 24.9302 7.32549L25.0175 7.37639L25.0603 7.40058C26.3072 8.13366 27.1752 9.17855 27.6348 10.3914C28.3512 12.284 27.9905 14.4068 26.7552 15.5943L26.7255 15.621L26.7288 15.6248C27.4157 16.5946 27.8324 17.6192 27.9214 18.6793L27.9246 18.7175C28.0301 20.0729 27.5952 21.4373 26.5839 22.7774L26.5723 22.7902L26.5888 22.8207C27.3663 24.2932 27.6101 25.7759 27.3103 27.2574L27.3004 27.307C27.254 27.5234 27.0983 27.7168 26.8677 27.8446C26.637 27.9724 26.3501 28.0246 26.07 27.9892C25.9312 27.9724 25.7982 27.9347 25.6783 27.8782C25.5585 27.8217 25.4543 27.7474 25.3717 27.6595C25.289 27.572 25.2296 27.4725 25.1968 27.3668C25.164 27.2614 25.1585 27.152 25.1806 27.0448C25.4556 25.7301 25.197 24.4116 24.39 23.0702C24.3147 22.9456 24.2812 22.8083 24.2927 22.671C24.3043 22.5338 24.3604 22.401 24.4559 22.2849L24.4624 22.2773C25.4573 21.1013 25.869 19.9482 25.7801 18.8155C25.7043 17.8241 25.2448 16.8504 24.4624 15.9226C24.3103 15.7423 24.2561 15.5229 24.3115 15.3119C24.367 15.1009 24.5277 14.9152 24.7589 14.795L24.7737 14.7874C25.174 14.585 25.5429 14.0683 25.729 13.3619C25.9344 12.5267 25.8808 11.6658 25.5726 10.8496C25.2349 9.95872 24.6173 9.21546 23.7526 8.70765C22.7726 8.12984 21.4747 7.85111 19.8326 7.9313C19.6178 7.94209 19.4039 7.90286 19.2183 7.81869C19.0327 7.73451 18.8841 7.60927 18.7916 7.45912C18.2744 6.61277 17.5201 6.00696 16.5796 5.63151C15.6767 5.2833 14.6658 5.13696 13.661 5.20897C11.6104 5.33497 9.80194 6.22841 9.26335 7.35476C9.18715 7.51329 9.05009 7.65005 8.87052 7.74673C8.69096 7.84338 8.47747 7.89535 8.25864 7.89566C6.50122 7.8982 5.14075 8.21638 4.14592 8.79037C3.28615 9.28673 2.6998 9.98036 2.39015 10.8114C2.10995 11.5937 2.07158 12.4159 2.27815 13.2118C2.46262 13.9219 2.82333 14.5099 3.23674 14.8268L3.24992 14.8357C3.5991 15.0992 3.67321 15.5103 3.42945 15.8348C2.83651 16.6264 2.39345 17.8062 2.32098 18.9402C2.23862 20.2358 2.62733 21.3609 3.50521 22.1678L3.53157 22.192C3.66406 22.3113 3.74924 22.4576 3.77701 22.6133C3.80475 22.769 3.77385 22.9276 3.68804 23.0702C2.73933 24.6432 2.4478 25.9363 2.76239 26.9545C2.81892 27.1662 2.76631 27.3867 2.61573 27.5687C2.46516 27.7509 2.22851 27.8805 1.95615 27.9299C1.68379 27.9795 1.39724 27.9446 1.15746 27.8334C0.917644 27.7219 0.743586 27.5427 0.672268 27.3337C0.272031 26.0381 0.543797 24.5541 1.45133 22.8818L1.47438 22.8373L1.46121 22.822C1.01515 22.3129 0.682282 21.7498 0.476267 21.156L0.468032 21.1318C0.218008 20.391 0.119645 19.6244 0.176502 18.86C0.248972 17.7019 0.634385 16.5157 1.20097 15.5637L1.22074 15.5306L1.21744 15.5281C0.734856 14.9961 0.377443 14.3152 0.179796 13.5618L0.17156 13.5312C-0.100765 12.4803 -0.0482896 11.3945 0.324737 10.3622C0.756268 9.19764 1.6045 8.19729 2.85462 7.47439C2.95345 7.41712 3.05721 7.35985 3.16098 7.3064C2.89909 5.40624 2.96498 3.8319 3.34545 2.63556C3.55463 1.97629 3.86263 1.42648 4.2711 0.997598C4.71581 0.529242 5.2824 
0.205974 5.94287 0.0596123C6.38099 -0.0371136 6.83228 -0.0142049 7.25558 0.114339ZM14.0349 11.6832C15.5765 11.6832 16.9996 12.0816 18.0636 12.7714C19.1013 13.4421 19.7189 14.3432 19.7189 15.2405C19.7189 16.3706 19.0502 17.2513 17.8528 17.8139C16.8316 18.2911 15.4629 18.5228 13.8949 18.5228C12.233 18.5228 10.8132 18.1931 9.78876 17.5886C8.77252 16.9904 8.20264 16.1504 8.20264 15.2405C8.20264 14.3407 8.85817 13.437 9.94194 12.7638C11.0422 12.0803 12.4949 11.6832 14.0349 11.6832ZM14.0349 12.8236C12.8922 12.8159 11.7798 13.1075 10.8791 13.6508C10.1198 14.1217 9.68994 14.7136 9.68994 15.2417C9.68994 15.7865 10.0358 16.2968 10.6946 16.685C11.4441 17.1266 12.5459 17.3824 13.8949 17.3824C15.2109 17.3824 16.321 17.1953 17.077 16.8403C17.8396 16.4839 18.23 15.9672 18.23 15.2405C18.23 14.7021 17.8248 14.1077 17.105 13.6419C16.3078 13.1265 15.2274 12.8236 14.0349 12.8236ZM15.1252 14.3636L15.1318 14.3687C15.3295 14.5608 15.2883 14.8396 15.0396 14.9923L14.5587 15.285V15.8526C14.5578 15.979 14.4921 16.0999 14.376 16.1889C14.2599 16.2779 14.1029 16.3277 13.9394 16.3274C13.7758 16.3277 13.6188 16.2779 13.5027 16.1889C13.3866 16.0999 13.3209 15.979 13.3201 15.8526V15.2672L12.8737 14.9897C12.8148 14.9533 12.7659 14.9082 12.7297 14.857C12.6935 14.8059 12.6707 14.7497 12.6628 14.6917C12.6548 14.6337 12.6618 14.5751 12.6833 14.5192C12.7048 14.4633 12.7404 14.4113 12.7881 14.3661C12.8853 14.2747 13.0253 14.2166 13.1776 14.2044C13.3299 14.1923 13.4824 14.2271 13.6017 14.3012L13.9558 14.5201L14.3182 14.2987C14.4371 14.2261 14.588 14.1922 14.7388 14.2043C14.8896 14.2165 15.0282 14.2736 15.1252 14.3636ZM6.82405 11.9212C7.61134 11.9212 8.25205 12.4176 8.25205 13.0298C8.25248 13.3232 8.10217 13.6048 7.83409 13.8127C7.56602 14.0205 7.20215 14.1376 6.8224 14.1383C6.44321 14.1373 6.08 14.0202 5.81235 13.8127C5.54467 13.6051 5.3944 13.324 5.3944 13.031C5.39351 12.7376 5.54342 12.4559 5.81117 12.2478C6.07895 12.0397 6.4443 11.9223 6.82405 11.9212ZM21.1634 11.9212C21.954 11.9212 22.593 12.4176 22.593 13.0298C22.5935 13.3232 22.4432 13.6048 22.1751 13.8127C21.907 14.0205 21.5431 14.1376 21.1634 14.1383C20.7842 14.1373 20.421 14.0202 20.1533 13.8127C19.8857 13.6051 19.7354 13.324 19.7354 13.031C19.7345 12.7376 19.8844 12.4559 20.1522 12.2478C20.4199 12.0397 20.7836 11.9223 21.1634 11.9212ZM6.48969 1.6543L6.48475 1.65684C6.29392 1.72096 6.131 1.82611 6.01534 1.95975L6.0071 1.96738C5.77981 2.20793 5.58216 2.56174 5.43393 3.02628C5.15392 3.90699 5.07816 5.10206 5.22969 6.56695C5.93793 6.40405 6.7104 6.30223 7.54217 6.26532L7.55864 6.26405L7.58993 6.22077C7.6657 6.11641 7.7464 6.01587 7.8337 5.9166C8.03629 4.93534 7.86993 3.76318 7.41699 2.8061C7.19628 2.34283 6.92781 1.97884 6.67087 1.77139C6.61783 1.72827 6.55871 1.68986 6.49463 1.65684L6.48969 1.6543ZM21.5999 1.70521L21.5966 1.70648C21.5325 1.73949 21.4734 1.7779 21.4203 1.82102C21.1634 2.02847 20.8933 2.39374 20.6742 2.85701C20.1966 3.86754 20.0368 5.11734 20.2954 6.13041L20.3909 6.25387L20.4041 6.27168H20.4535C21.2709 6.27186 22.0841 6.36273 22.8681 6.5415C23.0097 5.11097 22.9307 3.94136 22.6573 3.07719C22.509 2.61265 22.3114 2.25883 22.0824 2.01829L22.0759 2.01066C21.9604 1.87654 21.7975 1.77095 21.6064 1.70648L21.5999 1.70521Z" fill="black"/>
</svg>
---
title: Modelfile Reference
---

A Modelfile is the blueprint to create and share customized models using Ollama.
## Table of Contents

@@ -73,26 +72,23 @@

To view the Modelfile of a given model, use the `ollama show --modelfile` command:

```shell
ollama show --modelfile llama3.2
```

```
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this one, replace the FROM line with:
# FROM llama3.2:latest

FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|reserved_special_token"
```
## Instructions

@@ -110,10 +106,13 @@ FROM <model name>:<tag>

```
FROM llama3.2
```
<Card title="Base Models" href="https://github.com/ollama/ollama#model-library">
A list of available base models
</Card>

<Card title="Additional Models" href="https://ollama.com/library">
Additional models can be found on ollama.com
</Card>
#### Build from a Safetensors model

@@ -124,10 +123,11 @@ FROM <model directory>

The model directory should contain the Safetensors weights for a supported architecture.

Currently supported model architectures:

- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2)
- Mistral (including Mistral 1, Mistral 2, and Mixtral)
- Gemma (including Gemma 1 and Gemma 2)
- Phi3

#### Build from a GGUF file

@@ -137,7 +137,6 @@ FROM ./ollama-model.gguf

The GGUF file location should be specified as an absolute path or relative to the `Modelfile` location.

### PARAMETER

The `PARAMETER` instruction defines a parameter that can be set when the model is run.

@@ -148,18 +147,21 @@ PARAMETER <parameter> <parametervalue>

#### Valid Parameters and Values

| Parameter | Description | Value Type | Example Usage |
| -------------- | -------------------------------------------------------------------------------------------------- | ---------- | -------------------- |
| mirostat | Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0) | int | mirostat 0 |
| mirostat_eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | float | mirostat_eta 0.1 |
| mirostat_tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | float | mirostat_tau 5.0 |
| num_ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | int | num_ctx 4096 |
| repeat_last_n | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | int | repeat_last_n 64 |
| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | float | repeat_penalty 1.1 |
| temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | float | temperature 0.7 |
| seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | int | seed 42 |
| stop | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile. | string | stop "AI assistant:" |
| num_predict | Maximum number of tokens to predict when generating text. (Default: -1, infinite generation) | int | num_predict 42 |
| top_k | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40) | int | top_k 40 |
| top_p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | float | top_p 0.9 |
| min_p | Alternative to top_p, and aims to ensure a balance of quality and variety. The parameter *p* represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with *p*=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. (Default: 0.0) | float | min_p 0.05 |
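For illustration, several of these parameters can be combined in one Modelfile (a minimal sketch; `llama3.2` is assumed to already be pulled, and the stop sequence is arbitrary):

```
FROM llama3.2
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
PARAMETER stop "User:"
```

Running `ollama create my-model -f Modelfile` then builds a model that applies these settings by default.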
### TEMPLATE

@@ -201,9 +203,10 @@

```
ADAPTER <path to safetensor adapter>
```
Currently supported Safetensor adapters:

- Llama (including Llama 2, Llama 3, and Llama 3.1)
- Mistral (including Mistral 1, Mistral 2, and Mixtral)
- Gemma (including Gemma 1 and Gemma 2)
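As a sketch, applying a Safetensor adapter to its base model looks like this (the adapter path is hypothetical):

```
FROM llama3.2
ADAPTER ./adapters/my-lora
```

The base model should be the same model the adapter was tuned from; otherwise the behavior can be erratic.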
#### GGUF adapter

@@ -237,7 +240,6 @@ MESSAGE <role> <message>

| user | An example message of what the user could have asked. |
| assistant | An example message of how the model should respond. |
#### Example conversation

```
MESSAGE user Is Ontario in Canada?
MESSAGE assistant yes
```
## Notes

- the **`Modelfile` is not case sensitive**. In the examples, uppercase instructions are used to make it easier to distinguish them from arguments.
...
<svg width="17" height="25" viewBox="0 0 17 25" fill="none" xmlns="http://www.w3.org/2000/svg">
<path fill-rule="evenodd" clip-rule="evenodd" d="M4.40517 0.102088C4.62117 0.198678 4.81617 0.357766 4.99317 0.56799C5.28817 0.915712 5.53718 1.41342 5.72718 2.00318C5.91818 2.59635 6.04218 3.25316 6.08918 3.91224C6.71878 3.5075 7.41754 3.26103 8.13818 3.18953L8.18918 3.18498C9.05919 3.10544 9.91919 3.28384 10.6692 3.72361C10.7702 3.78384 10.8692 3.84861 10.9662 3.91679C11.0162 3.27021 11.1382 2.62817 11.3262 2.04863C11.5162 1.45773 11.7652 0.961166 12.0592 0.612308C12.2235 0.410338 12.4245 0.251368 12.6482 0.146406C12.9052 0.032771 13.1782 0.0123167 13.4442 0.098679C13.8452 0.228223 14.1892 0.516855 14.4602 0.936167C14.7082 1.3191 14.8942 1.81 15.0212 2.39863C15.2512 3.45998 15.2912 4.85655 15.1362 6.54061L15.1892 6.58607L15.2152 6.60766C15.9722 7.26219 16.4992 8.19513 16.7782 9.27807C17.2133 10.9678 16.9943 12.8632 16.2442 13.9235L16.2262 13.9473L16.2282 13.9507C16.6453 14.8166 16.8983 15.7314 16.9523 16.678L16.9543 16.7121C17.0183 17.9223 16.7542 19.1404 16.1402 20.337L16.1332 20.3484L16.1432 20.3756C16.6152 21.6904 16.7632 23.0142 16.5812 24.3369L16.5752 24.3813C16.547 24.5744 16.4525 24.7472 16.3125 24.8612C16.1725 24.9753 15.9983 25.0219 15.8282 24.9903C15.744 24.9753 15.6632 24.9417 15.5904 24.8912C15.5177 24.8408 15.4544 24.7744 15.4042 24.696C15.3541 24.6178 15.318 24.529 15.2981 24.4347C15.2782 24.3406 15.2748 24.2428 15.2882 24.1472C15.4552 22.9733 15.2982 21.7961 14.8082 20.5984C14.7625 20.4871 14.7422 20.3645 14.7492 20.242C14.7562 20.1194 14.7902 20.0009 14.8482 19.8972L14.8522 19.8904C15.4562 18.8404 15.7062 17.8109 15.6522 16.7996C15.6062 15.9143 15.3272 15.045 14.8522 14.2166C14.7598 14.0556 14.7269 13.8597 14.7606 13.6713C14.7943 13.4829 14.8918 13.3171 15.0322 13.2098L15.0412 13.203C15.2842 13.0223 15.5082 12.561 15.6212 11.9303C15.7459 11.1846 15.7133 10.4159 15.5262 9.68716C15.3212 8.89171 14.9462 8.22809 14.4212 7.77468C13.8262 7.25878 13.0382 7.00992 12.0412 7.08151C11.9108 7.09115 11.7809 7.05613 11.6682 6.98097C11.5556 6.90581 11.4653 6.79399 11.4092 6.65993C11.0952 5.90426 10.6372 5.36336 10.0662 5.02814C9.51799 4.71723 8.90425 4.58657 8.29418 4.65087C7.04918 4.76337 5.95118 5.56108 5.62418 6.56675C5.57792 6.70829 5.4947 6.8304 5.38568 6.91672C5.27666 7.00301 5.14703 7.04942 5.01417 7.0497C3.94717 7.05197 3.12117 7.33606 2.51717 7.84855C1.99517 8.29172 1.63916 8.91103 1.45116 9.65307C1.28104 10.3515 1.25774 11.0857 1.38316 11.7962C1.49516 12.4303 1.71416 12.9553 1.96517 13.2382L1.97317 13.2462C2.18517 13.4814 2.23017 13.8485 2.08217 14.1382C1.72216 14.845 1.45316 15.8984 1.40916 16.9109C1.35916 18.0677 1.59516 19.0722 2.12817 19.7927L2.14417 19.8143C2.22461 19.9208 2.27633 20.0514 2.29319 20.1905C2.31003 20.3295 2.29127 20.4711 2.23917 20.5984C1.66316 22.0029 1.48616 23.1574 1.67716 24.0665C1.71148 24.2556 1.67954 24.4524 1.58812 24.6149C1.4967 24.7776 1.35302 24.8933 1.18766 24.9374C1.0223 24.9817 0.848322 24.9506 0.702741 24.8512C0.557141 24.7517 0.451463 24.5917 0.408163 24.4051C0.165162 23.2483 0.330162 21.9233 0.881162 20.4302L0.895162 20.3904L0.887162 20.3768C0.616341 19.9222 0.414243 19.4195 0.289162 18.8893L0.284162 18.8677C0.132362 18.2062 0.0726416 17.5218 0.107162 16.8393C0.151162 15.8052 0.385163 14.7462 0.729162 13.8962L0.741162 13.8666L0.739162 13.8644C0.446163 13.3894 0.229162 12.7814 0.109162 12.1087L0.104162 12.0814C-0.0611788 11.1431 -0.0293187 10.1737 0.197162 9.25194C0.459163 8.21218 0.974162 7.31901 1.73316 6.67356C1.79316 6.62243 1.85616 6.57129 1.91916 6.52357C1.76016 4.827 1.80016 3.42134 2.03117 2.35317C2.15817 1.76455 2.34517 1.27365 
2.59317 0.890713C2.86317 0.472537 3.20717 0.183905 3.60817 0.0532252C3.87417 -0.0331371 4.14817 -0.0126829 4.40517 0.102088ZM8.52118 10.4315C9.45719 10.4315 10.3212 10.7871 10.9672 11.403C11.5972 12.0019 11.9722 12.8064 11.9722 13.6076C11.9722 14.6166 11.5662 15.403 10.8392 15.9052C10.2192 16.3314 9.38819 16.5382 8.43618 16.5382C7.42718 16.5382 6.56518 16.2439 5.94318 15.7041C5.32618 15.17 4.98017 14.42 4.98017 13.6076C4.98017 12.8042 5.37818 11.9973 6.03618 11.3962C6.70418 10.786 7.58618 10.4315 8.52118 10.4315ZM8.52118 11.4496C7.82742 11.4428 7.15204 11.7031 6.60518 12.1883C6.14418 12.6087 5.88318 13.1371 5.88318 13.6087C5.88318 14.095 6.09318 14.5507 6.49318 14.8973C6.94818 15.2916 7.61718 15.52 8.43618 15.52C9.23519 15.52 9.90919 15.353 10.3682 15.0359C10.8312 14.7178 11.0682 14.2564 11.0682 13.6076C11.0682 13.1269 10.8222 12.5962 10.3852 12.1803C9.90119 11.7201 9.24519 11.4496 8.52118 11.4496ZM9.18319 12.8246L9.18719 12.8292C9.30719 13.0007 9.28219 13.2496 9.13119 13.386L8.83919 13.6473V14.1541C8.83865 14.267 8.79877 14.375 8.72829 14.4544C8.6578 14.5339 8.56246 14.5783 8.46318 14.578C8.3639 14.5783 8.26856 14.5339 8.19808 14.4544C8.12758 14.375 8.0877 14.267 8.08718 14.1541V13.6314L7.81618 13.3837C7.78042 13.3511 7.7507 13.3109 7.72872 13.2652C7.70674 13.2195 7.69294 13.1694 7.6881 13.1176C7.68326 13.0658 7.6875 13.0135 7.70056 12.9636C7.71362 12.9137 7.73524 12.8672 7.76418 12.8269C7.8232 12.7452 7.9082 12.6934 8.0007 12.6825C8.09318 12.6717 8.18572 12.7027 8.25818 12.7689L8.47318 12.9644L8.69318 12.7667C8.76538 12.7018 8.85702 12.6716 8.94854 12.6825C9.04009 12.6933 9.12427 12.7443 9.18319 12.8246ZM4.14317 10.644C4.62117 10.644 5.01017 11.0871 5.01017 11.6337C5.01043 11.8957 4.91917 12.1471 4.75641 12.3327C4.59365 12.5183 4.37273 12.6229 4.14217 12.6235C3.91195 12.6226 3.69143 12.518 3.52893 12.3327C3.36641 12.1474 3.27517 11.8965 3.27517 11.6349C3.27463 11.3729 3.36565 11.1213 3.52821 10.9355C3.69079 10.7497 3.91261 10.6449 4.14317 10.644ZM12.8492 10.644C13.3292 10.644 13.7172 11.0871 13.7172 11.6337C13.7175 11.8957 13.6262 12.1471 13.4634 12.3327C13.3007 12.5183 13.0798 12.6229 12.8492 12.6235C12.619 12.6226 12.3985 12.518 12.236 12.3327C12.0734 12.1474 11.9822 11.8965 11.9822 11.6349C11.9817 11.3729 12.0727 11.1213 12.2352 10.9355C12.3978 10.7497 12.6186 10.6449 12.8492 10.644ZM3.94017 1.47705L3.93717 1.47932C3.82131 1.53657 3.72239 1.63046 3.65217 1.74977L3.64717 1.75659C3.50917 1.97136 3.38917 2.28727 3.29917 2.70203C3.12917 3.48839 3.08317 4.55541 3.17517 5.86335C3.60517 5.7179 4.07417 5.62699 4.57917 5.59404L4.58917 5.5929L4.60817 5.55426C4.65417 5.46108 4.70317 5.37131 4.75617 5.28268C4.87917 4.40655 4.77817 3.35998 4.50317 2.50545C4.36917 2.09182 4.20617 1.76682 4.05017 1.5816C4.01797 1.5431 3.98207 1.5088 3.94317 1.47932L3.94017 1.47705ZM13.1142 1.52251L13.1122 1.52364C13.0733 1.55312 13.0374 1.58741 13.0052 1.62591C12.8492 1.81114 12.6852 2.13727 12.5522 2.5509C12.2622 3.45316 12.1652 4.56905 12.3222 5.47358L12.3802 5.58381L12.3882 5.59972H12.4182C12.9145 5.59988 13.4082 5.68101 13.8842 5.84062C13.9702 4.56337 13.9222 3.51907 13.7562 2.74749C13.6662 2.33272 13.5462 2.01682 13.4072 1.80205L13.4032 1.79523C13.3331 1.67548 13.2342 1.58121 13.1182 1.52364L13.1142 1.52251Z" fill="black"/>
</svg>
openapi: 3.1.0
info:
title: Ollama API
version: 0.1.0
description: |
OpenAPI specification for the Ollama HTTP API
servers:
- url: http://localhost:11434
description: Local Ollama instance
components:
securitySchemes:
bearerAuth:
type: http
scheme: bearer
bearerFormat: API Key
parameters:
DigestParam:
name: digest
in: path
required: true
description: SHA256 digest identifier, prefixed with `sha256:`
schema:
type: string
schemas:
ModelOptions:
type: object
description: Runtime options that control text generation
properties:
# Sampling Options
seed:
type: integer
description: Random seed used for reproducible outputs
temperature:
type: number
format: float
description: Controls randomness in generation (higher = more random)
top_k:
type: integer
description: Limits next token selection to the K most likely
top_p:
type: number
format: float
description: Cumulative probability threshold for nucleus sampling
min_p:
type: number
format: float
description: Minimum probability threshold for token selection
stop:
oneOf:
- type: string
- type: array
items:
type: string
description: Stop sequences that will halt generation
# Runtime Options
num_ctx:
type: integer
description: Context length size (number of tokens)
num_predict:
type: integer
description: Maximum number of tokens to generate
additionalProperties: true
GenerateRequest:
type: object
required: [model]
properties:
model:
type: string
description: Model name
prompt:
type: string
description: Text for the model to generate a response from
suffix:
type: string
description: Used for fill-in-the-middle models, text that appears after the user prompt and before the model response
images:
type: array
items:
type: string
description: Base64-encoded images for models that support image input
format:
description: Structured output format for the model to generate a response from. Supports either the string `"json"` or a JSON schema object.
oneOf:
- type: string
- type: object
system:
description: System prompt for the model to generate a response from
type: string
stream:
description: When true, returns a stream of partial responses
type: boolean
default: true
think:
type: boolean
description: When true, returns separate thinking output in addition to content
raw:
type: boolean
description: When true, returns the raw response from the model without any prompt templating
keep_alive:
oneOf:
- type: string
- type: number
description: Model keep-alive duration (for example `5m` or `0` to unload immediately)
options:
$ref: "#/components/schemas/ModelOptions"
GenerateResponse:
type: object
properties:
model:
type: string
description: Model name
created_at:
type: string
description: ISO 8601 timestamp of response creation
response:
type: string
description: The model's generated text response
thinking:
type: string
description: The model's generated thinking output
done:
type: boolean
description: Indicates whether generation has finished
done_reason:
type: string
description: Reason the generation stopped
total_duration:
type: integer
description: Time spent generating the response in nanoseconds
load_duration:
type: integer
description: Time spent loading the model in nanoseconds
prompt_eval_count:
type: integer
description: Number of input tokens in the prompt
prompt_eval_duration:
type: integer
description: Time spent evaluating the prompt in nanoseconds
eval_count:
type: integer
description: Number of output tokens generated in the response
eval_duration:
type: integer
description: Time spent generating tokens in nanoseconds
GenerateStreamEvent:
type: object
properties:
model:
type: string
description: Model name
created_at:
type: string
description: ISO 8601 timestamp of response creation
response:
type: string
description: The model's generated text response for this chunk
thinking:
type: string
description: The model's generated thinking output for this chunk
done:
type: boolean
description: Indicates whether the stream has finished
done_reason:
type: string
description: Reason streaming finished
total_duration:
type: integer
description: Time spent generating the response in nanoseconds
load_duration:
type: integer
description: Time spent loading the model in nanoseconds
prompt_eval_count:
type: integer
description: Number of input tokens in the prompt
prompt_eval_duration:
type: integer
description: Time spent evaluating the prompt in nanoseconds
eval_count:
type: integer
description: Number of output tokens generated in the response
eval_duration:
type: integer
description: Time spent generating tokens in nanoseconds
ChatMessage:
type: object
required: [role, content]
properties:
role:
type: string
enum: [system, user, assistant, tool]
description: Author of the message.
content:
type: string
description: Message text content
images:
type: array
items:
type: string
description: Base64-encoded image content
description: Optional list of inline images for multimodal models
tool_calls:
type: array
items:
$ref: "#/components/schemas/ToolCall"
description: Tool call requests produced by the model
ToolCall:
type: object
properties:
function:
type: object
required: [name]
properties:
name:
type: string
description: Name of the function to call
description:
type: string
description: What the function does
arguments:
type: object
description: JSON object of arguments to pass to the function
ToolDefinition:
type: object
required: [type, function]
properties:
type:
type: string
enum: [function]
description: Type of tool (always `function`)
function:
type: object
required: [name, parameters]
properties:
name:
type: string
description: Function name exposed to the model
description:
type: string
description: Human-readable description of the function
parameters:
type: object
description: JSON Schema for the function parameters
ChatRequest:
type: object
required: [model, messages]
properties:
model:
type: string
description: Model name
messages:
type: array
description: Chat history as an array of message objects (each with a role and content)
items:
$ref: "#/components/schemas/ChatMessage"
tools:
type: array
description: Optional list of function tools the model may call during the chat
items:
$ref: "#/components/schemas/ToolDefinition"
format:
oneOf:
- type: string
enum: [json]
- type: object
description: Format to return a response in. Can be `json` or a JSON schema
options:
$ref: "#/components/schemas/ModelOptions"
stream:
type: boolean
default: true
think:
oneOf:
- type: boolean
- type: string
enum: [low, medium, high]
description: Controls thinking output. `true` or `false` for most thinking models, or a level (`low`, `medium`, `high`) for models that support it, such as gpt-oss. Thinking is returned separately from content
keep_alive:
oneOf:
- type: string
- type: number
description: Model keep-alive duration (for example `5m` or `0` to unload immediately)
ChatResponse:
type: object
properties:
model:
type: string
description: Model name used to generate this message
created_at:
type: string
format: date-time
description: Timestamp of response creation (ISO 8601)
message:
type: object
properties:
role:
type: string
enum: [assistant]
description: Always `assistant` for model responses
content:
type: string
description: Assistant message text
thinking:
type: string
description: Optional deliberate thinking trace when `think` is enabled
tool_calls:
type: array
items:
$ref: "#/components/schemas/ToolCall"
description: Tool calls requested by the assistant
images:
type: array
items:
type: string
nullable: true
description: Optional base64-encoded images in the response
done:
type: boolean
description: Indicates whether the chat response has finished
done_reason:
type: string
description: Reason the response finished
total_duration:
type: integer
description: Total time spent generating in nanoseconds
load_duration:
type: integer
description: Time spent loading the model in nanoseconds
prompt_eval_count:
type: integer
description: Number of tokens in the prompt
prompt_eval_duration:
type: integer
description: Time spent evaluating the prompt in nanoseconds
eval_count:
type: integer
description: Number of tokens generated in the response
eval_duration:
type: integer
description: Time spent generating tokens in nanoseconds
ChatStreamEvent:
type: object
properties:
model:
type: string
description: Model name used for this stream event
created_at:
type: string
format: date-time
description: When this chunk was created (ISO 8601)
message:
type: object
properties:
role:
type: string
description: Role of the message for this chunk
content:
type: string
description: Partial assistant message text
thinking:
type: string
description: Partial thinking text when `think` is enabled
tool_calls:
type: array
items:
$ref: "#/components/schemas/ToolCall"
description: Partial tool calls, if any
images:
type: array
items:
type: string
nullable: true
description: Partial base64-encoded images, when present
done:
type: boolean
description: True for the final event in the stream
StatusEvent:
type: object
properties:
status:
type: string
description: Human-readable status message
digest:
type: string
description: Content digest associated with the status, if applicable
total:
type: integer
description: Total number of bytes expected for the operation
completed:
type: integer
description: Number of bytes transferred so far
StatusResponse:
type: object
properties:
status:
type: string
description: Current status message
EmbedRequest:
type: object
required: [model, input]
properties:
model:
type: string
description: Model name
input:
oneOf:
- type: string
- type: array
items:
type: string
description: Text or array of texts to generate embeddings for
truncate:
type: boolean
default: true
description: If true, truncate inputs that exceed the context window. If false, returns an error.
dimensions:
type: integer
description: Number of dimensions to generate embeddings for
keep_alive:
oneOf:
- type: string
- type: number
description: Model keep-alive duration (for example `5m` or `0` to unload immediately)
options:
$ref: "#/components/schemas/ModelOptions"
EmbedResponse:
type: object
properties:
model:
type: string
description: Model that produced the embeddings
embeddings:
type: array
items:
type: array
items:
type: number
description: Array of vector embeddings
total_duration:
type: integer
description: Total time spent generating in nanoseconds
load_duration:
type: integer
description: Load time in nanoseconds
prompt_eval_count:
type: integer
description: Number of input tokens processed to generate embeddings
CreateRequest:
type: object
required: [model]
properties:
model:
type: string
description: Name for the model to create
from:
type: string
description: Existing model to create from
template:
type: string
description: Prompt template to use for the model
license:
oneOf:
- type: string
- type: array
items:
type: string
description: License string or list of licenses for the model
system:
type: string
description: System prompt to embed in the model
parameters:
type: object
description: Key-value parameters for the model
messages:
description: Message history to use for the model
type: array
items:
$ref: "#/components/schemas/ChatMessage"
quantize:
type: string
description: Quantization level to apply (e.g. `q4_K_M`, `q8_0`)
stream:
type: boolean
default: true
description: Stream status updates
CopyRequest:
type: object
required: [source, destination]
properties:
source:
type: string
description: Existing model name to copy from
destination:
type: string
description: New model name to create
DeleteRequest:
type: object
required: [model]
properties:
model:
type: string
description: Model name to delete
PullRequest:
type: object
required: [model]
properties:
model:
type: string
description: Name of the model to download
insecure:
type: boolean
description: Allow downloading over insecure connections
stream:
type: boolean
default: true
description: Stream progress updates
PushRequest:
type: object
required: [model]
properties:
model:
type: string
description: Name of the model to publish
insecure:
type: boolean
description: Allow publishing over insecure connections
stream:
type: boolean
default: true
description: Stream progress updates
ShowRequest:
type: object
required: [model]
properties:
model:
type: string
description: Model name to show
verbose:
type: boolean
description: If true, includes large verbose fields in the response.
ShowResponse:
type: object
properties:
parameters:
type: string
description: Model parameter settings serialized as text
license:
type: string
description: The license of the model
details:
type: object
description: High-level model details
template:
type: string
description: The template used by the model to render prompts
capabilities:
type: array
items:
type: string
description: List of supported features
model_info:
type: object
description: Additional model metadata
ModelSummary:
type: object
description: Summary information for a locally available model
properties:
name:
type: string
description: Model name
modified_at:
type: string
description: Last modified timestamp in ISO 8601 format
size:
type: integer
description: Total size of the model on disk in bytes
digest:
type: string
description: SHA256 digest identifier of the model contents
details:
type: object
description: Additional information about the model's format and family
properties:
format:
type: string
description: Model file format (for example `gguf`)
family:
type: string
description: Primary model family (for example `llama`)
families:
type: array
items:
type: string
description: All families the model belongs to, when applicable
parameter_size:
type: string
description: Approximate parameter count label (for example `7B`, `13B`)
quantization_level:
type: string
description: Quantization level used (for example `Q4_0`)
ListResponse:
type: object
properties:
models:
type: array
items:
$ref: "#/components/schemas/ModelSummary"
Ps:
type: object
properties:
model:
type: string
description: Name of the running model
size:
type: integer
description: Size of the model in bytes
digest:
type: string
description: SHA256 digest of the model
details:
type: object
description: Model details such as format and family
expires_at:
type: string
description: Time when the model will be unloaded
size_vram:
type: integer
description: VRAM usage in bytes
PsResponse:
type: object
properties:
models:
type: array
items:
$ref: "#/components/schemas/Ps"
description: Currently running models
WebSearchRequest:
type: object
required: [query]
properties:
query:
type: string
description: Search query string
max_results:
type: integer
minimum: 1
maximum: 10
default: 5
description: Maximum number of results to return
WebSearchResult:
type: object
properties:
title:
type: string
description: Page title of the result
url:
type: string
format: uri
description: Resolved URL for the result
content:
type: string
description: Extracted text content snippet
WebSearchResponse:
type: object
properties:
results:
type: array
items:
$ref: "#/components/schemas/WebSearchResult"
description: Array of matching search results
WebFetchRequest:
type: object
required: [url]
properties:
url:
type: string
format: uri
description: The URL to fetch
WebFetchResponse:
type: object
properties:
title:
type: string
description: Title of the fetched page
content:
type: string
description: Extracted page content
links:
type: array
items:
type: string
format: uri
description: Links found on the page
VersionResponse:
type: object
properties:
version:
type: string
description: Version of Ollama
ErrorResponse:
type: object
properties:
error:
type: string
description: Error message describing what went wrong
paths:
/api/generate:
post:
summary: Generate a response
description: Generates a response for the provided prompt
operationId: generate
x-mint:
href: /api/generate
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?"
}'
- lang: bash
label: Non-streaming
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?",
"stream": false
}'
- lang: bash
label: With options
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?",
"options": {
"temperature": 0.8,
"top_p": 0.9,
"seed": 42
}
}'
- lang: bash
label: Structured outputs
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "What are the populations of the United States and Canada?",
"stream": false,
"format": {
"type": "object",
"properties": {
"countries": {
"type": "array",
"items": {
"type": "object",
"properties": {
"country": {"type": "string"},
"population": {"type": "integer"}
},
"required": ["country", "population"]
}
}
},
"required": ["countries"]
}
}'
- lang: bash
label: With images
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "What is in this picture?",
"images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF
169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
}'
- lang: bash
label: Load model
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3"
}'
- lang: bash
label: Unload model
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"keep_alive": 0
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/GenerateRequest"
example:
model: gemma3
prompt: Why is the sky blue?
responses:
"200":
description: Generation responses
content:
application/json:
schema:
$ref: "#/components/schemas/GenerateResponse"
example:
model: "gemma3"
created_at: "2025-10-17T23:14:07.414671Z"
response: "Hello! How can I help you today?"
done: true
done_reason: "stop"
total_duration: 174560334
load_duration: 101397084
prompt_eval_count: 11
prompt_eval_duration: 13074791
eval_count: 18
eval_duration: 52479709
application/x-ndjson:
schema:
$ref: "#/components/schemas/GenerateStreamEvent"
/api/chat:
post:
summary: Generate a chat message
description: Generate the next chat message in a conversation between a user and an assistant.
operationId: chat
x-mint:
href: /api/chat
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "why is the sky blue?"
}
]
}'
- lang: bash
label: Non-streaming
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "why is the sky blue?"
}
],
"stream": false
}'
- lang: bash
label: Structured outputs
source: |
curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "What are the populations of the United States and Canada?"
}
],
"stream": false,
"format": {
"type": "object",
"properties": {
"countries": {
"type": "array",
"items": {
"type": "object",
"properties": {
"country": {"type": "string"},
"population": {"type": "integer"}
},
"required": ["country", "population"]
}
}
},
"required": ["countries"]
}
}'
- lang: bash
label: Tool calling
source: |
curl http://localhost:11434/api/chat -d '{
"model": "qwen3",
"messages": [
{
"role": "user",
"content": "What is the weather today in Paris?"
}
],
"stream": false,
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The location to get the weather for, e.g. San Francisco, CA"
},
"format": {
"type": "string",
"description": "The format to return the weather in, e.g. 'celsius' or 'fahrenheit'",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location", "format"]
}
}
}
]
}'
- lang: bash
label: Thinking
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gpt-oss",
"messages": [
{
"role": "user",
"content": "What is 1+1?"
}
],
"think": "low"
}'
- lang: bash
label: Images
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "What is in this image?",
"images": [
"iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdu
rUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"
]
}
]
}'
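# Added sketch, not in the original spec: chat requests can also carry an
# "options" object; the temperature shown is illustrative
- lang: bash
label: With options
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "why is the sky blue?"
}
],
"options": {
"temperature": 0.2
}
}'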
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/ChatRequest"
responses:
"200":
description: Chat response
content:
application/json:
schema:
$ref: "#/components/schemas/ChatResponse"
example:
model: "gemma3"
created_at: "2025-10-17T23:14:07.414671Z"
message:
role: "assistant"
content: "Hello! How can I help you today?"
done: true
done_reason: "stop"
total_duration: 174560334
load_duration: 101397084
prompt_eval_count: 11
prompt_eval_duration: 13074791
eval_count: 18
eval_duration: 52479709
application/x-ndjson:
schema:
$ref: "#/components/schemas/ChatStreamEvent"
/api/embed:
post:
summary: Generate embeddings
description: Creates vector embeddings representing the input text
operationId: embed
x-mint:
href: /api/embed
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": "Why is the sky blue?"
}'
- lang: bash
label: Multiple inputs
source: |
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": [
"Why is the sky blue?",
"Why is the grass green?"
]
}'
- lang: bash
label: Truncation
source: |
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": "Generate embeddings for this text",
"truncate": true
}'
- lang: bash
label: Dimensions
source: |
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": "Generate embeddings for this text",
"dimensions": 128
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/EmbedRequest"
example:
model: embeddinggemma
input: "Generate embeddings for this text"
responses:
"200":
description: Vector embeddings for the input text
content:
application/json:
schema:
$ref: "#/components/schemas/EmbedResponse"
example:
model: "embeddinggemma"
embeddings:
- [
0.010071029,
-0.0017594862,
0.05007221,
0.04692972,
0.054916814,
0.008599704,
0.105441414,
-0.025878139,
0.12958129,
0.031952348,
]
total_duration: 14143917
load_duration: 1019500
prompt_eval_count: 8
/api/tags:
get:
summary: List models
description: Fetch a list of models and their details
operationId: list
x-mint:
href: /api/tags
x-codeSamples:
- lang: bash
label: List models
source: |
curl http://localhost:11434/api/tags
responses:
"200":
description: List available models
content:
application/json:
schema:
$ref: "#/components/schemas/ListResponse"
example:
models:
- name: "gemma3"
modified_at: "2025-10-03T23:34:03.409490317-07:00"
size: 3338801804
digest: "a2af6cc3eb7fa8be8504abaf9b04e88f17a119ec3f04a3addf55f92841195f5a"
details:
format: "gguf"
family: "gemma"
families:
- "gemma"
parameter_size: "4.3B"
quantization_level: "Q4_K_M"
/api/ps:
get:
summary: List running models
description: Retrieve a list of models that are currently running
operationId: ps
x-mint:
href: /api/ps
x-codeSamples:
- lang: bash
label: List running models
source: |
curl http://localhost:11434/api/ps
responses:
"200":
description: Models currently loaded into memory
content:
application/json:
schema:
$ref: "#/components/schemas/PsResponse"
example:
models:
- model: "gemma3"
size: 6591830464
digest: "a2af6cc3eb7fa8be8504abaf9b04e88f17a119ec3f04a3addf55f92841195f5a"
details:
parent_model: ""
format: "gguf"
family: "gemma3"
families:
- "gemma3"
parameter_size: "4.3B"
quantization_level: "Q4_K_M"
expires_at: "2025-10-17T16:47:07.93355-07:00"
size_vram: 5333539264
context_length: 4096
/api/show:
post:
summary: Show model details
operationId: show
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/show -d '{
"model": "gemma3"
}'
- lang: bash
label: Verbose
source: |
curl http://localhost:11434/api/show -d '{
"model": "gemma3",
"verbose": true
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/ShowRequest"
example:
model: gemma3
responses:
"200":
description: Model information
content:
application/json:
schema:
$ref: "#/components/schemas/ShowResponse"
example:
parameters: "temperature 0.7\nnum_ctx 2048"
license: "Gemma Terms of Use \n\nLast modified: February 21, 2024..."
capabilities:
- "completion"
- "vision"
modified_at: "2025-08-14T15:49:43.634137516-07:00"
details:
parent_model: ""
format: "gguf"
family: "gemma3"
families:
- "gemma3"
parameter_size: "4.3B"
quantization_level: "Q4_K_M"
model_info:
gemma3.attention.head_count: 8
gemma3.attention.head_count_kv: 4
gemma3.attention.key_length: 256
gemma3.attention.sliding_window: 1024
gemma3.attention.value_length: 256
gemma3.block_count: 34
gemma3.context_length: 131072
gemma3.embedding_length: 2560
gemma3.feed_forward_length: 10240
gemma3.mm.tokens_per_image: 256
gemma3.vision.attention.head_count: 16
gemma3.vision.attention.layer_norm_epsilon: 0.000001
gemma3.vision.block_count: 27
gemma3.vision.embedding_length: 1152
gemma3.vision.feed_forward_length: 4304
gemma3.vision.image_size: 896
gemma3.vision.num_channels: 3
gemma3.vision.patch_size: 14
general.architecture: "gemma3"
general.file_type: 15
general.parameter_count: 4299915632
general.quantization_version: 2
tokenizer.ggml.add_bos_token: true
tokenizer.ggml.add_eos_token: false
tokenizer.ggml.add_padding_token: false
tokenizer.ggml.add_unknown_token: false
tokenizer.ggml.bos_token_id: 2
tokenizer.ggml.eos_token_id: 1
tokenizer.ggml.merges: null
tokenizer.ggml.model: "llama"
tokenizer.ggml.padding_token_id: 0
tokenizer.ggml.pre: "default"
tokenizer.ggml.scores: null
tokenizer.ggml.token_type: null
tokenizer.ggml.tokens: null
tokenizer.ggml.unknown_token_id: 3
/api/create:
post:
summary: Create a model
operationId: create
x-mint:
href: /api/create
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/create -d '{
"from": "gemma3",
"model": "alpaca",
"system": "You are Alpaca, a helpful AI assistant. You only answer with Emojis."
}'
- lang: bash
label: Create from existing
source: |
curl http://localhost:11434/api/create -d '{
"model": "ollama",
"from": "gemma3",
"system": "You are Ollama the llama."
}'
- lang: bash
label: Quantize
source: |
curl http://localhost:11434/api/create -d '{
"model": "llama3.1:8b-instruct-Q4_K_M",
"from": "llama3.1:8b-instruct-fp16",
"quantize": "q4_K_M"
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/CreateRequest"
example:
model: mario
from: gemma3
system: "You are Mario from Super Mario Bros."
responses:
"200":
description: Stream of create status updates
content:
application/json:
schema:
$ref: "#/components/schemas/StatusResponse"
example:
status: "success"
application/x-ndjson:
schema:
$ref: "#/components/schemas/StatusEvent"
example:
status: "success"
/api/copy:
post:
summary: Copy a model
operationId: copy
x-mint:
href: /api/copy
x-codeSamples:
- lang: bash
label: Copy a model to a new name
source: |
curl http://localhost:11434/api/copy -d '{
"source": "gemma3",
"destination": "gemma3-backup"
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/CopyRequest"
example:
source: gemma3
destination: gemma3-backup
/api/pull:
post:
summary: Pull a model
operationId: pull
x-mint:
href: /api/pull
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/pull -d '{
"model": "gemma3"
}'
- lang: bash
label: Non-streaming
source: |
curl http://localhost:11434/api/pull -d '{
"model": "gemma3",
"stream": false
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/PullRequest"
example:
model: gemma3
responses:
"200":
description: Pull status updates.
content:
application/json:
schema:
$ref: "#/components/schemas/StatusResponse"
example:
status: "success"
application/x-ndjson:
schema:
$ref: "#/components/schemas/StatusEvent"
example:
status: "success"
/api/push:
post:
summary: Push a model
operationId: push
x-mint:
href: /api/push
x-codeSamples:
- lang: bash
label: Push model
source: |
curl http://localhost:11434/api/push -d '{
"model": "my-username/my-model"
}'
- lang: bash
label: Non-streaming
source: |
curl http://localhost:11434/api/push -d '{
"model": "my-username/my-model",
"stream": false
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/PushRequest"
example:
model: my-username/my-model
responses:
"200":
description: Push status updates.
content:
application/json:
schema:
$ref: "#/components/schemas/StatusResponse"
example:
status: "success"
application/x-ndjson:
schema:
$ref: "#/components/schemas/StatusEvent"
example:
status: "success"
/api/delete:
delete:
summary: Delete a model
operationId: delete
x-mint:
href: /api/delete
x-codeSamples:
- lang: bash
label: Delete model
source: |
curl -X DELETE http://localhost:11434/api/delete -d '{
"model": "gemma3"
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/DeleteRequest"
example:
model: gemma3
responses:
"200":
description: Deletion status updates.
content:
application/json:
schema:
$ref: "#/components/schemas/StatusResponse"
example:
status: "success"
application/x-ndjson:
schema:
$ref: "#/components/schemas/StatusEvent"
/api/version:
get:
summary: Get version
description: Retrieve the version of Ollama
operationId: version
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/version
responses:
"200":
description: Version information
content:
application/json:
schema:
$ref: "#/components/schemas/VersionResponse"
example:
version: "0.12.6"
---
title: Quickstart
---
This quickstart will walk you through running your first model with Ollama. To get started, download Ollama for macOS, Windows, or Linux.
<a
href="https://ollama.com/download"
target="_blank"
className="inline-block px-6 py-2 bg-black rounded-full dark:bg-neutral-700 text-white font-normal border-none"
>
Download Ollama
</a>
## Run a model
<Tabs>
<Tab title="CLI">
Open a terminal and run the command:
```
ollama run gemma3
```
</Tab>
<Tab title="cURL">
Start by downloading a model:
```
ollama pull gemma3
```
Lastly, chat with the model:
```shell
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [{
"role": "user",
"content": "Hello there!"
}],
"stream": false
}'
```
</Tab>
<Tab title="Python">
Start by downloading a model:
```
ollama pull gemma3
```
Then install Ollama's Python library:
```
pip install ollama
```
Lastly, chat with the model:
```python
from ollama import chat
from ollama import ChatResponse
response: ChatResponse = chat(model='gemma3', messages=[
{
'role': 'user',
'content': 'Why is the sky blue?',
},
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)
```
</Tab>
<Tab title="JavaScript">
Start by downloading a model:
```
ollama pull gemma3
```
Then install the Ollama JavaScript library:
```
npm i ollama
```
Lastly, chat with the model:
```javascript
import ollama from 'ollama'
const response = await ollama.chat({
model: 'gemma3',
messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)
```
</Tab>
</Tabs>
See a full list of available models [here](https://ollama.com/models).
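Once you've pulled a model, you can confirm what's available locally with the CLI:
```
ollama list
```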
body {
font-family: ui-sans-serif, system-ui, sans-serif, Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol,Noto Color Emoji;
}
pre, code, .font-mono {
font-family: ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;
}
.nav-logo {
height: 44px;
}
.eyebrow {
color: #666;
font-weight: 400;
}
---
title: Template
---
Ollama provides a powerful templating engine backed by Go's built-in templating engine to construct prompts for your large language model. This feature is a valuable tool to get the most out of your models.
@@ -6,13 +8,13 @@ Ollama provides a powerful templating engine backed by Go's built-in templating
A basic Go template consists of three main parts:
- **Layout**: The overall structure of the template.
- **Variables**: Placeholders for dynamic data that will be replaced with actual values when the template is rendered.
- **Functions**: Custom functions or logic that can be used to manipulate the template's content.
Here's an example of a simple chat template:
```gotmpl
{{- range .Messages }}
{{ .Role }}: {{ .Content }}
{{- end }}
```
@@ -20,9 +22,9 @@ Here's an example of a simple chat template:
In this example, we have:
- A basic messages structure (layout)
- Three variables: `Messages`, `Role`, and `Content` (variables)
- A custom function (action) that iterates over an array of items (`range .Messages`) and displays each item
## Adding templates to your model
@@ -61,7 +63,7 @@ TEMPLATE """{{- if .System }}<|start_header_id|>system<|end_header_id|>
`Messages[].Role` (string): role which can be one of `system`, `user`, `assistant`, or `tool`
`Messages[].Content` (string): message content
`Messages[].ToolCalls` (list): list of tools the model wants to call
@@ -99,9 +101,9 @@ TEMPLATE """{{- if .System }}<|start_header_id|>system<|end_header_id|>
Keep the following tips and best practices in mind when working with Go templates:
- **Be mindful of dot**: Control flow structures like `range` and `with` change the value of `.`
- **Out-of-scope variables**: Use `$.` to reference variables not currently in scope, starting from the root
- **Whitespace control**: Use `-` to trim leading (`{{-`) and trailing (`-}}`) whitespace
## Examples
@@ -155,13 +157,14 @@ CodeLlama [7B](https://ollama.com/library/codellama:7b-code) and [13B](https://o
<PRE> {{ .Prompt }} <SUF>{{ .Suffix }} <MID>
```
<Note>
CodeLlama 34B and 70B code completion and all instruct and Python fine-tuned models do not support fill-in-middle.
</Note>
#### Codestral
Codestral [22B](https://ollama.com/library/codestral:22b) supports fill-in-middle.
```gotmpl
[SUFFIX]{{ .Suffix }}[PREFIX] {{ .Prompt }}
```
---
title: Troubleshooting
description: How to troubleshoot issues encountered with Ollama
---
Sometimes Ollama may not perform as expected. One of the best ways to figure out what happened is to take a look at the logs. Find the logs on **Mac** by running the command:
@@ -23,9 +26,11 @@ docker logs <container-name>
If manually running `ollama serve` in a terminal, the logs will be on that terminal.
When you run Ollama on **Windows**, there are a few different locations. You can view them in the explorer window by hitting `<cmd>+R` and typing in:
- `explorer %LOCALAPPDATA%\Ollama` to view logs. The most recent server logs will be in `server.log` and older logs will be in `server-#.log`
- `explorer %LOCALAPPDATA%\Programs\Ollama` to browse the binaries (The installer adds this to your user PATH)
- `explorer %HOMEPATH%\.ollama` to browse where models and configuration are stored
- `explorer %TEMP%` where temporary executable files are stored in one or more `ollama*` directories
To enable additional debug logging to help troubleshoot problems, first **quit the running app from the tray menu**, then in a powershell terminal:
@@ -38,14 +43,26 @@ Join the [Discord](https://discord.gg/ollama) for help interpreting the logs.
## LLM libraries
Ollama includes multiple LLM libraries compiled for different GPUs and CPU vector features. Ollama tries to pick the best one based on the capabilities of your system. If this autodetection has problems, or you run into other problems (e.g. crashes in your GPU), you can work around this by forcing a specific LLM library. `cpu_avx2` will perform the best, followed by `cpu_avx`; the slowest but most compatible is `cpu`. Rosetta emulation under macOS will work with the `cpu` library.
In the server log, you will see a message that looks something like this (varies from release to release):
```
Dynamic LLM libraries [rocm_v6 cpu cpu_avx cpu_avx2 cuda_v11 rocm_v5]
```
**Experimental LLM Library Override**
You can set OLLAMA_LLM_LIBRARY to any of the available LLM libraries to bypass autodetection, so for example, if you have a CUDA card, but want to force the CPU LLM library with AVX2 vector support, use:
```shell
OLLAMA_LLM_LIBRARY="cuda_v13" ollama serve OLLAMA_LLM_LIBRARY="cpu_avx2" ollama serve
```
You can see what features your CPU has with the following command:
```shell
cat /proc/cpuinfo | grep flags | head -1
``` ```
## Installing older or pre-release versions on Linux
@@ -56,13 +73,17 @@ If you run into problems on Linux and want to install an older version, or you'd
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.5.7 sh
```
## Linux tmp noexec
If your system is configured with the "noexec" flag where Ollama stores its temporary executable files, you can specify an alternate location by setting `OLLAMA_TMPDIR` to a location writable by the user Ollama runs as, for example `OLLAMA_TMPDIR=/usr/share/ollama/`.
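As a rough sketch (assuming the systemd service and `ollama` user created by the install script; adjust paths and the unit name to your setup):
```shell
# Create a directory the service user can execute from
sudo mkdir -p /usr/share/ollama/tmp
sudo chown ollama:ollama /usr/share/ollama/tmp
# Add an override containing: Environment="OLLAMA_TMPDIR=/usr/share/ollama/tmp"
sudo systemctl edit ollama.service
sudo systemctl restart ollama
```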
## Linux docker
If Ollama initially works on the GPU in a docker container, but then switches to running on CPU after some period of time with errors in the server log reporting GPU discovery failures, this can be resolved by disabling systemd cgroup management in Docker. Edit `/etc/docker/daemon.json` on the host and add `"exec-opts": ["native.cgroupdriver=cgroupfs"]` to the docker configuration.
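For example, on a host with no existing Docker configuration, a minimal sketch looks like this (if `/etc/docker/daemon.json` already exists, merge the key in rather than overwriting the file):
```shell
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF
sudo systemctl restart docker
```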
## NVIDIA GPU Discovery
When Ollama starts up, it takes inventory of the GPUs present in the system to determine compatibility and how much VRAM is available. Sometimes this discovery can fail to find your GPUs. In general, running the latest driver will yield the best results.
### Linux NVIDIA Troubleshooting
@@ -70,28 +91,26 @@ If you are using a container to run Ollama, make sure you've set up the containe
Sometimes Ollama can have difficulty initializing the GPU. When you check the server logs, this can show up as various error codes, such as "3" (not initialized), "46" (device unavailable), "100" (no device), "999" (unknown), or others. The following troubleshooting techniques may help resolve the problem:
- If you are using a container, is the container runtime working? Try `docker run --gpus all ubuntu nvidia-smi` - if this doesn't work, Ollama won't be able to see your NVIDIA GPU.
- Is the uvm driver loaded? `sudo nvidia-modprobe -u` - Is the uvm driver loaded? `sudo nvidia-modprobe -u`
- Try reloading the nvidia_uvm driver - `sudo rmmod nvidia_uvm` then `sudo modprobe nvidia_uvm`
- Try rebooting
- Make sure you're running the latest nvidia drivers
If none of those resolve the problem, gather additional information and file an issue:
- Set `CUDA_ERROR_LEVEL=50` and try again to get more diagnostic logs
- Check dmesg for any errors `sudo dmesg | grep -i nvrm` and `sudo dmesg | grep -i nvidia`
You may get more details for initialization failures by enabling debug prints in the uvm driver. You should only use this temporarily while troubleshooting.
- `sudo rmmod nvidia_uvm` then `sudo modprobe nvidia_uvm uvm_debug_prints=1`
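The extra prints go to the kernel ring buffer, so after reproducing the failure you can review them with, for example:
```shell
sudo dmesg | grep -i uvm
```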
## AMD GPU Discovery
On Linux, AMD GPU access typically requires `video` and/or `render` group membership to access the `/dev/kfd` device. If permissions are not set up correctly, Ollama will detect this and report an error in the server log.
When running in a container, in some Linux distributions and container runtimes, the ollama process may be unable to access the GPU. Use `ls -lnd /dev/kfd /dev/dri /dev/dri/*` on the host system to determine the **numeric** group IDs on your system, and pass additional `--group-add ...` arguments to the container so it can access the required devices. For example, in the following output `crw-rw---- 1 0 44 226, 0 Sep 16 16:55 /dev/dri/card0` the group ID column is `44`
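Putting that together, a hypothetical invocation of the ROCm container image might look like the following (44 and 110 are example group IDs; substitute the numbers from your own `ls -lnd` output):
```shell
docker run -d --device /dev/kfd --device /dev/dri \
  --group-add 44 --group-add 110 \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm
```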
If you are experiencing problems getting Ollama to correctly discover or use your GPU for inference, the following may help isolate the failure.
- `AMD_LOG_LEVEL=3` Enable info log levels in the AMD HIP/ROCm libraries. This can help show more detailed error codes that can help troubleshoot problems
- `OLLAMA_DEBUG=1` During GPU discovery additional information will be reported
- Check dmesg for any errors from amdgpu or kfd drivers `sudo dmesg | grep -i amdgpu` and `sudo dmesg | grep -i kfd`
@@ -103,4 +122,4 @@ If you experience gibberish responses when models load across multiple AMD GPUs
## Windows Terminal Errors
Older versions of Windows 10 (e.g., 21H1) are known to have a bug where the standard terminal program does not display control characters correctly. This can result in long strings of characters like `←[?25h←[?25l` being displayed, sometimes erroring with `The parameter is incorrect`. To resolve this problem, please update to Win 10 22H1 or newer.
---
title: Windows
---
Welcome to Ollama for Windows.
@@ -7,20 +9,20 @@ No more WSL required!
Ollama now runs as a native Windows application, including NVIDIA and AMD Radeon GPU support.
After installing Ollama for Windows, Ollama will run in the background and
the `ollama` command line is available in `cmd`, `powershell` or your favorite
terminal application. As usual the Ollama [API](/api) will be served on
`http://localhost:11434`.
## System Requirements
- Windows 10 22H2 or newer, Home or Pro
- NVIDIA 452.39 or newer Drivers if you have an NVIDIA card
- AMD Radeon Driver https://www.amd.com/en/support if you have a Radeon card
Ollama uses unicode characters for progress indication, which may render as unknown squares in some older terminal fonts in Windows 10. If you see this, try changing your terminal font settings.
## Filesystem Requirements
The Ollama install does not require Administrator, and installs in your home directory by default. You'll need at least 4GB of space for the binary install. Once you've installed Ollama, you'll need additional space for storing the Large Language models, which can be tens to hundreds of GB in size. If your home directory doesn't have enough space, you can change where the binaries are installed, and where the models are stored.
### Changing Install Location
@@ -30,6 +32,20 @@ To install the Ollama application in a location different than your home directo
OllamaSetup.exe /DIR="d:\some\location"
```
### Changing Model Location
To change where Ollama stores the downloaded models instead of using your home directory, set the environment variable `OLLAMA_MODELS` in your user account.
1. Start the Settings (Windows 11) or Control Panel (Windows 10) application and search for _environment variables_.
2. Click on _Edit environment variables for your account_.
3. Edit or create a new variable `OLLAMA_MODELS` for your user account, pointing to where you want the models stored
4. Click OK/Apply to save.
If Ollama is already running, quit the tray application and relaunch it from the Start menu, or from a new terminal started after you saved the environment variables.
## API Access
Here's a quick example showing API access from `powershell`
@@ -40,22 +56,24 @@ Here's a quick example showing API access from `powershell`
## Troubleshooting
Ollama on Windows stores files in a few different locations. You can view them in
the explorer window by hitting `<Ctrl>+R` and typing in:
- `explorer %LOCALAPPDATA%\Ollama` contains logs and downloaded updates
- _app.log_ contains the most recent logs from the GUI application
- _server.log_ contains the most recent server logs
- _upgrade.log_ contains log output for upgrades
- `explorer %LOCALAPPDATA%\Programs\Ollama` contains the binaries (The installer adds this to your user PATH)
- `explorer %HOMEPATH%\.ollama` contains models and configuration
- `explorer %TEMP%` contains temporary executable files in one or more `ollama*` directories
## Uninstall
The Ollama Windows installer registers an Uninstaller application. Under `Add or remove programs` in Windows Settings, you can uninstall Ollama.
<Note>
If you have [changed the OLLAMA_MODELS location](#changing-model-location), the installer will not remove your downloaded models.
</Note>
## Standalone CLI
@@ -66,11 +84,12 @@ help you keep up to date.
If you'd like to install or integrate Ollama as a service, a standalone
`ollama-windows-amd64.zip` zip file is available containing only the Ollama CLI
and GPU library dependencies for Nvidia. If you have an AMD GPU, also download
and extract the additional ROCm package `ollama-windows-amd64-rocm.zip` into the
same directory. This allows for embedding Ollama in existing applications, or
running it as a system service via `ollama serve` with tools such as
[NSSM](https://nssm.cc/).
<Note>
If you are upgrading from a prior version, you should remove the old directories first.
</Note>