Commit 3d99d977 authored by Parth Sareen, committed by GitHub

docs: add docs for docs.ollama.com (#12805)
---
title: VS Code
---
## Install
Install [VS Code](https://code.visualstudio.com/download).
## Usage with Ollama
1. Open the Copilot sidebar, found in the top right of the window
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/vscode-sidebar.png"
alt="VSCode chat Sidebar"
width="75%"
/>
</div>
2. Select the model dropdown > **Manage models**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/vscode-models.png"
alt="VSCode model picker"
width="75%"
/>
</div>
3. Select **Ollama** under the provider dropdown and select the desired models (e.g. `qwen3`, `qwen3-coder:480b-cloud`)
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/vscode-model-options.png"
alt="VSCode model options dropdown"
width="75%"
/>
</div>
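For a model to appear in the picker, it must already be available to the local Ollama instance. A quick way to pull and verify one from a terminal (assuming Ollama is installed and running):

```shell
# Download a model so it shows up in the model picker
ollama pull qwen3

# List the models available locally
ollama ls
```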
---
title: Xcode
---
## Install
Install [Xcode](https://developer.apple.com/xcode/).
## Usage with Ollama
<Note> Ensure Apple Intelligence is set up and that Xcode is version 26.0 or later </Note>
1. Click **Xcode** in the top left corner > **Settings**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/xcode-intelligence-window.png"
alt="Xcode Intelligence window"
width="50%"
/>
</div>
2. Select **Locally Hosted**, enter port **11434** and click **Add**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/xcode-locally-hosted.png"
alt="Xcode settings"
width="50%"
/>
</div>
3. Select the **star icon** in the top left corner and click the **dropdown**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/xcode-chat-icon.png"
alt="Xcode settings"
width="50%"
/>
</div>
4. Click **My Account** and select your desired model
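If Xcode cannot connect, it can help to first confirm the server is reachable on the default port (a quick check, assuming a local Ollama install):

```shell
# Should return the running server's version, e.g. {"version":"0.12.3"}
curl http://localhost:11434/api/version
```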
## Connecting to ollama.com directly
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com
2. Select **Internet Hosted** and enter URL as `https://ollama.com`
3. Enter your **Ollama API Key** and click **Add**
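To sanity-check the key outside of Xcode, a request can be sent to ollama.com directly (a sketch; the model name is illustrative, and `OLLAMA_API_KEY` is assumed to hold the key created above):

```shell
curl https://ollama.com/api/chat \
  -H "Authorization: Bearer $OLLAMA_API_KEY" \
  -d '{
    "model": "qwen3-coder:480b",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": false
  }'
```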
---
title: Zed
---
## Install
Install [Zed](https://zed.dev/download).
## Usage with Ollama
1. In Zed, click the **star icon** in the bottom-right corner, then select **Configure**.
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/zed-settings.png"
alt="Zed star icon in bottom right corner"
width="50%"
/>
</div>
2. Under **LLM Providers**, choose **Ollama**
3. Confirm the **Host URL** is `http://localhost:11434`, then click **Connect**
4. Once connected, select a model under **Ollama**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/zed-ollama-dropdown.png"
alt="Zed star icon in bottom right corner"
width="50%"
/>
</div>
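If **Connect** fails, the server may not be running. On a standard install it can be started manually (the desktop app usually keeps it running in the background):

```shell
ollama serve
```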
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on **ollama.com**
2. In Zed, open the **star icon** → **Configure**
3. Under **LLM Providers**, select **Ollama**
4. Set the **API URL** to `https://ollama.com`
---
title: Linux
---

## Install
@@ -10,15 +12,16 @@ curl -fsSL https://ollama.com/install.sh | sh
## Manual install

<Note>
If you are upgrading from a prior version, you should remove the old libraries
with `sudo rm -rf /usr/lib/ollama` first.
</Note>

Download and extract the package:

```shell
curl -fsSL https://ollama.com/download/ollama-linux-amd64.tgz \
  | sudo tar zx -C /usr
```

Start Ollama:
@@ -35,15 +38,11 @@ ollama -v
### AMD GPU install

If you have an AMD GPU, also download and extract the additional ROCm package:

```shell
curl -fsSL https://ollama.com/download/ollama-linux-amd64-rocm.tgz \
  | sudo tar zx -C /usr
```
### ARM64 install

@@ -51,8 +50,8 @@

Download and extract the ARM64-specific package:

```shell
curl -fsSL https://ollama.com/download/ollama-linux-arm64.tgz \
  | sudo tar zx -C /usr
```
### Adding Ollama as a startup service (recommended)

@@ -113,12 +112,13 @@ sudo systemctl start ollama

```shell
sudo systemctl status ollama
```
<Note>
While AMD has contributed the `amdgpu` driver upstream to the official Linux
kernel source, the version is older and may not support all ROCm features. We
recommend you install the latest driver from
https://www.amd.com/en/support/linux-drivers for best support of your Radeon
GPU.
</Note>

## Customizing
@@ -146,8 +146,8 @@ curl -fsSL https://ollama.com/install.sh | sh
Or by re-downloading Ollama:

```shell
curl -fsSL https://ollama.com/download/ollama-linux-amd64.tgz \
  | sudo tar zx -C /usr
```

## Installing specific versions
@@ -178,6 +178,12 @@ sudo systemctl disable ollama
```shell
sudo rm /etc/systemd/system/ollama.service
```
Remove ollama libraries from your lib directory (either `/usr/local/lib`, `/usr/lib`, or `/lib`):
```shell
# `which ollama` returns a path such as /usr/local/bin/ollama; the `tr`
# translation rewrites "bin" to "lib", targeting the matching lib directory
sudo rm -r $(which ollama | tr 'bin' 'lib')
```
Remove the ollama binary from your bin directory (either `/usr/local/bin`, `/usr/bin`, or `/bin`):

```shell
sudo rm $(which ollama)
```

Remove the downloaded models and Ollama service user and group:

```shell
sudo userdel ollama
sudo groupdel ollama
sudo rm -r /usr/share/ollama
```
<svg width="28" height="28" viewBox="0 0 28 28" fill="none" xmlns="http://www.w3.org/2000/svg">
<path fill-rule="evenodd" clip-rule="evenodd" d="M7.25558 0.114339C7.61134 0.222519 7.93252 0.400698 8.22405 0.636149C8.70993 1.0256 9.12005 1.58303 9.433 2.24356C9.74758 2.90792 9.95182 3.64354 10.0292 4.38171C11.0662 3.9284 12.2171 3.65235 13.4041 3.57227L13.4881 3.56718C14.921 3.47809 16.3375 3.6779 17.5728 4.17044C17.7391 4.2379 17.9022 4.31044 18.062 4.3868C18.1443 3.66263 18.3453 2.94355 18.6549 2.29447C18.9678 1.63266 19.378 1.07651 19.8622 0.685785C20.1328 0.459579 20.4638 0.281532 20.8323 0.163974C21.2556 0.0367035 21.7053 0.0137947 22.1434 0.110521C22.8039 0.255609 23.3704 0.578877 23.8168 1.04851C24.2253 1.47739 24.5316 2.0272 24.7408 2.68646C25.1196 3.87517 25.1855 5.43933 24.9302 7.32549L25.0175 7.37639L25.0603 7.40058C26.3072 8.13366 27.1752 9.17855 27.6348 10.3914C28.3512 12.284 27.9905 14.4068 26.7552 15.5943L26.7255 15.621L26.7288 15.6248C27.4157 16.5946 27.8324 17.6192 27.9214 18.6793L27.9246 18.7175C28.0301 20.0729 27.5952 21.4373 26.5839 22.7774L26.5723 22.7902L26.5888 22.8207C27.3663 24.2932 27.6101 25.7759 27.3103 27.2574L27.3004 27.307C27.254 27.5234 27.0983 27.7168 26.8677 27.8446C26.637 27.9724 26.3501 28.0246 26.07 27.9892C25.9312 27.9724 25.7982 27.9347 25.6783 27.8782C25.5585 27.8217 25.4543 27.7474 25.3717 27.6595C25.289 27.572 25.2296 27.4725 25.1968 27.3668C25.164 27.2614 25.1585 27.152 25.1806 27.0448C25.4556 25.7301 25.197 24.4116 24.39 23.0702C24.3147 22.9456 24.2812 22.8083 24.2927 22.671C24.3043 22.5338 24.3604 22.401 24.4559 22.2849L24.4624 22.2773C25.4573 21.1013 25.869 19.9482 25.7801 18.8155C25.7043 17.8241 25.2448 16.8504 24.4624 15.9226C24.3103 15.7423 24.2561 15.5229 24.3115 15.3119C24.367 15.1009 24.5277 14.9152 24.7589 14.795L24.7737 14.7874C25.174 14.585 25.5429 14.0683 25.729 13.3619C25.9344 12.5267 25.8808 11.6658 25.5726 10.8496C25.2349 9.95872 24.6173 9.21546 23.7526 8.70765C22.7726 8.12984 21.4747 7.85111 19.8326 7.9313C19.6178 7.94209 19.4039 7.90286 19.2183 7.81869C19.0327 7.73451 18.8841 7.60927 18.7916 7.45912C18.2744 6.61277 17.5201 6.00696 16.5796 5.63151C15.6767 5.2833 14.6658 5.13696 13.661 5.20897C11.6104 5.33497 9.80194 6.22841 9.26335 7.35476C9.18715 7.51329 9.05009 7.65005 8.87052 7.74673C8.69096 7.84338 8.47747 7.89535 8.25864 7.89566C6.50122 7.8982 5.14075 8.21638 4.14592 8.79037C3.28615 9.28673 2.6998 9.98036 2.39015 10.8114C2.10995 11.5937 2.07158 12.4159 2.27815 13.2118C2.46262 13.9219 2.82333 14.5099 3.23674 14.8268L3.24992 14.8357C3.5991 15.0992 3.67321 15.5103 3.42945 15.8348C2.83651 16.6264 2.39345 17.8062 2.32098 18.9402C2.23862 20.2358 2.62733 21.3609 3.50521 22.1678L3.53157 22.192C3.66406 22.3113 3.74924 22.4576 3.77701 22.6133C3.80475 22.769 3.77385 22.9276 3.68804 23.0702C2.73933 24.6432 2.4478 25.9363 2.76239 26.9545C2.81892 27.1662 2.76631 27.3867 2.61573 27.5687C2.46516 27.7509 2.22851 27.8805 1.95615 27.9299C1.68379 27.9795 1.39724 27.9446 1.15746 27.8334C0.917644 27.7219 0.743586 27.5427 0.672268 27.3337C0.272031 26.0381 0.543797 24.5541 1.45133 22.8818L1.47438 22.8373L1.46121 22.822C1.01515 22.3129 0.682282 21.7498 0.476267 21.156L0.468032 21.1318C0.218008 20.391 0.119645 19.6244 0.176502 18.86C0.248972 17.7019 0.634385 16.5157 1.20097 15.5637L1.22074 15.5306L1.21744 15.5281C0.734856 14.9961 0.377443 14.3152 0.179796 13.5618L0.17156 13.5312C-0.100765 12.4803 -0.0482896 11.3945 0.324737 10.3622C0.756268 9.19764 1.6045 8.19729 2.85462 7.47439C2.95345 7.41712 3.05721 7.35985 3.16098 7.3064C2.89909 5.40624 2.96498 3.8319 3.34545 2.63556C3.55463 1.97629 3.86263 1.42648 4.2711 0.997598C4.71581 0.529242 5.2824 
0.205974 5.94287 0.0596123C6.38099 -0.0371136 6.83228 -0.0142049 7.25558 0.114339ZM14.0349 11.6832C15.5765 11.6832 16.9996 12.0816 18.0636 12.7714C19.1013 13.4421 19.7189 14.3432 19.7189 15.2405C19.7189 16.3706 19.0502 17.2513 17.8528 17.8139C16.8316 18.2911 15.4629 18.5228 13.8949 18.5228C12.233 18.5228 10.8132 18.1931 9.78876 17.5886C8.77252 16.9904 8.20264 16.1504 8.20264 15.2405C8.20264 14.3407 8.85817 13.437 9.94194 12.7638C11.0422 12.0803 12.4949 11.6832 14.0349 11.6832ZM14.0349 12.8236C12.8922 12.8159 11.7798 13.1075 10.8791 13.6508C10.1198 14.1217 9.68994 14.7136 9.68994 15.2417C9.68994 15.7865 10.0358 16.2968 10.6946 16.685C11.4441 17.1266 12.5459 17.3824 13.8949 17.3824C15.2109 17.3824 16.321 17.1953 17.077 16.8403C17.8396 16.4839 18.23 15.9672 18.23 15.2405C18.23 14.7021 17.8248 14.1077 17.105 13.6419C16.3078 13.1265 15.2274 12.8236 14.0349 12.8236ZM15.1252 14.3636L15.1318 14.3687C15.3295 14.5608 15.2883 14.8396 15.0396 14.9923L14.5587 15.285V15.8526C14.5578 15.979 14.4921 16.0999 14.376 16.1889C14.2599 16.2779 14.1029 16.3277 13.9394 16.3274C13.7758 16.3277 13.6188 16.2779 13.5027 16.1889C13.3866 16.0999 13.3209 15.979 13.3201 15.8526V15.2672L12.8737 14.9897C12.8148 14.9533 12.7659 14.9082 12.7297 14.857C12.6935 14.8059 12.6707 14.7497 12.6628 14.6917C12.6548 14.6337 12.6618 14.5751 12.6833 14.5192C12.7048 14.4633 12.7404 14.4113 12.7881 14.3661C12.8853 14.2747 13.0253 14.2166 13.1776 14.2044C13.3299 14.1923 13.4824 14.2271 13.6017 14.3012L13.9558 14.5201L14.3182 14.2987C14.4371 14.2261 14.588 14.1922 14.7388 14.2043C14.8896 14.2165 15.0282 14.2736 15.1252 14.3636ZM6.82405 11.9212C7.61134 11.9212 8.25205 12.4176 8.25205 13.0298C8.25248 13.3232 8.10217 13.6048 7.83409 13.8127C7.56602 14.0205 7.20215 14.1376 6.8224 14.1383C6.44321 14.1373 6.08 14.0202 5.81235 13.8127C5.54467 13.6051 5.3944 13.324 5.3944 13.031C5.39351 12.7376 5.54342 12.4559 5.81117 12.2478C6.07895 12.0397 6.4443 11.9223 6.82405 11.9212ZM21.1634 11.9212C21.954 11.9212 22.593 12.4176 22.593 13.0298C22.5935 13.3232 22.4432 13.6048 22.1751 13.8127C21.907 14.0205 21.5431 14.1376 21.1634 14.1383C20.7842 14.1373 20.421 14.0202 20.1533 13.8127C19.8857 13.6051 19.7354 13.324 19.7354 13.031C19.7345 12.7376 19.8844 12.4559 20.1522 12.2478C20.4199 12.0397 20.7836 11.9223 21.1634 11.9212ZM6.48969 1.6543L6.48475 1.65684C6.29392 1.72096 6.131 1.82611 6.01534 1.95975L6.0071 1.96738C5.77981 2.20793 5.58216 2.56174 5.43393 3.02628C5.15392 3.90699 5.07816 5.10206 5.22969 6.56695C5.93793 6.40405 6.7104 6.30223 7.54217 6.26532L7.55864 6.26405L7.58993 6.22077C7.6657 6.11641 7.7464 6.01587 7.8337 5.9166C8.03629 4.93534 7.86993 3.76318 7.41699 2.8061C7.19628 2.34283 6.92781 1.97884 6.67087 1.77139C6.61783 1.72827 6.55871 1.68986 6.49463 1.65684L6.48969 1.6543ZM21.5999 1.70521L21.5966 1.70648C21.5325 1.73949 21.4734 1.7779 21.4203 1.82102C21.1634 2.02847 20.8933 2.39374 20.6742 2.85701C20.1966 3.86754 20.0368 5.11734 20.2954 6.13041L20.3909 6.25387L20.4041 6.27168H20.4535C21.2709 6.27186 22.0841 6.36273 22.8681 6.5415C23.0097 5.11097 22.9307 3.94136 22.6573 3.07719C22.509 2.61265 22.3114 2.25883 22.0824 2.01829L22.0759 2.01066C21.9604 1.87654 21.7975 1.77095 21.6064 1.70648L21.5999 1.70521Z" fill="black"/>
</svg>
---
title: Modelfile Reference
---

A Modelfile is the blueprint to create and share customized models using Ollama.
## Table of Contents

@@ -73,26 +72,23 @@

To view the Modelfile of a given model, use the `ollama show --modelfile` command:

```shell
ollama show --modelfile llama3.2
```

```
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this one, replace the FROM line with:
# FROM llama3.2:latest

FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|reserved_special_token"
```
## Instructions

@@ -110,10 +106,13 @@ FROM <model name>:<tag>

```
FROM llama3.2
```
<Card title="Base Models" href="https://github.com/ollama/ollama#model-library">
A list of available base models
</Card>

<Card title="Additional Models" href="https://ollama.com/library">
Additional models can be found on ollama.com
</Card>
#### Build from a Safetensors model

@@ -124,10 +123,11 @@ FROM <model directory>

The model directory should contain the Safetensors weights for a supported architecture.

Currently supported model architectures:

- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2)
- Mistral (including Mistral 1, Mistral 2, and Mixtral)
- Gemma (including Gemma 1 and Gemma 2)
- Phi3

#### Build from a GGUF file

@@ -137,7 +137,6 @@ FROM ./ollama-model.gguf

The GGUF file location should be specified as an absolute path or relative to the `Modelfile` location.

### PARAMETER

The `PARAMETER` instruction defines a parameter that can be set when the model is run.

@@ -148,18 +147,21 @@ PARAMETER <parameter> <parametervalue>

#### Valid Parameters and Values

| Parameter | Description | Value Type | Example Usage |
| -------------- | -------------------------------------------------------------------------------------------------- | ---------- | -------------------- |
| mirostat | Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0) | int | mirostat 0 |
| mirostat_eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | float | mirostat_eta 0.1 |
| mirostat_tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | float | mirostat_tau 5.0 |
| num_ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | int | num_ctx 4096 |
| repeat_last_n | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | int | repeat_last_n 64 |
| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | float | repeat_penalty 1.1 |
| temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | float | temperature 0.7 |
| seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | int | seed 42 |
| stop | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile. | string | stop "AI assistant:" |
| num_predict | Maximum number of tokens to predict when generating text. (Default: -1, infinite generation) | int | num_predict 42 |
| top_k | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40) | int | top_k 40 |
| top_p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | float | top_p 0.9 |
| min_p | Alternative to top_p, and aims to ensure a balance of quality and variety. The parameter *p* represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with *p*=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. (Default: 0.0) | float | min_p 0.05 |
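For illustration, several of these parameters can be combined in one Modelfile (a minimal sketch; `llama3.2` is assumed to already be pulled, and the stop sequence is arbitrary):

```
FROM llama3.2
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
PARAMETER stop "User:"
```

Running `ollama create my-model -f Modelfile` then builds a model that applies these settings by default.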
### TEMPLATE

@@ -201,9 +203,10 @@

```
ADAPTER <path to safetensor adapter>
```
Currently supported Safetensor adapters:

- Llama (including Llama 2, Llama 3, and Llama 3.1)
- Mistral (including Mistral 1, Mistral 2, and Mixtral)
- Gemma (including Gemma 1 and Gemma 2)
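As a sketch, applying a Safetensor adapter to its base model looks like this (the adapter path is hypothetical):

```
FROM llama3.2
ADAPTER ./adapters/my-lora
```

The base model should be the same model the adapter was tuned from; otherwise the behavior can be erratic.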
#### GGUF adapter

@@ -237,7 +240,6 @@ MESSAGE <role> <message>

| user | An example message of what the user could have asked. |
| assistant | An example message of how the model should respond. |
#### Example conversation

```
MESSAGE user Is Ontario in Canada?
MESSAGE assistant yes
```
## Notes

- the **`Modelfile` is not case sensitive**. In the examples, uppercase instructions are used to make it easier to distinguish them from arguments.
...
<svg width="17" height="25" viewBox="0 0 17 25" fill="none" xmlns="http://www.w3.org/2000/svg">
<path fill-rule="evenodd" clip-rule="evenodd" d="M4.40517 0.102088C4.62117 0.198678 4.81617 0.357766 4.99317 0.56799C5.28817 0.915712 5.53718 1.41342 5.72718 2.00318C5.91818 2.59635 6.04218 3.25316 6.08918 3.91224C6.71878 3.5075 7.41754 3.26103 8.13818 3.18953L8.18918 3.18498C9.05919 3.10544 9.91919 3.28384 10.6692 3.72361C10.7702 3.78384 10.8692 3.84861 10.9662 3.91679C11.0162 3.27021 11.1382 2.62817 11.3262 2.04863C11.5162 1.45773 11.7652 0.961166 12.0592 0.612308C12.2235 0.410338 12.4245 0.251368 12.6482 0.146406C12.9052 0.032771 13.1782 0.0123167 13.4442 0.098679C13.8452 0.228223 14.1892 0.516855 14.4602 0.936167C14.7082 1.3191 14.8942 1.81 15.0212 2.39863C15.2512 3.45998 15.2912 4.85655 15.1362 6.54061L15.1892 6.58607L15.2152 6.60766C15.9722 7.26219 16.4992 8.19513 16.7782 9.27807C17.2133 10.9678 16.9943 12.8632 16.2442 13.9235L16.2262 13.9473L16.2282 13.9507C16.6453 14.8166 16.8983 15.7314 16.9523 16.678L16.9543 16.7121C17.0183 17.9223 16.7542 19.1404 16.1402 20.337L16.1332 20.3484L16.1432 20.3756C16.6152 21.6904 16.7632 23.0142 16.5812 24.3369L16.5752 24.3813C16.547 24.5744 16.4525 24.7472 16.3125 24.8612C16.1725 24.9753 15.9983 25.0219 15.8282 24.9903C15.744 24.9753 15.6632 24.9417 15.5904 24.8912C15.5177 24.8408 15.4544 24.7744 15.4042 24.696C15.3541 24.6178 15.318 24.529 15.2981 24.4347C15.2782 24.3406 15.2748 24.2428 15.2882 24.1472C15.4552 22.9733 15.2982 21.7961 14.8082 20.5984C14.7625 20.4871 14.7422 20.3645 14.7492 20.242C14.7562 20.1194 14.7902 20.0009 14.8482 19.8972L14.8522 19.8904C15.4562 18.8404 15.7062 17.8109 15.6522 16.7996C15.6062 15.9143 15.3272 15.045 14.8522 14.2166C14.7598 14.0556 14.7269 13.8597 14.7606 13.6713C14.7943 13.4829 14.8918 13.3171 15.0322 13.2098L15.0412 13.203C15.2842 13.0223 15.5082 12.561 15.6212 11.9303C15.7459 11.1846 15.7133 10.4159 15.5262 9.68716C15.3212 8.89171 14.9462 8.22809 14.4212 7.77468C13.8262 7.25878 13.0382 7.00992 12.0412 7.08151C11.9108 7.09115 11.7809 7.05613 11.6682 6.98097C11.5556 6.90581 11.4653 6.79399 11.4092 6.65993C11.0952 5.90426 10.6372 5.36336 10.0662 5.02814C9.51799 4.71723 8.90425 4.58657 8.29418 4.65087C7.04918 4.76337 5.95118 5.56108 5.62418 6.56675C5.57792 6.70829 5.4947 6.8304 5.38568 6.91672C5.27666 7.00301 5.14703 7.04942 5.01417 7.0497C3.94717 7.05197 3.12117 7.33606 2.51717 7.84855C1.99517 8.29172 1.63916 8.91103 1.45116 9.65307C1.28104 10.3515 1.25774 11.0857 1.38316 11.7962C1.49516 12.4303 1.71416 12.9553 1.96517 13.2382L1.97317 13.2462C2.18517 13.4814 2.23017 13.8485 2.08217 14.1382C1.72216 14.845 1.45316 15.8984 1.40916 16.9109C1.35916 18.0677 1.59516 19.0722 2.12817 19.7927L2.14417 19.8143C2.22461 19.9208 2.27633 20.0514 2.29319 20.1905C2.31003 20.3295 2.29127 20.4711 2.23917 20.5984C1.66316 22.0029 1.48616 23.1574 1.67716 24.0665C1.71148 24.2556 1.67954 24.4524 1.58812 24.6149C1.4967 24.7776 1.35302 24.8933 1.18766 24.9374C1.0223 24.9817 0.848322 24.9506 0.702741 24.8512C0.557141 24.7517 0.451463 24.5917 0.408163 24.4051C0.165162 23.2483 0.330162 21.9233 0.881162 20.4302L0.895162 20.3904L0.887162 20.3768C0.616341 19.9222 0.414243 19.4195 0.289162 18.8893L0.284162 18.8677C0.132362 18.2062 0.0726416 17.5218 0.107162 16.8393C0.151162 15.8052 0.385163 14.7462 0.729162 13.8962L0.741162 13.8666L0.739162 13.8644C0.446163 13.3894 0.229162 12.7814 0.109162 12.1087L0.104162 12.0814C-0.0611788 11.1431 -0.0293187 10.1737 0.197162 9.25194C0.459163 8.21218 0.974162 7.31901 1.73316 6.67356C1.79316 6.62243 1.85616 6.57129 1.91916 6.52357C1.76016 4.827 1.80016 3.42134 2.03117 2.35317C2.15817 1.76455 2.34517 1.27365 
2.59317 0.890713C2.86317 0.472537 3.20717 0.183905 3.60817 0.0532252C3.87417 -0.0331371 4.14817 -0.0126829 4.40517 0.102088ZM8.52118 10.4315C9.45719 10.4315 10.3212 10.7871 10.9672 11.403C11.5972 12.0019 11.9722 12.8064 11.9722 13.6076C11.9722 14.6166 11.5662 15.403 10.8392 15.9052C10.2192 16.3314 9.38819 16.5382 8.43618 16.5382C7.42718 16.5382 6.56518 16.2439 5.94318 15.7041C5.32618 15.17 4.98017 14.42 4.98017 13.6076C4.98017 12.8042 5.37818 11.9973 6.03618 11.3962C6.70418 10.786 7.58618 10.4315 8.52118 10.4315ZM8.52118 11.4496C7.82742 11.4428 7.15204 11.7031 6.60518 12.1883C6.14418 12.6087 5.88318 13.1371 5.88318 13.6087C5.88318 14.095 6.09318 14.5507 6.49318 14.8973C6.94818 15.2916 7.61718 15.52 8.43618 15.52C9.23519 15.52 9.90919 15.353 10.3682 15.0359C10.8312 14.7178 11.0682 14.2564 11.0682 13.6076C11.0682 13.1269 10.8222 12.5962 10.3852 12.1803C9.90119 11.7201 9.24519 11.4496 8.52118 11.4496ZM9.18319 12.8246L9.18719 12.8292C9.30719 13.0007 9.28219 13.2496 9.13119 13.386L8.83919 13.6473V14.1541C8.83865 14.267 8.79877 14.375 8.72829 14.4544C8.6578 14.5339 8.56246 14.5783 8.46318 14.578C8.3639 14.5783 8.26856 14.5339 8.19808 14.4544C8.12758 14.375 8.0877 14.267 8.08718 14.1541V13.6314L7.81618 13.3837C7.78042 13.3511 7.7507 13.3109 7.72872 13.2652C7.70674 13.2195 7.69294 13.1694 7.6881 13.1176C7.68326 13.0658 7.6875 13.0135 7.70056 12.9636C7.71362 12.9137 7.73524 12.8672 7.76418 12.8269C7.8232 12.7452 7.9082 12.6934 8.0007 12.6825C8.09318 12.6717 8.18572 12.7027 8.25818 12.7689L8.47318 12.9644L8.69318 12.7667C8.76538 12.7018 8.85702 12.6716 8.94854 12.6825C9.04009 12.6933 9.12427 12.7443 9.18319 12.8246ZM4.14317 10.644C4.62117 10.644 5.01017 11.0871 5.01017 11.6337C5.01043 11.8957 4.91917 12.1471 4.75641 12.3327C4.59365 12.5183 4.37273 12.6229 4.14217 12.6235C3.91195 12.6226 3.69143 12.518 3.52893 12.3327C3.36641 12.1474 3.27517 11.8965 3.27517 11.6349C3.27463 11.3729 3.36565 11.1213 3.52821 10.9355C3.69079 10.7497 3.91261 10.6449 4.14317 10.644ZM12.8492 10.644C13.3292 10.644 13.7172 11.0871 13.7172 11.6337C13.7175 11.8957 13.6262 12.1471 13.4634 12.3327C13.3007 12.5183 13.0798 12.6229 12.8492 12.6235C12.619 12.6226 12.3985 12.518 12.236 12.3327C12.0734 12.1474 11.9822 11.8965 11.9822 11.6349C11.9817 11.3729 12.0727 11.1213 12.2352 10.9355C12.3978 10.7497 12.6186 10.6449 12.8492 10.644ZM3.94017 1.47705L3.93717 1.47932C3.82131 1.53657 3.72239 1.63046 3.65217 1.74977L3.64717 1.75659C3.50917 1.97136 3.38917 2.28727 3.29917 2.70203C3.12917 3.48839 3.08317 4.55541 3.17517 5.86335C3.60517 5.7179 4.07417 5.62699 4.57917 5.59404L4.58917 5.5929L4.60817 5.55426C4.65417 5.46108 4.70317 5.37131 4.75617 5.28268C4.87917 4.40655 4.77817 3.35998 4.50317 2.50545C4.36917 2.09182 4.20617 1.76682 4.05017 1.5816C4.01797 1.5431 3.98207 1.5088 3.94317 1.47932L3.94017 1.47705ZM13.1142 1.52251L13.1122 1.52364C13.0733 1.55312 13.0374 1.58741 13.0052 1.62591C12.8492 1.81114 12.6852 2.13727 12.5522 2.5509C12.2622 3.45316 12.1652 4.56905 12.3222 5.47358L12.3802 5.58381L12.3882 5.59972H12.4182C12.9145 5.59988 13.4082 5.68101 13.8842 5.84062C13.9702 4.56337 13.9222 3.51907 13.7562 2.74749C13.6662 2.33272 13.5462 2.01682 13.4072 1.80205L13.4032 1.79523C13.3331 1.67548 13.2342 1.58121 13.1182 1.52364L13.1142 1.52251Z" fill="black"/>
</svg>
openapi: 3.1.0
info:
title: Ollama API
version: 0.1.0
description: |
OpenAPI specification for the Ollama HTTP API
servers:
- url: http://localhost:11434
description: Local Ollama instance
components:
securitySchemes:
bearerAuth:
type: http
scheme: bearer
bearerFormat: API Key
parameters:
DigestParam:
name: digest
in: path
required: true
description: SHA256 digest identifier, prefixed with `sha256:`
schema:
type: string
schemas:
ModelOptions:
type: object
description: Runtime options that control text generation
properties:
# Sampling Options
seed:
type: integer
description: Random seed used for reproducible outputs
temperature:
type: number
format: float
description: Controls randomness in generation (higher = more random)
top_k:
type: integer
description: Limits next token selection to the K most likely
top_p:
type: number
format: float
description: Cumulative probability threshold for nucleus sampling
min_p:
type: number
format: float
description: Minimum probability threshold for token selection
stop:
oneOf:
- type: string
- type: array
items:
type: string
description: Stop sequences that will halt generation
# Runtime Options
num_ctx:
type: integer
description: Context length size (number of tokens)
num_predict:
type: integer
description: Maximum number of tokens to generate
additionalProperties: true
GenerateRequest:
type: object
required: [model]
properties:
model:
type: string
description: Model name
prompt:
type: string
description: Text for the model to generate a response from
suffix:
type: string
description: Used for fill-in-the-middle models, text that appears after the user prompt and before the model response
images:
type: array
items:
type: string
description: Base64-encoded images for models that support image input
format:
description: Structured output format for the model to generate a response from. Supports either the string `"json"` or a JSON schema object.
oneOf:
- type: string
- type: object
system:
description: System prompt for the model to generate a response from
type: string
stream:
description: When true, returns a stream of partial responses
type: boolean
default: true
think:
type: boolean
description: When true, returns separate thinking output in addition to content
raw:
type: boolean
description: When true, returns the raw response from the model without any prompt templating
keep_alive:
oneOf:
- type: string
- type: number
description: Model keep-alive duration (for example `5m` or `0` to unload immediately)
options:
$ref: "#/components/schemas/ModelOptions"
GenerateResponse:
type: object
properties:
model:
type: string
description: Model name
created_at:
type: string
description: ISO 8601 timestamp of response creation
response:
type: string
description: The model's generated text response
thinking:
type: string
description: The model's generated thinking output
done:
type: boolean
description: Indicates whether generation has finished
done_reason:
type: string
description: Reason the generation stopped
total_duration:
type: integer
description: Time spent generating the response in nanoseconds
load_duration:
type: integer
description: Time spent loading the model in nanoseconds
prompt_eval_count:
type: integer
description: Number of input tokens in the prompt
prompt_eval_duration:
type: integer
description: Time spent evaluating the prompt in nanoseconds
eval_count:
type: integer
description: Number of output tokens generated in the response
eval_duration:
type: integer
description: Time spent generating tokens in nanoseconds
GenerateStreamEvent:
type: object
properties:
model:
type: string
description: Model name
created_at:
type: string
description: ISO 8601 timestamp of response creation
response:
type: string
description: The model's generated text response for this chunk
thinking:
type: string
description: The model's generated thinking output for this chunk
done:
type: boolean
description: Indicates whether the stream has finished
done_reason:
type: string
description: Reason streaming finished
total_duration:
type: integer
description: Time spent generating the response in nanoseconds
load_duration:
type: integer
description: Time spent loading the model in nanoseconds
prompt_eval_count:
type: integer
description: Number of input tokens in the prompt
prompt_eval_duration:
type: integer
description: Time spent evaluating the prompt in nanoseconds
eval_count:
type: integer
description: Number of output tokens generated in the response
eval_duration:
type: integer
description: Time spent generating tokens in nanoseconds
ChatMessage:
type: object
required: [role, content]
properties:
role:
type: string
enum: [system, user, assistant, tool]
description: Author of the message.
content:
type: string
description: Message text content
images:
type: array
items:
type: string
description: Base64-encoded image content
description: Optional list of inline images for multimodal models
tool_calls:
type: array
items:
$ref: "#/components/schemas/ToolCall"
description: Tool call requests produced by the model
ToolCall:
type: object
properties:
function:
type: object
required: [name]
properties:
name:
type: string
description: Name of the function to call
description:
type: string
description: What the function does
arguments:
type: object
description: JSON object of arguments to pass to the function
ToolDefinition:
type: object
required: [type, function]
properties:
type:
type: string
enum: [function]
description: Type of tool (always `function`)
function:
type: object
required: [name, parameters]
properties:
name:
type: string
description: Function name exposed to the model
description:
type: string
description: Human-readable description of the function
parameters:
type: object
description: JSON Schema for the function parameters
ChatRequest:
type: object
required: [model, messages]
properties:
model:
type: string
description: Model name
messages:
type: array
description: Chat history as an array of message objects (each with a role and content)
items:
$ref: "#/components/schemas/ChatMessage"
tools:
type: array
description: Optional list of function tools the model may call during the chat
items:
$ref: "#/components/schemas/ToolDefinition"
format:
oneOf:
- type: string
enum: [json]
- type: object
description: Format to return a response in. Can be `json` or a JSON schema
options:
$ref: "#/components/schemas/ModelOptions"
stream:
type: boolean
default: true
think:
oneOf:
- type: boolean
- type: string
enum: [low, medium, high]
description: Controls thinking output. `true` or `false` for most thinking models, or a level (`low`, `medium`, `high`) for models that support it, such as gpt-oss. Thinking is returned separately from content
keep_alive:
oneOf:
- type: string
- type: number
description: Model keep-alive duration (for example `5m` or `0` to unload immediately)
ChatResponse:
type: object
properties:
model:
type: string
description: Model name used to generate this message
created_at:
type: string
format: date-time
description: Timestamp of response creation (ISO 8601)
message:
type: object
properties:
role:
type: string
enum: [assistant]
description: Always `assistant` for model responses
content:
type: string
description: Assistant message text
thinking:
type: string
description: Optional deliberate thinking trace when `think` is enabled
tool_calls:
type: array
items:
$ref: "#/components/schemas/ToolCall"
description: Tool calls requested by the assistant
images:
type: array
items:
type: string
nullable: true
description: Optional base64-encoded images in the response
done:
type: boolean
description: Indicates whether the chat response has finished
done_reason:
type: string
description: Reason the response finished
total_duration:
type: integer
description: Total time spent generating in nanoseconds
load_duration:
type: integer
description: Time spent loading the model in nanoseconds
prompt_eval_count:
type: integer
description: Number of tokens in the prompt
prompt_eval_duration:
type: integer
description: Time spent evaluating the prompt in nanoseconds
eval_count:
type: integer
description: Number of tokens generated in the response
eval_duration:
type: integer
description: Time spent generating tokens in nanoseconds
ChatStreamEvent:
type: object
properties:
model:
type: string
description: Model name used for this stream event
created_at:
type: string
format: date-time
description: When this chunk was created (ISO 8601)
message:
type: object
properties:
role:
type: string
description: Role of the message for this chunk
content:
type: string
description: Partial assistant message text
thinking:
type: string
description: Partial thinking text when `think` is enabled
tool_calls:
type: array
items:
$ref: "#/components/schemas/ToolCall"
description: Partial tool calls, if any
images:
type: array
items:
type: string
nullable: true
description: Partial base64-encoded images, when present
done:
type: boolean
description: True for the final event in the stream
StatusEvent:
type: object
properties:
status:
type: string
description: Human-readable status message
digest:
type: string
description: Content digest associated with the status, if applicable
total:
type: integer
description: Total number of bytes expected for the operation
completed:
type: integer
description: Number of bytes transferred so far
StatusResponse:
type: object
properties:
status:
type: string
description: Current status message
EmbedRequest:
type: object
required: [model, input]
properties:
model:
type: string
description: Model name
input:
oneOf:
- type: string
- type: array
items:
type: string
description: Text or array of texts to generate embeddings for
truncate:
type: boolean
default: true
description: If true, truncate inputs that exceed the context window. If false, returns an error.
dimensions:
type: integer
description: Number of dimensions to generate embeddings for
keep_alive:
oneOf:
- type: string
- type: number
description: Model keep-alive duration (for example `5m` or `0` to unload immediately)
options:
$ref: "#/components/schemas/ModelOptions"
EmbedResponse:
type: object
properties:
model:
type: string
description: Model that produced the embeddings
embeddings:
type: array
items:
type: array
items:
type: number
description: Array of vector embeddings
total_duration:
type: integer
description: Total time spent generating in nanoseconds
load_duration:
type: integer
description: Load time in nanoseconds
prompt_eval_count:
type: integer
description: Number of input tokens processed to generate embeddings
CreateRequest:
type: object
required: [model]
properties:
model:
type: string
description: Name for the model to create
from:
type: string
description: Existing model to create from
template:
type: string
description: Prompt template to use for the model
license:
oneOf:
- type: string
- type: array
items:
type: string
description: License string or list of licenses for the model
system:
type: string
description: System prompt to embed in the model
parameters:
type: object
description: Key-value parameters for the model
messages:
description: Message history to use for the model
type: array
items:
$ref: "#/components/schemas/ChatMessage"
quantize:
type: string
description: Quantization level to apply (e.g. `q4_K_M`, `q8_0`)
stream:
type: boolean
default: true
description: Stream status updates
CopyRequest:
type: object
required: [source, destination]
properties:
source:
type: string
description: Existing model name to copy from
destination:
type: string
description: New model name to create
DeleteRequest:
type: object
required: [model]
properties:
model:
type: string
description: Model name to delete
PullRequest:
type: object
required: [model]
properties:
model:
type: string
description: Name of the model to download
insecure:
type: boolean
description: Allow downloading over insecure connections
stream:
type: boolean
default: true
description: Stream progress updates
PushRequest:
type: object
required: [model]
properties:
model:
type: string
description: Name of the model to publish
insecure:
type: boolean
description: Allow publishing over insecure connections
stream:
type: boolean
default: true
description: Stream progress updates
ShowRequest:
type: object
required: [model]
properties:
model:
type: string
description: Model name to show
verbose:
type: boolean
description: If true, includes large verbose fields in the response.
ShowResponse:
type: object
properties:
parameters:
type: string
description: Model parameter settings serialized as text
license:
type: string
description: The license of the model
details:
type: object
description: High-level model details
template:
type: string
description: The template used by the model to render prompts
capabilities:
type: array
items:
type: string
description: List of supported features
model_info:
type: object
description: Additional model metadata
ModelSummary:
type: object
description: Summary information for a locally available model
properties:
name:
type: string
description: Model name
modified_at:
type: string
description: Last modified timestamp in ISO 8601 format
size:
type: integer
description: Total size of the model on disk in bytes
digest:
type: string
description: SHA256 digest identifier of the model contents
details:
type: object
description: Additional information about the model's format and family
properties:
format:
type: string
description: Model file format (for example `gguf`)
family:
type: string
description: Primary model family (for example `llama`)
families:
type: array
items:
type: string
description: All families the model belongs to, when applicable
parameter_size:
type: string
description: Approximate parameter count label (for example `7B`, `13B`)
quantization_level:
type: string
description: Quantization level used (for example `Q4_0`)
ListResponse:
type: object
properties:
models:
type: array
items:
$ref: "#/components/schemas/ModelSummary"
Ps:
type: object
properties:
model:
type: string
description: Name of the running model
size:
type: integer
description: Size of the model in bytes
digest:
type: string
description: SHA256 digest of the model
details:
type: object
description: Model details such as format and family
expires_at:
type: string
description: Time when the model will be unloaded
size_vram:
type: integer
description: VRAM usage in bytes
PsResponse:
type: object
properties:
models:
type: array
items:
$ref: "#/components/schemas/Ps"
description: Currently running models
WebSearchRequest:
type: object
required: [query]
properties:
query:
type: string
description: Search query string
max_results:
type: integer
minimum: 1
maximum: 10
default: 5
description: Maximum number of results to return
WebSearchResult:
type: object
properties:
title:
type: string
description: Page title of the result
url:
type: string
format: uri
description: Resolved URL for the result
content:
type: string
description: Extracted text content snippet
WebSearchResponse:
type: object
properties:
results:
type: array
items:
$ref: "#/components/schemas/WebSearchResult"
description: Array of matching search results
WebFetchRequest:
type: object
required: [url]
properties:
url:
type: string
format: uri
description: The URL to fetch
WebFetchResponse:
type: object
properties:
title:
type: string
description: Title of the fetched page
content:
type: string
description: Extracted page content
links:
type: array
items:
type: string
format: uri
description: Links found on the page
VersionResponse:
type: object
properties:
version:
type: string
description: Version of Ollama
ErrorResponse:
type: object
properties:
error:
type: string
description: Error message describing what went wrong
paths:
/api/generate:
post:
summary: Generate a response
description: Generates a response for the provided prompt
operationId: generate
x-mint:
href: /api/generate
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?"
}'
- lang: bash
label: Non-streaming
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?",
"stream": false
}'
- lang: bash
label: With options
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?",
"options": {
"temperature": 0.8,
"top_p": 0.9,
"seed": 42
}
}'
- lang: bash
label: Structured outputs
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "What are the populations of the United States and Canada?",
"stream": false,
"format": {
"type": "object",
"properties": {
"countries": {
"type": "array",
"items": {
"type": "object",
"properties": {
"country": {"type": "string"},
"population": {"type": "integer"}
},
"required": ["country", "population"]
}
}
},
"required": ["countries"]
}
}'
- lang: bash
label: With images
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "What is in this picture?",
"images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF
169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
}'
- lang: bash
label: Load model
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3"
}'
- lang: bash
label: Unload model
source: |
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"keep_alive": 0
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/GenerateRequest"
example:
model: gemma3
prompt: Why is the sky blue?
responses:
"200":
description: Generation responses
content:
application/json:
schema:
$ref: "#/components/schemas/GenerateResponse"
example:
model: "gemma3"
created_at: "2025-10-17T23:14:07.414671Z"
response: "Hello! How can I help you today?"
done: true
done_reason: "stop"
total_duration: 174560334
load_duration: 101397084
prompt_eval_count: 11
prompt_eval_duration: 13074791
eval_count: 18
eval_duration: 52479709
application/x-ndjson:
schema:
$ref: "#/components/schemas/GenerateStreamEvent"
/api/chat:
post:
summary: Generate a chat message
description: Generate the next chat message in a conversation between a user and an assistant.
operationId: chat
x-mint:
href: /api/chat
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "why is the sky blue?"
}
]
}'
- lang: bash
label: Non-streaming
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "why is the sky blue?"
}
],
"stream": false
}'
- lang: bash
label: Structured outputs
source: |
curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "What are the populations of the United States and Canada?"
}
],
"stream": false,
"format": {
"type": "object",
"properties": {
"countries": {
"type": "array",
"items": {
"type": "object",
"properties": {
"country": {"type": "string"},
"population": {"type": "integer"}
},
"required": ["country", "population"]
}
}
},
"required": ["countries"]
}
}'
- lang: bash
label: Tool calling
source: |
curl http://localhost:11434/api/chat -d '{
"model": "qwen3",
"messages": [
{
"role": "user",
"content": "What is the weather today in Paris?"
}
],
"stream": false,
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The location to get the weather for, e.g. San Francisco, CA"
},
"format": {
"type": "string",
"description": "The format to return the weather in, e.g. 'celsius' or 'fahrenheit'",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location", "format"]
}
}
}
]
}'
- lang: bash
label: Thinking
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gpt-oss",
"messages": [
{
"role": "user",
"content": "What is 1+1?"
}
],
"think": "low"
}'
- lang: bash
label: Images
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "What is in this image?",
"images": [
"iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdu
rUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"
]
}
]
}'
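# Added sketch, not in the original spec: chat requests can also carry an
# "options" object; the temperature shown is illustrative
- lang: bash
label: With options
source: |
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{
"role": "user",
"content": "why is the sky blue?"
}
],
"options": {
"temperature": 0.2
}
}'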
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/ChatRequest"
responses:
"200":
description: Chat response
content:
application/json:
schema:
$ref: "#/components/schemas/ChatResponse"
example:
model: "gemma3"
created_at: "2025-10-17T23:14:07.414671Z"
message:
role: "assistant"
content: "Hello! How can I help you today?"
done: true
done_reason: "stop"
total_duration: 174560334
load_duration: 101397084
prompt_eval_count: 11
prompt_eval_duration: 13074791
eval_count: 18
eval_duration: 52479709
application/x-ndjson:
schema:
$ref: "#/components/schemas/ChatStreamEvent"
/api/embed:
post:
summary: Generate embeddings
description: Creates vector embeddings representing the input text
operationId: embed
x-mint:
href: /api/embed
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": "Why is the sky blue?"
}'
- lang: bash
label: Multiple inputs
source: |
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": [
"Why is the sky blue?",
"Why is the grass green?"
]
}'
- lang: bash
label: Truncation
source: |
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": "Generate embeddings for this text",
"truncate": true
}'
- lang: bash
label: Dimensions
source: |
curl http://localhost:11434/api/embed -d '{
"model": "embeddinggemma",
"input": "Generate embeddings for this text",
"dimensions": 128
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/EmbedRequest"
example:
model: embeddinggemma
input: "Generate embeddings for this text"
responses:
"200":
description: Vector embeddings for the input text
content:
application/json:
schema:
$ref: "#/components/schemas/EmbedResponse"
example:
model: "embeddinggemma"
embeddings:
- [
0.010071029,
-0.0017594862,
0.05007221,
0.04692972,
0.054916814,
0.008599704,
0.105441414,
-0.025878139,
0.12958129,
0.031952348,
]
total_duration: 14143917
load_duration: 1019500
prompt_eval_count: 8
/api/tags:
get:
summary: List models
description: Fetch a list of models and their details
operationId: list
x-mint:
href: /api/tags
x-codeSamples:
- lang: bash
label: List models
source: |
curl http://localhost:11434/api/tags
responses:
"200":
description: List available models
content:
application/json:
schema:
$ref: "#/components/schemas/ListResponse"
example:
models:
- name: "gemma3"
modified_at: "2025-10-03T23:34:03.409490317-07:00"
size: 3338801804
digest: "a2af6cc3eb7fa8be8504abaf9b04e88f17a119ec3f04a3addf55f92841195f5a"
details:
format: "gguf"
family: "gemma"
families:
- "gemma"
parameter_size: "4.3B"
quantization_level: "Q4_K_M"
/api/ps:
get:
summary: List running models
description: Retrieve a list of models that are currently running
operationId: ps
x-mint:
href: /api/ps
x-codeSamples:
- lang: bash
label: List running models
source: |
curl http://localhost:11434/api/ps
responses:
"200":
description: Models currently loaded into memory
content:
application/json:
schema:
$ref: "#/components/schemas/PsResponse"
example:
models:
- model: "gemma3"
size: 6591830464
digest: "a2af6cc3eb7fa8be8504abaf9b04e88f17a119ec3f04a3addf55f92841195f5a"
details:
parent_model: ""
format: "gguf"
family: "gemma3"
families:
- "gemma3"
parameter_size: "4.3B"
quantization_level: "Q4_K_M"
expires_at: "2025-10-17T16:47:07.93355-07:00"
size_vram: 5333539264
context_length: 4096
/api/show:
post:
summary: Show model details
operationId: show
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/show -d '{
"model": "gemma3"
}'
- lang: bash
label: Verbose
source: |
curl http://localhost:11434/api/show -d '{
"model": "gemma3",
"verbose": true
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/ShowRequest"
example:
model: gemma3
responses:
"200":
description: Model information
content:
application/json:
schema:
$ref: "#/components/schemas/ShowResponse"
example:
parameters: "temperature 0.7\nnum_ctx 2048"
license: "Gemma Terms of Use \n\nLast modified: February 21, 2024..."
capabilities:
- "completion"
- "vision"
modified_at: "2025-08-14T15:49:43.634137516-07:00"
details:
parent_model: ""
format: "gguf"
family: "gemma3"
families:
- "gemma3"
parameter_size: "4.3B"
quantization_level: "Q4_K_M"
model_info:
gemma3.attention.head_count: 8
gemma3.attention.head_count_kv: 4
gemma3.attention.key_length: 256
gemma3.attention.sliding_window: 1024
gemma3.attention.value_length: 256
gemma3.block_count: 34
gemma3.context_length: 131072
gemma3.embedding_length: 2560
gemma3.feed_forward_length: 10240
gemma3.mm.tokens_per_image: 256
gemma3.vision.attention.head_count: 16
gemma3.vision.attention.layer_norm_epsilon: 0.000001
gemma3.vision.block_count: 27
gemma3.vision.embedding_length: 1152
gemma3.vision.feed_forward_length: 4304
gemma3.vision.image_size: 896
gemma3.vision.num_channels: 3
gemma3.vision.patch_size: 14
general.architecture: "gemma3"
general.file_type: 15
general.parameter_count: 4299915632
general.quantization_version: 2
tokenizer.ggml.add_bos_token: true
tokenizer.ggml.add_eos_token: false
tokenizer.ggml.add_padding_token: false
tokenizer.ggml.add_unknown_token: false
tokenizer.ggml.bos_token_id: 2
tokenizer.ggml.eos_token_id: 1
tokenizer.ggml.merges: null
tokenizer.ggml.model: "llama"
tokenizer.ggml.padding_token_id: 0
tokenizer.ggml.pre: "default"
tokenizer.ggml.scores: null
tokenizer.ggml.token_type: null
tokenizer.ggml.tokens: null
tokenizer.ggml.unknown_token_id: 3
/api/create:
post:
summary: Create a model
operationId: create
x-mint:
href: /api/create
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/create -d '{
"from": "gemma3",
"model": "alpaca",
"system": "You are Alpaca, a helpful AI assistant. You only answer with Emojis."
}'
- lang: bash
label: Create from existing
source: |
curl http://localhost:11434/api/create -d '{
"model": "ollama",
"from": "gemma3",
"system": "You are Ollama the llama."
}'
- lang: bash
label: Quantize
source: |
curl http://localhost:11434/api/create -d '{
"model": "llama3.1:8b-instruct-Q4_K_M",
"from": "llama3.1:8b-instruct-fp16",
"quantize": "q4_K_M"
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/CreateRequest"
example:
model: mario
from: gemma3
system: "You are Mario from Super Mario Bros."
responses:
"200":
description: Stream of create status updates
content:
application/json:
schema:
$ref: "#/components/schemas/StatusResponse"
example:
status: "success"
application/x-ndjson:
schema:
$ref: "#/components/schemas/StatusEvent"
example:
status: "success"
/api/copy:
post:
summary: Copy a model
operationId: copy
x-mint:
href: /api/copy
x-codeSamples:
- lang: bash
label: Copy a model to a new name
source: |
curl http://localhost:11434/api/copy -d '{
"source": "gemma3",
"destination": "gemma3-backup"
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/CopyRequest"
example:
source: gemma3
destination: gemma3-backup
/api/pull:
post:
summary: Pull a model
operationId: pull
x-mint:
href: /api/pull
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/pull -d '{
"model": "gemma3"
}'
- lang: bash
label: Non-streaming
source: |
curl http://localhost:11434/api/pull -d '{
"model": "gemma3",
"stream": false
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/PullRequest"
example:
model: gemma3
responses:
"200":
description: Pull status updates.
content:
application/json:
schema:
$ref: "#/components/schemas/StatusResponse"
example:
status: "success"
application/x-ndjson:
schema:
$ref: "#/components/schemas/StatusEvent"
example:
status: "success"
/api/push:
post:
summary: Push a model
operationId: push
x-mint:
href: /api/push
x-codeSamples:
- lang: bash
label: Push model
source: |
curl http://localhost:11434/api/push -d '{
"model": "my-username/my-model"
}'
- lang: bash
label: Non-streaming
source: |
curl http://localhost:11434/api/push -d '{
"model": "my-username/my-model",
"stream": false
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/PushRequest"
example:
model: my-username/my-model
responses:
"200":
description: Push status updates.
content:
application/json:
schema:
$ref: "#/components/schemas/StatusResponse"
example:
status: "success"
application/x-ndjson:
schema:
$ref: "#/components/schemas/StatusEvent"
example:
status: "success"
/api/delete:
delete:
summary: Delete a model
operationId: delete
x-mint:
href: /api/delete
x-codeSamples:
- lang: bash
label: Delete model
source: |
curl -X DELETE http://localhost:11434/api/delete -d '{
"model": "gemma3"
}'
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/DeleteRequest"
example:
model: gemma3
responses:
"200":
description: Deletion status updates.
content:
application/json:
schema:
$ref: "#/components/schemas/StatusResponse"
example:
status: "success"
application/x-ndjson:
schema:
$ref: "#/components/schemas/StatusEvent"
/api/version:
get:
summary: Get version
description: Retrieve the version of Ollama
operationId: version
x-codeSamples:
- lang: bash
label: Default
source: |
curl http://localhost:11434/api/version
responses:
"200":
description: Version information
content:
application/json:
schema:
$ref: "#/components/schemas/VersionResponse"
example:
version: "0.12.6"
---
title: Quickstart
---
This quickstart will walk you through running your first model with Ollama. To get started, download Ollama for macOS, Windows, or Linux.
<a
href="https://ollama.com/download"
target="_blank"
className="inline-block px-6 py-2 bg-black rounded-full dark:bg-neutral-700 text-white font-normal border-none"
>
Download Ollama
</a>
## Run a model
<Tabs>
<Tab title="CLI">
Open a terminal and run the command:
```
ollama run gemma3
```
</Tab>
<Tab title="cURL">
Start by downloading a model:
```
ollama pull gemma3
```
Lastly, chat with the model:
```shell
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [{
"role": "user",
"content": "Hello there!"
}],
"stream": false
}'
```
</Tab>
<Tab title="Python">
Start by downloading a model:
```
ollama pull gemma3
```
Then install Ollama's Python library:
```
pip install ollama
```
Lastly, chat with the model:
```python
from ollama import chat
from ollama import ChatResponse
response: ChatResponse = chat(model='gemma3', messages=[
{
'role': 'user',
'content': 'Why is the sky blue?',
},
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)
```
</Tab>
<Tab title="JavaScript">
Start by downloading a model:
```
ollama pull gemma3
```
Then install the Ollama JavaScript library:
```
npm i ollama
```
Lastly, chat with the model:
```javascript
import ollama from 'ollama'
const response = await ollama.chat({
model: 'gemma3',
messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)
```
</Tab>
</Tabs>
See a full list of available models [here](https://ollama.com/models).
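Once you've pulled a model, you can confirm what's available locally with the CLI:
```
ollama list
```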
body {
font-family: ui-sans-serif, system-ui, sans-serif, Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol,Noto Color Emoji;
}
pre, code, .font-mono {
font-family: ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;
}
.nav-logo {
height: 44px;
}
.eyebrow {
color: #666;
font-weight: 400;
}
---
title: Template
---
Ollama provides a powerful templating engine backed by Go's built-in templating engine to construct prompts for your large language model. This feature is a valuable tool to get the most out of your models.
@@ -6,13 +8,13 @@ Ollama provides a powerful templating engine backed by Go's built-in templating
A basic Go template consists of three main parts:
- **Layout**: The overall structure of the template.
- **Variables**: Placeholders for dynamic data that will be replaced with actual values when the template is rendered.
- **Functions**: Custom functions or logic that can be used to manipulate the template's content.
Here's an example of a simple chat template:
```gotmpl
{{- range .Messages }}
{{ .Role }}: {{ .Content }}
{{- end }}
```
@@ -20,9 +22,9 @@ Here's an example of a simple chat template:
In this example, we have:
- A basic messages structure (layout)
- Three variables: `Messages`, `Role`, and `Content` (variables)
- A custom function (action) that iterates over an array of items (`range .Messages`) and displays each item
## Adding templates to your model
@@ -61,7 +63,7 @@ TEMPLATE """{{- if .System }}<|start_header_id|>system<|end_header_id|>
`Messages[].Role` (string): role which can be one of `system`, `user`, `assistant`, or `tool`
`Messages[].Content` (string): message content
`Messages[].ToolCalls` (list): list of tools the model wants to call
@@ -99,9 +101,9 @@ TEMPLATE """{{- if .System }}<|start_header_id|>system<|end_header_id|>
Keep the following tips and best practices in mind when working with Go templates:
- **Be mindful of dot**: Control flow structures like `range` and `with` change the value of `.`
- **Out-of-scope variables**: Use `$.` to reference variables not currently in scope, starting from the root
- **Whitespace control**: Use `-` to trim leading (`{{-`) and trailing (`-}}`) whitespace
## Examples
@@ -155,13 +157,14 @@ CodeLlama [7B](https://ollama.com/library/codellama:7b-code) and [13B](https://o
<PRE> {{ .Prompt }} <SUF>{{ .Suffix }} <MID>
```
<Note>
CodeLlama 34B and 70B code completion and all instruct and Python fine-tuned models do not support fill-in-middle.
</Note>
#### Codestral
Codestral [22B](https://ollama.com/library/codestral:22b) supports fill-in-middle.
```gotmpl
[SUFFIX]{{ .Suffix }}[PREFIX] {{ .Prompt }}
```
---
title: Troubleshooting
description: How to troubleshoot issues encountered with Ollama
---
Sometimes Ollama may not perform as expected. One of the best ways to figure out what happened is to take a look at the logs. Find the logs on **Mac** by running the command:
@@ -23,9 +26,11 @@ docker logs <container-name>
If manually running `ollama serve` in a terminal, the logs will be on that terminal.
When you run Ollama on **Windows**, there are a few different locations. You can view them in the explorer window by hitting `<cmd>+R` and typing in:
- `explorer %LOCALAPPDATA%\Ollama` to view logs. The most recent server logs will be in `server.log` and older logs will be in `server-#.log`
- `explorer %LOCALAPPDATA%\Programs\Ollama` to browse the binaries (The installer adds this to your user PATH)
- `explorer %HOMEPATH%\.ollama` to browse where models and configuration are stored
- `explorer %TEMP%` where temporary executable files are stored in one or more `ollama*` directories
To enable additional debug logging to help troubleshoot problems, first **quit the running app from the tray menu**, then in a powershell terminal:
@@ -38,14 +43,26 @@ Join the [Discord](https://discord.gg/ollama) for help interpreting the logs.
## LLM libraries
Ollama includes multiple LLM libraries compiled for different GPUs and CPU vector features. Ollama tries to pick the best one based on the capabilities of your system. If this autodetection has problems, or you run into other problems (e.g. crashes in your GPU), you can work around this by forcing a specific LLM library. `cpu_avx2` will perform the best, followed by `cpu_avx`; the slowest but most compatible is `cpu`. Rosetta emulation under macOS will work with the `cpu` library.
In the server log, you will see a message that looks something like this (varies from release to release):
```
Dynamic LLM libraries [rocm_v6 cpu cpu_avx cpu_avx2 cuda_v11 rocm_v5]
```
**Experimental LLM Library Override**
You can set OLLAMA_LLM_LIBRARY to any of the available LLM libraries to bypass autodetection, so for example, if you have a CUDA card, but want to force the CPU LLM library with AVX2 vector support, use:
```shell
OLLAMA_LLM_LIBRARY="cuda_v13" ollama serve OLLAMA_LLM_LIBRARY="cpu_avx2" ollama serve
```
You can see what features your CPU has with the following command:
```shell
cat /proc/cpuinfo | grep flags | head -1
``` ```
## Installing older or pre-release versions on Linux
@@ -56,13 +73,17 @@ If you run into problems on Linux and want to install an older version, or you'd
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.5.7 sh
```
## Linux tmp noexec
If your system is configured with the "noexec" flag where Ollama stores its temporary executable files, you can specify an alternate location by setting `OLLAMA_TMPDIR` to a location writable by the user Ollama runs as, for example `OLLAMA_TMPDIR=/usr/share/ollama/`.
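As a rough sketch (assuming the systemd service and `ollama` user created by the install script; adjust paths and the unit name to your setup):
```shell
# Create a directory the service user can execute from
sudo mkdir -p /usr/share/ollama/tmp
sudo chown ollama:ollama /usr/share/ollama/tmp
# Add an override containing: Environment="OLLAMA_TMPDIR=/usr/share/ollama/tmp"
sudo systemctl edit ollama.service
sudo systemctl restart ollama
```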
## Linux docker
If Ollama initially works on the GPU in a docker container, but then switches to running on CPU after some period of time with errors in the server log reporting GPU discovery failures, this can be resolved by disabling systemd cgroup management in Docker. Edit `/etc/docker/daemon.json` on the host and add `"exec-opts": ["native.cgroupdriver=cgroupfs"]` to the docker configuration.
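For example, on a host with no existing Docker configuration, a minimal sketch looks like this (if `/etc/docker/daemon.json` already exists, merge the key in rather than overwriting the file):
```shell
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF
sudo systemctl restart docker
```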
## NVIDIA GPU Discovery
When Ollama starts up, it takes inventory of the GPUs present in the system to determine compatibility and how much VRAM is available. Sometimes this discovery can fail to find your GPUs. In general, running the latest driver will yield the best results.
### Linux NVIDIA Troubleshooting
@@ -70,28 +91,26 @@ If you are using a container to run Ollama, make sure you've set up the containe
Sometimes Ollama can have difficulty initializing the GPU. When you check the server logs, this can show up as various error codes, such as "3" (not initialized), "46" (device unavailable), "100" (no device), "999" (unknown), or others. The following troubleshooting techniques may help resolve the problem:
- If you are using a container, is the container runtime working? Try `docker run --gpus all ubuntu nvidia-smi` - if this doesn't work, Ollama won't be able to see your NVIDIA GPU.
- Is the uvm driver loaded? `sudo nvidia-modprobe -u` - Is the uvm driver loaded? `sudo nvidia-modprobe -u`
- Try reloading the nvidia_uvm driver - `sudo rmmod nvidia_uvm` then `sudo modprobe nvidia_uvm`
- Try rebooting
- Make sure you're running the latest nvidia drivers
If none of those resolve the problem, gather additional information and file an issue:
- Set `CUDA_ERROR_LEVEL=50` and try again to get more diagnostic logs
- Check dmesg for any errors `sudo dmesg | grep -i nvrm` and `sudo dmesg | grep -i nvidia`
You may get more details for initialization failures by enabling debug prints in the uvm driver. You should only use this temporarily while troubleshooting.
- `sudo rmmod nvidia_uvm` then `sudo modprobe nvidia_uvm uvm_debug_prints=1`
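The extra prints go to the kernel ring buffer, so after reproducing the failure you can review them with, for example:
```shell
sudo dmesg | grep -i uvm
```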
## AMD GPU Discovery
On Linux, AMD GPU access typically requires `video` and/or `render` group membership to access the `/dev/kfd` device. If permissions are not set up correctly, Ollama will detect this and report an error in the server log.
When running in a container, in some Linux distributions and container runtimes, the ollama process may be unable to access the GPU. Use `ls -lnd /dev/kfd /dev/dri /dev/dri/*` on the host system to determine the **numeric** group IDs on your system, and pass additional `--group-add ...` arguments to the container so it can access the required devices. For example, in the following output `crw-rw---- 1 0 44 226, 0 Sep 16 16:55 /dev/dri/card0` the group ID column is `44`
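Putting that together, a hypothetical invocation of the ROCm container image might look like the following (44 and 110 are example group IDs; substitute the numbers from your own `ls -lnd` output):
```shell
docker run -d --device /dev/kfd --device /dev/dri \
  --group-add 44 --group-add 110 \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm
```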
If you are experiencing problems getting Ollama to correctly discover or use your GPU for inference, the following may help isolate the failure.
- `AMD_LOG_LEVEL=3` Enable info log levels in the AMD HIP/ROCm libraries. This can help show more detailed error codes that can help troubleshoot problems
- `OLLAMA_DEBUG=1` During GPU discovery additional information will be reported
- Check dmesg for any errors from amdgpu or kfd drivers `sudo dmesg | grep -i amdgpu` and `sudo dmesg | grep -i kfd`
@@ -103,4 +122,4 @@ If you experience gibberish responses when models load across multiple AMD GPUs
## Windows Terminal Errors
Older versions of Windows 10 (e.g., 21H1) are known to have a bug where the standard terminal program does not display control characters correctly. This can result in long strings of characters like `←[?25h←[?25l` being displayed, sometimes erroring with `The parameter is incorrect`. To resolve this problem, please update to Win 10 22H1 or newer.
---
title: Windows
---
Welcome to Ollama for Windows.
@@ -7,20 +9,20 @@ No more WSL required!
Ollama now runs as a native Windows application, including NVIDIA and AMD Radeon GPU support.
After installing Ollama for Windows, Ollama will run in the background and
the `ollama` command line is available in `cmd`, `powershell` or your favorite
terminal application. As usual the Ollama [API](/api) will be served on
`http://localhost:11434`.
## System Requirements
- Windows 10 22H2 or newer, Home or Pro
- NVIDIA 452.39 or newer Drivers if you have an NVIDIA card
- AMD Radeon Driver https://www.amd.com/en/support if you have a Radeon card
Ollama uses unicode characters for progress indication, which may render as unknown squares in some older terminal fonts in Windows 10. If you see this, try changing your terminal font settings.
## Filesystem Requirements
The Ollama install does not require Administrator, and installs in your home directory by default. You'll need at least 4GB of space for the binary install. Once you've installed Ollama, you'll need additional space for storing the Large Language models, which can be tens to hundreds of GB in size. If your home directory doesn't have enough space, you can change where the binaries are installed, and where the models are stored.
### Changing Install Location
@@ -30,6 +32,20 @@ To install the Ollama application in a location different than your home directo
OllamaSetup.exe /DIR="d:\some\location"
```
### Changing Model Location
To change where Ollama stores the downloaded models instead of using your home directory, set the environment variable `OLLAMA_MODELS` in your user account.
1. Start the Settings (Windows 11) or Control Panel (Windows 10) application and search for _environment variables_.
2. Click on _Edit environment variables for your account_.
3. Edit or create a new variable `OLLAMA_MODELS` for your user account, pointing to where you want the models stored
4. Click OK/Apply to save.
If Ollama is already running, quit the tray application and relaunch it from the Start menu, or from a new terminal started after you saved the environment variables.
## API Access
Here's a quick example showing API access from `powershell`
@@ -40,22 +56,24 @@ Here's a quick example showing API access from `powershell`
## Troubleshooting
Ollama on Windows stores files in a few different locations. You can view them in
the explorer window by hitting `<Ctrl>+R` and typing in:
- `explorer %LOCALAPPDATA%\Ollama` contains logs and downloaded updates
- _app.log_ contains the most recent logs from the GUI application
- _server.log_ contains the most recent server logs
- _upgrade.log_ contains log output for upgrades
- `explorer %LOCALAPPDATA%\Programs\Ollama` contains the binaries (The installer adds this to your user PATH)
- `explorer %HOMEPATH%\.ollama` contains models and configuration
- `explorer %TEMP%` contains temporary executable files in one or more `ollama*` directories
## Uninstall
The Ollama Windows installer registers an Uninstaller application. Under `Add or remove programs` in Windows Settings, you can uninstall Ollama.
<Note>
If you have [changed the OLLAMA_MODELS location](#changing-model-location), the installer will not remove your downloaded models.
</Note>
## Standalone CLI
@@ -66,11 +84,12 @@ help you keep up to date.
If you'd like to install or integrate Ollama as a service, a standalone
`ollama-windows-amd64.zip` zip file is available containing only the Ollama CLI
and GPU library dependencies for Nvidia. If you have an AMD GPU, also download
and extract the additional ROCm package `ollama-windows-amd64-rocm.zip` into the
same directory. This allows for embedding Ollama in existing applications, or
running it as a system service via `ollama serve` with tools such as
[NSSM](https://nssm.cc/).
<Note>
If you are upgrading from a prior version, you should remove the old directories first.
</Note>