docs: add reference to docs.ollama.com (#12800)

Unverified Commit 934dd9e1 authored Oct 28, 2025 by Parth Sareen Committed by GitHub Oct 28, 2025
20 changed files
--- a/docs/images/n8n-ollama-form.png
+++ b/docs/images/n8n-ollama-form.png
--- a/docs/images/ollama-settings.png
+++ b/docs/images/ollama-settings.png
--- a/docs/images/vscode-model-options.png
+++ b/docs/images/vscode-model-options.png
--- a/docs/images/vscode-models.png
+++ b/docs/images/vscode-models.png
--- a/docs/images/vscode-sidebar.png
+++ b/docs/images/vscode-sidebar.png
--- a/docs/images/welcome.png
+++ b/docs/images/welcome.png
--- a/docs/images/xcode-chat-icon.png
+++ b/docs/images/xcode-chat-icon.png
--- a/docs/images/xcode-intelligence-window.png
+++ b/docs/images/xcode-intelligence-window.png
--- a/docs/images/xcode-locally-hosted.png
+++ b/docs/images/xcode-locally-hosted.png
--- a/docs/images/zed-ollama-dropdown.png
+++ b/docs/images/zed-ollama-dropdown.png
--- a/docs/images/zed-settings.png
+++ b/docs/images/zed-settings.png
--- a/docs/import.mdx
+++ b/docs/import.mdx
+---
+title: Importing a Model
+---
+## Table of Contents
+- [Importing a Safetensors adapter](#Importing-a-fine-tuned-adapter-from-Safetensors-weights)
+- [Importing a Safetensors model](#Importing-a-model-from-Safetensors-weights)
+- [Importing a GGUF file](#Importing-a-GGUF-based-model-or-adapter)
+- [Sharing models on ollama.com](#Sharing-your-model-on-ollamacom)
+## Importing a fine tuned adapter from Safetensors weights
+First, create a `Modelfile` with a `FROM` command pointing at the base model you used for fine tuning, and an `ADAPTER` command which points to the directory with your Safetensors adapter:
+```dockerfile
+FROM <base model name>
+ADAPTER /path/to/safetensors/adapter/directory
+```
+Make sure that you use the same base model in the `FROM` command as you used to create the adapter; otherwise, you will get erratic results. Most frameworks use different quantization methods, so it's best to use non-quantized (i.e. non-QLoRA) adapters. If your adapter is in the same directory as your `Modelfile`, use `ADAPTER .` to specify the adapter path.
+Now run `ollama create` from the directory where the `Modelfile` was created:
+```shell
+ollama create my-model
+```
+Lastly, test the model:
+```shell
+ollama run my-model
+```
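The steps above can also be scripted. What follows is a minimal Python sketch, not part of Ollama: `write_adapter_modelfile` and `create_command` are hypothetical helper names, and `llama3.2` and the adapter path `.` are placeholder values. It only writes the `Modelfile` and assembles the `ollama create` argument list; actually running the command is left commented out.

```python
from pathlib import Path
import tempfile


def write_adapter_modelfile(directory: Path, base_model: str, adapter_path: str) -> Path:
    """Write a Modelfile with FROM and ADAPTER lines and return its path."""
    modelfile = directory / "Modelfile"
    modelfile.write_text(f"FROM {base_model}\nADAPTER {adapter_path}\n")
    return modelfile


def create_command(model_name: str) -> list[str]:
    """Argument list for `ollama create`, meant to be run from the Modelfile's directory."""
    return ["ollama", "create", model_name]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_adapter_modelfile(Path(tmp), "llama3.2", ".")
        print(path.read_text())
        # To build the model (requires Ollama and a real adapter in `tmp`):
        # import subprocess
        # subprocess.run(create_command("my-model"), cwd=tmp, check=True)
```

The base model passed to the helper should be the one the adapter was trained against, for the reason given above.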
+Ollama supports importing adapters based on several different model architectures including:
+- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2);
+- Mistral (including Mistral 1, Mistral 2, and Mixtral); and
+- Gemma (including Gemma 1 and Gemma 2)
+You can create the adapter using a fine tuning framework or tool which can output adapters in the Safetensors format, such as:
+- Hugging Face [fine tuning framework](https://huggingface.co/docs/transformers/en/training)
+- [Unsloth](https://github.com/unslothai/unsloth)
+- [MLX](https://github.com/ml-explore/mlx)
+## Importing a model from Safetensors weights
+First, create a `Modelfile` with a `FROM` command which points to the directory containing your Safetensors weights:
+```dockerfile
+FROM /path/to/safetensors/directory
+```
+If you create the Modelfile in the same directory as the weights, you can use the command `FROM .`.
+Now run the `ollama create` command from the directory where you created the `Modelfile`:
+```shell
+ollama create my-model
+```
+Lastly, test the model:
+```shell
+ollama run my-model
+```
+Ollama supports importing models for several different architectures including:
+- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2);
+- Mistral (including Mistral 1, Mistral 2, and Mixtral);
+- Gemma (including Gemma 1 and Gemma 2); and
+- Phi3
+This includes importing foundation models as well as any fine tuned models which have been _fused_ with a foundation model.
+## Importing a GGUF based model or adapter
+If you have a GGUF based model or adapter it is possible to import it into Ollama. You can obtain a GGUF model or adapter by:
+- converting a Safetensors model with the `convert_hf_to_gguf.py` from Llama.cpp;
+- converting a Safetensors adapter with the `convert_lora_to_gguf.py` from Llama.cpp; or
+- downloading a model or adapter from a place such as Hugging Face
+To import a GGUF model, create a `Modelfile` containing:
+```dockerfile
+FROM /path/to/file.gguf
+```
+For a GGUF adapter, create the `Modelfile` with:
+```dockerfile
+FROM <model name>
+ADAPTER /path/to/file.gguf
+```
+When importing a GGUF adapter, it's important to use the same base model that the adapter was created with. You can use:
+- a model from Ollama
+- a GGUF file
+- a Safetensors based model
+Once you have created your `Modelfile`, use the `ollama create` command to build the model.
+```shell
+ollama create my-model
+```
+## Quantizing a Model
+Quantizing a model allows you to run models faster and with less memory consumption but at reduced accuracy. This allows you to run a model on more modest hardware.
+Ollama can quantize FP16 and FP32 based models into different quantization levels using the `-q/--quantize` flag with the `ollama create` command.
+First, create a Modelfile with the FP16 or FP32 based model you wish to quantize.
+```dockerfile
+FROM /path/to/my/gemma/f16/model
+```
+Then use `ollama create` to create the quantized model.
+```shell
+$ ollama create --quantize q4_K_M mymodel
+transferring model data
+quantizing F16 model to Q4_K_M
+creating new layer sha256:735e246cc1abfd06e9cdcf95504d6789a6cd1ad7577108a70d9902fef503c1bd
+creating new layer sha256:0853f0ad24e5865173bbf9ffcc7b0f5d56b66fd690ab1009867e45e7d2c4db0f
+writing manifest
+success
+```
+### Supported Quantizations
+- `q4_0`
+- `q4_1`
+- `q5_0`
+- `q5_1`
+- `q8_0`
+#### K-means Quantizations
+- `q3_K_S`
+- `q3_K_M`
+- `q3_K_L`
+- `q4_K_S`
+- `q4_K_M`
+- `q5_K_S`
+- `q5_K_M`
+- `q6_K`
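When `ollama create --quantize` is driven from a script, it can help to check the requested level against the lists above before starting a long conversion. The sketch below assumes exactly the spellings listed in this section; `quantize_command` is a hypothetical helper, and whether Ollama itself accepts other casing is not checked here.

```python
# Quantization levels copied from the lists above.
SUPPORTED_QUANTIZATIONS = frozenset({
    "q4_0", "q4_1", "q5_0", "q5_1", "q8_0",
    "q3_K_S", "q3_K_M", "q3_K_L",
    "q4_K_S", "q4_K_M",
    "q5_K_S", "q5_K_M",
    "q6_K",
})


def quantize_command(model_name: str, level: str) -> list[str]:
    """Build the `ollama create --quantize` argument list, rejecting unlisted levels."""
    if level not in SUPPORTED_QUANTIZATIONS:
        raise ValueError(f"unsupported quantization level: {level!r}")
    return ["ollama", "create", "--quantize", level, model_name]


if __name__ == "__main__":
    print(" ".join(quantize_command("mymodel", "q4_K_M")))
```

As with the shell example, the command should be run from the directory that contains the `Modelfile`.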
+## Sharing your model on ollama.com
+You can share any model you have created by pushing it to [ollama.com](https://ollama.com) so that other users can try it out.
+First, use your browser to go to the [Ollama Sign-Up](https://ollama.com/signup) page. If you already have an account, you can skip this step.
+<img src="images/signup.png" alt="Sign-Up" width="40%" />
+The `Username` field will be used as part of your model's name (e.g. `jmorganca/mymodel`), so make sure you are comfortable with the username that you have selected.
+Now that you have created an account and are signed in, go to the [Ollama Keys Settings](https://ollama.com/settings/keys) page.
+Follow the directions on the page to determine where your Ollama Public Key is located.
+<img src="images/ollama-keys.png" alt="Ollama Keys" width="80%" />
+Click on the `Add Ollama Public Key` button, and copy and paste the contents of your Ollama Public Key into the text field.
+To push a model to [ollama.com](https://ollama.com), first make sure that it is named correctly with your username. You may have to use the `ollama cp` command to copy
+your model to give it the correct name. Once you're happy with your model's name, use the `ollama push` command to push it to [ollama.com](https://ollama.com).
+```shell
+ollama cp mymodel myuser/mymodel
+ollama push myuser/mymodel
+```
+Once your model has been pushed, other users can pull and run it by using the command:
+```shell
+ollama run myuser/mymodel
+```
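The naming rule above (a pushed model must carry your username as a prefix) can be sketched as a small helper. This is an illustration only: `push_commands` is a hypothetical function, and `myuser` and `mymodel` are the placeholder names used in this section. It returns argument lists rather than running anything.

```python
def push_commands(local_name: str, username: str) -> list[list[str]]:
    """Return the `ollama` commands needed to push `local_name` under `username`."""
    prefix = f"{username}/"
    if local_name.startswith(prefix):
        # Already named correctly, so no copy is needed.
        return [["ollama", "push", local_name]]
    target = prefix + local_name
    return [
        ["ollama", "cp", local_name, target],
        ["ollama", "push", target],
    ]


if __name__ == "__main__":
    for command in push_commands("mymodel", "myuser"):
        print(" ".join(command))
```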
--- a/docs/index.mdx
+++ b/docs/index.mdx
+---
+title: Ollama's documentation
+sidebarTitle: Welcome
+---
+<img src="/images/welcome.png" noZoom className="rounded-3xl" />
+[Ollama](https://ollama.com) is the easiest way to get up and running with large language models such as gpt-oss, Gemma 3, DeepSeek-R1, Qwen3 and more.
+<CardGroup cols={2}>
+  <Card title="Quickstart" icon="rocket" href="/quickstart">
+    Get up and running with your first model
+  </Card>
+  <Card
+    title="Download Ollama"
+    icon="download"
+    href="https://ollama.com/download"
+  >
+    Download Ollama on macOS, Windows or Linux
+  </Card>
+  <Card title="Cloud" icon="cloud" href="/cloud">
+    Ollama's cloud models offer larger models with better performance.
+  </Card>
+  <Card title="API reference" icon="terminal" href="/api">
+    View Ollama's API reference
+  </Card>
+</CardGroup>
+## Libraries
+<CardGroup cols={2}>
+  <Card
+    title="Ollama's Python Library"
+    icon="python"
+    href="https://github.com/ollama/ollama-python"
+  >
+    The official library for using Ollama with Python
+  </Card>
+  <Card title="Ollama's JavaScript library" icon="js" href="https://github.com/ollama/ollama-js">
+    The official library for using Ollama with JavaScript or TypeScript.
+  </Card>
+  <Card title="Community libraries" icon="github" href="https://github.com/ollama/ollama?tab=readme-ov-file#libraries-1">
+    View a list of 20+ community-supported libraries for Ollama
+  </Card>
+</CardGroup>
+## Community
+<CardGroup cols={2}>
+  <Card title="Discord" icon="discord" href="https://discord.gg/ollama">
+    Join our Discord community
+  </Card>
+  <Card title="Reddit" icon="reddit" href="https://reddit.com/r/ollama">
+    Join our Reddit community
+  </Card>
+</CardGroup>
--- a/docs/integrations/cline.mdx
+++ b/docs/integrations/cline.mdx
+---
+title: Cline
+---
+## Install
+Install [Cline](https://docs.cline.bot/getting-started/installing-cline) in your IDE.
+## Usage with Ollama
+1. Open Cline settings > `API Configuration` and set `API Provider` to `Ollama`
+2. Select a model under `Model` or type one (e.g. `qwen3`)
+3. Update the context window to at least 32K tokens under `Context Window`
+<Note>Coding tools require a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/cline-settings.png" 
+    alt="Cline settings configuration showing API Provider set to Ollama"
+    width="50%"
+  />
+</div>
+## Connecting to ollama.com
+1. Create an [API key](https://ollama.com/settings/keys) from ollama.com
+2. Click on `Use custom base URL` and set it to `https://ollama.com`
+3. Enter your **Ollama API Key**
+4. Select a model from the list
+### Recommended Models
+- `qwen3-coder:480b` 
+- `deepseek-v3.1:671b`
--- a/docs/integrations/codex.mdx
+++ b/docs/integrations/codex.mdx
+---
+title: Codex
+---
+## Install
+Install the [Codex CLI](https://developers.openai.com/codex/cli/):
+```shell
+npm install -g @openai/codex
+```
+## Usage with Ollama
+<Note>Codex requires a larger context window. It is recommended to use a context window of at least 32K tokens.</Note>
+To use `codex` with Ollama, use the `--oss` flag:
+```shell
+codex --oss
+```
+### Changing Models
+By default, codex will use the local `gpt-oss:20b` model. However, you can specify a different model with the `-m` flag:
+```shell
+codex --oss -m gpt-oss:120b
+```
+### Cloud Models
+```shell
+codex --oss -m gpt-oss:120b-cloud
+```
+## Connecting to ollama.com
+Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
+To use ollama.com directly, edit your `~/.codex/config.toml` file to point to ollama.com.
+```toml
+model = "gpt-oss:120b"
+model_provider = "ollama"
+[model_providers.ollama]
+name = "Ollama"
+base_url = "https://ollama.com/v1"
+env_key = "OLLAMA_API_KEY"
+```
+Run `codex` in a new terminal to load the new settings.
--- a/docs/integrations/droid.mdx
+++ b/docs/integrations/droid.mdx
+---
+title: Droid
+---
+## Install
+Install the [Droid CLI](https://factory.ai/):
+```bash
+curl -fsSL https://app.factory.ai/cli | sh
+```
+<Note>Droid requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
+## Usage with Ollama
+Add a local configuration block to `~/.factory/config.json`:
+```json
+{
+  "custom_models": [
+    {
+      "model_display_name": "qwen3-coder [Ollama]",
+      "model": "qwen3-coder",
+      "base_url": "http://localhost:11434/v1/",
+      "api_key": "not-needed",
+      "provider": "generic-chat-completion-api",
+      "max_tokens": 32000 
+    }
+  ]
+}
+```
+## Cloud Models
+`qwen3-coder:480b-cloud` is the recommended model for use with Droid.
+Add the cloud configuration block to `~/.factory/config.json`:
+```json
+{
+  "custom_models": [
+    {
+      "model_display_name": "qwen3-coder [Ollama Cloud]",
+      "model": "qwen3-coder:480b-cloud",
+      "base_url": "http://localhost:11434/v1/",
+      "api_key": "not-needed",
+      "provider": "generic-chat-completion-api",
+      "max_tokens": 128000
+    }
+  ]
+}
+```
+## Connecting to ollama.com
+1. Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
+2. Add the cloud configuration block to `~/.factory/config.json`:
+   ```json
+   {
+     "custom_models": [
+       {
+         "model_display_name": "qwen3-coder [Ollama Cloud]",
+         "model": "qwen3-coder:480b",
+         "base_url": "https://ollama.com/v1/",
+         "api_key": "OLLAMA_API_KEY",
+         "provider": "generic-chat-completion-api",
+         "max_tokens": 128000
+       }
+     ]
+   }
+   ```
+Run `droid` in a new terminal to load the new settings.
\ No newline at end of file
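The three Droid configuration blocks above differ only in display name, model, base URL, API key, and `max_tokens`. A minimal Python sketch of generating such entries and keeping several of them in one `custom_models` list follows. `custom_model_entry` and `merge_custom_model` are hypothetical helpers, and the assumption that Droid accepts more than one entry in `custom_models` is not confirmed by the text above.

```python
import json


def custom_model_entry(
    display_name: str,
    model: str,
    base_url: str = "http://localhost:11434/v1/",
    api_key: str = "not-needed",
    max_tokens: int = 32000,
) -> dict:
    """Build one `custom_models` entry using the field names from the examples above."""
    return {
        "model_display_name": display_name,
        "model": model,
        "base_url": base_url,
        "api_key": api_key,
        "provider": "generic-chat-completion-api",
        "max_tokens": max_tokens,
    }


def merge_custom_model(config: dict, entry: dict) -> dict:
    """Return a copy of `config` with `entry` added, replacing any entry with the same display name."""
    kept = [
        existing
        for existing in config.get("custom_models", [])
        if existing.get("model_display_name") != entry["model_display_name"]
    ]
    return {**config, "custom_models": kept + [entry]}


if __name__ == "__main__":
    config = merge_custom_model({}, custom_model_entry("qwen3-coder [Ollama]", "qwen3-coder"))
    config = merge_custom_model(
        config,
        custom_model_entry(
            "qwen3-coder [Ollama Cloud]", "qwen3-coder:480b-cloud", max_tokens=128000
        ),
    )
    print(json.dumps(config, indent=2))
```

The printed JSON can be compared against `~/.factory/config.json` before overwriting it.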
--- a/docs/integrations/goose.mdx
+++ b/docs/integrations/goose.mdx
+---
+title: Goose
+---
+## Goose Desktop
+Install [Goose](https://block.github.io/goose/docs/getting-started/installation/) Desktop.
+### Usage with Ollama
+1. In Goose, open **Settings** → **Configure Provider**.  
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/goose-settings.png" 
+    alt="Goose settings Panel"
+    width="75%"
+  />
+</div>
+2. Find **Ollama** and click **Configure**
+3. Confirm **API Host** is `http://localhost:11434` and click **Submit**
+### Connecting to ollama.com
+1. Create an [API key](https://ollama.com/settings/keys) on ollama.com and save it in your `.env` 
+2. In Goose, set **API Host** to `https://ollama.com`
+## Goose CLI
+Install [Goose](https://block.github.io/goose/docs/getting-started/installation/) CLI
+### Usage with Ollama
+1. Run `goose configure`
+2. Select **Configure Providers** and select **Ollama**
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/goose-cli.png" 
+    alt="Goose CLI"
+    width="50%"
+  />
+</div>
+3. Enter a model name (e.g. `qwen3`)
+### Connecting to ollama.com
+1. Create an [API key](https://ollama.com/settings/keys) on ollama.com and save it in your `.env` 
+2. Run `goose configure`
+3. Select **Configure Providers** and select **Ollama**
+4. Update **OLLAMA_HOST** to `https://ollama.com`
--- a/docs/integrations/jetbrains.mdx
+++ b/docs/integrations/jetbrains.mdx
+---
+title: JetBrains
+---
+<Note>This example uses **IntelliJ**; the same steps apply to other JetBrains IDEs (e.g., PyCharm).</Note>
+## Install
+Install [IntelliJ](https://www.jetbrains.com/idea/).
+## Usage with Ollama
+<Note>
+  To use **Ollama**, you will need a [JetBrains AI Subscription](https://www.jetbrains.com/ai-ides/buy/?section=personal&billing=yearly).
+</Note>
+1. In IntelliJ, click the **chat icon** located in the right sidebar
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/intellij-chat-sidebar.png" 
+    alt="IntelliJ Sidebar Chat"
+    width="50%"
+  />
+</div>
+2. Select the **current model** in the sidebar, then click **Set up Local Models**
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/intellij-current-model.png" 
+    alt="IntelliJ model bottom right corner"
+    width="50%"
+  />
+</div>
+3. Under **Third Party AI Providers**, choose **Ollama**  
+4. Confirm the **Host URL** is `http://localhost:11434`, then click **Ok**  
+5. Once connected, select a model under **Local models by Ollama**
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/intellij-local-models.png" 
+    alt="IntelliJ local models by Ollama"
+    width="50%"
+  />
+</div>
--- a/docs/integrations/n8n.mdx
+++ b/docs/integrations/n8n.mdx
+---
+title: n8n
+---
+## Install
+Install [n8n](https://docs.n8n.io/choose-n8n/).
+## Using Ollama Locally
+1. In the top right corner, click the dropdown and select **Create Credential**
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/n8n-credential-creation.png" 
+    alt="Create an n8n Credential"
+    width="75%"
+  />
+</div>
+2. Under **Add new credential** select **Ollama**
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/n8n-ollama-form.png" 
+    alt="Select Ollama under Credential"
+    width="75%"
+  />
+</div>
+3. Confirm Base URL is set to `http://localhost:11434` and click **Save**
+<Note> If connecting to `http://localhost:11434` fails, use `http://127.0.0.1:11434`</Note>
+4. When creating a new workflow, select **Add a first step** and select an **Ollama node** 
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/n8n-chat-node.png" 
+    alt="Add a first step with Ollama node"
+    width="75%"
+  />
+</div>
+5. Select your model of choice (e.g. `qwen3-coder`) 
+<div style={{ display: 'flex', justifyContent: 'center' }}>
+  <img 
+    src="/images/n8n-models.png" 
+    alt="Set up Ollama credentials"
+    width="75%"
+  />
+</div>
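The note in step 3 suggests falling back from `localhost` to `127.0.0.1` when the connection fails. A quick way to find out which address answers before entering it in n8n is sketched below. `http_probe` and `first_reachable` are hypothetical helpers, and the sketch assumes the Ollama server responds successfully to a plain GET on its base URL.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_probe(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `base_url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


def first_reachable(
    candidates: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first URL for which `probe` succeeds, or None if none do."""
    for url in candidates:
        if probe(url):
            return url
    return None


if __name__ == "__main__":
    url = first_reachable(["http://localhost:11434", "http://127.0.0.1:11434"])
    print(url or "Ollama is not reachable on either address")
```

The probe is passed in as a parameter so the fallback logic can be exercised without a running server.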
+## Connecting to ollama.com
+1. Create an [API key](https://ollama.com/settings/keys) on **ollama.com**.  
+2. In n8n, click **Create Credential** and select **Ollama**
+3. Set the **API URL** to `https://ollama.com`
+4. Enter your **API Key** and click **Save**
--- a/docs/integrations/roo-code.mdx
+++ b/docs/integrations/roo-code.mdx
+---
+title: Roo Code
+---
+## Install
+Install [Roo Code](https://marketplace.visualstudio.com/items?itemName=RooVeterinaryInc.roo-cline) from the VS Code Marketplace.
+## Usage with Ollama
+1. Open Roo Code in VS Code and click the **gear icon** on the top right corner of the Roo Code window to open **Provider Settings**
+2. Set `API Provider` to `Ollama`
+3. (Optional) Update `Base URL` if your Ollama instance is running remotely. The default is `http://localhost:11434`
+4. Enter a valid `Model ID` (for example `qwen3` or `qwen3-coder:480b-cloud`)
+5. Adjust the `Context Window` to at least 32K tokens for coding tasks
+<Note>Coding tools require a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
+## Connecting to ollama.com
+1. Create an [API key](https://ollama.com/settings/keys) from ollama.com
+2. Enable `Use custom base URL` and set it to `https://ollama.com`
+3. Enter your **Ollama API Key**
+4. Select a model from the list
+### Recommended Models
+- `qwen3-coder:480b`
+- `deepseek-v3.1:671b`