---
title: Importing a Model
---
## Table of Contents
- [Importing a Safetensors adapter](#Importing-a-fine-tuned-adapter-from-Safetensors-weights)
- [Importing a Safetensors model](#Importing-a-model-from-Safetensors-weights)
- [Importing a GGUF file](#Importing-a-GGUF-based-model-or-adapter)
- [Sharing models on ollama.com](#Sharing-your-model-on-ollamacom)
## Importing a fine tuned adapter from Safetensors weights
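A minimal sketch of the workflow is below; the `llama3` base model and the adapter path are illustrative, and the base model must be the one your adapter was fine tuned from.

```shell
# Sketch: import a Safetensors adapter (base model and paths are illustrative)
cat > Modelfile <<'EOF'
FROM llama3
ADAPTER /path/to/safetensors/adapter/directory
EOF

ollama create my-model
ollama run my-model
```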
Ollama supports importing adapters based on several different model architectures including:
- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2);
- Mistral (including Mistral 1, Mistral 2, and Mixtral); and
- Gemma (including Gemma 1 and Gemma 2)
You can create the adapter using a fine tuning framework or tool which can output adapters in the Safetensors format, such as:
- Hugging Face [fine tuning framework](https://huggingface.co/docs/transformers/en/training)
- [Unsloth](https://github.com/unslothai/unsloth)
- [MLX](https://github.com/ml-explore/mlx)
## Importing a model from Safetensors weights
First, create a `Modelfile` with a `FROM` command pointing at the directory containing your Safetensors weights:

```
FROM /path/to/safetensors/directory
```
If you create the Modelfile in the same directory as the weights, you can use the command `FROM .`.
If you do not create a `Modelfile`, Ollama will act as if there was a `Modelfile` with the command `FROM .`.
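For example, a minimal `Modelfile` next to the weights could be written like this (a sketch; the heredoc is just one way to create the file):

```shell
# Sketch: a one-line Modelfile placed in the weights directory
cat > Modelfile <<'EOF'
FROM .
EOF
```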
Now run the `ollama create` command from the directory where you created the `Modelfile`:
```shell
ollama create my-model
```
Ollama supports importing models for several different architectures including:
- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2);
- Mistral (including Mistral 1, Mistral 2, and Mixtral);
- Gemma (including Gemma 1 and Gemma 2); and
- Phi3
This includes importing foundation models as well as any fine tuned models which have been _fused_ with a foundation model.
## Importing a GGUF based model or adapter
If you have a GGUF based model or adapter, it is possible to import it into Ollama. You can obtain a GGUF model or adapter by:
- converting a Safetensors model with the `convert_hf_to_gguf.py` script from Llama.cpp;
- converting a Safetensors adapter with the `convert_lora_to_gguf.py` script from Llama.cpp; or
- downloading a model or adapter from a place such as Hugging Face
To import a GGUF model, create a `Modelfile` containing:

```
FROM /path/to/file.gguf
```

To import a GGUF adapter, create a `Modelfile` containing:

```
ADAPTER /path/to/file.gguf
```
When importing a GGUF adapter, it's important to use the same base model that the adapter was created with (see the sketch after the list below). You can use:
- a model from Ollama
- a GGUF file
- a Safetensors based model
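As a sketch, a `Modelfile` layering a GGUF adapter over an Ollama base model might look like this (the `llama3` base name and the adapter path are illustrative; the base must match the one the adapter was created from):

```shell
# Sketch: pair a GGUF adapter with its matching base model (names illustrative)
cat > Modelfile <<'EOF'
FROM llama3
ADAPTER /path/to/file.gguf
EOF
```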
Once you have created your `Modelfile`, use the `ollama create` command to build the model.
### Supported Quantizations
- `q4_0`
- `q4_1`
- `q5_0`
- `q5_1`
- `q8_0`
#### K-means Quantizations
- `q3_K_S`
- `q3_K_M`
- `q3_K_L`
- `q4_K_S`
- `q4_K_M`
- `q5_K_S`
- `q5_K_M`
- `q6_K`
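For example, to create a `q4_K_M`-quantized model from the FP16 weights referenced by a `Modelfile` (the model name is illustrative):

```shell
ollama create --quantize q4_K_M my-model
```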
## Sharing your model on ollama.com
You can share any model you have created by pushing it to [ollama.com](https://ollama.com).
First, use your browser to go to the [Ollama Sign-Up](https://ollama.com/signup) page. If you already have an account, you can skip this step.
<img src="images/signup.png" alt="Sign-Up" width="40%"> <img src="images/signup.png" alt="Sign-Up" width="40%" />
The `Username` field will be used as part of your model's name (e.g. `jmorganca/mymodel`), so make sure you are comfortable with the username that you have selected.
Now that you have created an account and are signed in, go to the [Ollama Keys Settings](https://ollama.com/settings/keys) page.
Follow the directions on the page to determine where your Ollama Public Key is located.
<img src="images/ollama-keys.png" alt="Ollama Keys" width="80%"> <img src="images/ollama-keys.png" alt="Ollama Keys" width="80%" />
Click on the `Add Ollama Public Key` button, and copy and paste the contents of your Ollama Public Key into the text field.
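Pushing is then a two-step process; for example, assuming your username is `myuser` and your local model is named `my-model`:

```shell
# Copy the model under your ollama.com namespace, then push it
ollama cp my-model myuser/mymodel
ollama push myuser/mymodel
```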
Once your model has been pushed, other users can pull and run it by using the command:
```shell
ollama run myuser/mymodel
```
---
title: Ollama's documentation
sidebarTitle: Welcome
---
<img src="/images/welcome.png" noZoom className="rounded-3xl" />
[Ollama](https://ollama.com) is the easiest way to get up and running with large language models such as gpt-oss, Gemma 3, DeepSeek-R1, Qwen3, and more.
<CardGroup cols={2}>
<Card title="Quickstart" icon="rocket" href="/quickstart">
Get up and running with your first model
</Card>
<Card
title="Download Ollama"
icon="download"
href="https://ollama.com/download"
>
Download Ollama on macOS, Windows or Linux
</Card>
<Card title="Cloud" icon="cloud" href="/cloud">
Ollama's cloud offers larger models with better performance.
</Card>
<Card title="API reference" icon="terminal" href="/api">
View Ollama's API reference
</Card>
</CardGroup>
## Libraries
<CardGroup cols={2}>
<Card
title="Ollama's Python Library"
icon="python"
href="https://github.com/ollama/ollama-python"
>
The official library for using Ollama with Python
</Card>
<Card title="Ollama's JavaScript library" icon="js" href="https://github.com/ollama/ollama-js">
The official library for using Ollama with JavaScript or TypeScript.
</Card>
<Card title="Community libraries" icon="github" href="https://github.com/ollama/ollama?tab=readme-ov-file#libraries-1">
View a list of 20+ community-supported libraries for Ollama
</Card>
</CardGroup>
## Community
<CardGroup cols={2}>
<Card title="Discord" icon="discord" href="https://discord.gg/ollama">
Join our Discord community
</Card>
<Card title="Reddit" icon="reddit" href="https://reddit.com/r/ollama">
Join our Reddit community
</Card>
</CardGroup>
---
title: Cline
---
## Install
Install [Cline](https://docs.cline.bot/getting-started/installing-cline) in your IDE.
## Usage with Ollama
1. Open Cline settings > `API Configuration` and set `API Provider` to `Ollama`
2. Select a model under `Model` or type one (e.g. `qwen3`)
3. Update the context window to at least 32K tokens under `Context Window`
<Note>Coding tools require a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
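If you would rather raise the limit on the Ollama side, one sketch (assuming a recent Ollama release that reads the `OLLAMA_CONTEXT_LENGTH` environment variable) is:

```shell
# Sketch: start Ollama with a 32K-token default context window
OLLAMA_CONTEXT_LENGTH=32768 ollama serve
```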
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/cline-settings.png"
alt="Cline settings configuration showing API Provider set to Ollama"
width="50%"
/>
</div>
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com
2. Click on `Use custom base URL` and set it to `https://ollama.com`
3. Enter your **Ollama API Key**
4. Select a model from the list
### Recommended Models
- `qwen3-coder:480b`
- `deepseek-v3.1:671b`
---
title: Codex
---
## Install
Install the [Codex CLI](https://developers.openai.com/codex/cli/):
```shell
npm install -g @openai/codex
```
## Usage with Ollama
<Note>Codex requires a larger context window. It is recommended to use a context window of at least 32K tokens.</Note>
To use `codex` with Ollama, use the `--oss` flag:
```shell
codex --oss
```
### Changing Models
By default, `codex` will use the local `gpt-oss:20b` model. However, you can specify a different model with the `-m` flag:
```shell
codex --oss -m gpt-oss:120b
```
### Cloud Models
To run a cloud model instead, use a model tag with the `-cloud` suffix:
```shell
codex --oss -m gpt-oss:120b-cloud
```
## Connecting to ollama.com
Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
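For example (the key value below is a placeholder):

```shell
# Replace with an API key from https://ollama.com/settings/keys
export OLLAMA_API_KEY="your-api-key"
```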
To use ollama.com directly, edit your `~/.codex/config.toml` file to point to ollama.com.
```toml
model = "gpt-oss:120b"
model_provider = "ollama"
[model_providers.ollama]
name = "Ollama"
base_url = "https://ollama.com/v1"
env_key = "OLLAMA_API_KEY"
```
Run `codex` in a new terminal to load the new settings.
---
title: Droid
---
## Install
Install the [Droid CLI](https://factory.ai/):
```bash
curl -fsSL https://app.factory.ai/cli | sh
```
<Note>Droid requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Usage with Ollama
Add a local configuration block to `~/.factory/config.json`:
```json
{
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama]",
"model": "qwen3-coder",
"base_url": "http://localhost:11434/v1/",
"api_key": "not-needed",
"provider": "generic-chat-completion-api",
"max_tokens": 32000
}
]
}
```
## Cloud Models
`qwen3-coder:480b-cloud` is the recommended model for use with Droid.
Add the cloud configuration block to `~/.factory/config.json`:
```json
{
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama Cloud]",
"model": "qwen3-coder:480b-cloud",
"base_url": "http://localhost:11434/v1/",
"api_key": "not-needed",
"provider": "generic-chat-completion-api",
"max_tokens": 128000
}
]
}
```
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
2. Add the cloud configuration block to `~/.factory/config.json`:
```json
{
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama Cloud]",
"model": "qwen3-coder:480b",
"base_url": "https://ollama.com/v1/",
"api_key": "OLLAMA_API_KEY",
"provider": "generic-chat-completion-api",
"max_tokens": 128000
}
]
}
```
Run `droid` in a new terminal to load the new settings.
---
title: Goose
---
## Goose Desktop
Install [Goose](https://block.github.io/goose/docs/getting-started/installation/) Desktop.
### Usage with Ollama
1. In Goose, open **Settings** → **Configure Provider**.
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/goose-settings.png"
alt="Goose settings Panel"
width="75%"
/>
</div>
2. Find **Ollama**, click **Configure**
3. Confirm **API Host** is `http://localhost:11434` and click **Submit**
### Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on ollama.com and save it in your `.env`
2. In Goose, set **API Host** to `https://ollama.com`
## Goose CLI
Install the [Goose](https://block.github.io/goose/docs/getting-started/installation/) CLI.
### Usage with Ollama
1. Run `goose configure`
2. Select **Configure Providers** and select **Ollama**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/goose-cli.png"
alt="Goose CLI"
width="50%"
/>
</div>
3. Enter a model name (e.g. `qwen3`)
### Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on ollama.com and save it in your `.env`
2. Run `goose configure`
3. Select **Configure Providers** and select **Ollama**
4. Update **OLLAMA_HOST** to `https://ollama.com`
---
title: JetBrains
---
<Note>This example uses **IntelliJ**; the same steps apply to other JetBrains IDEs (e.g., PyCharm).</Note>
## Install
Install [IntelliJ](https://www.jetbrains.com/idea/).
## Usage with Ollama
<Note>
To use **Ollama**, you will need a [JetBrains AI Subscription](https://www.jetbrains.com/ai-ides/buy/?section=personal&billing=yearly).
</Note>
1. In IntelliJ, click the **chat icon** in the right sidebar
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/intellij-chat-sidebar.png"
alt="Intellij Sidebar Chat"
width="50%"
/>
</div>
2. Select the **current model** in the sidebar, then click **Set up Local Models**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/intellij-current-model.png"
alt="Intellij model bottom right corner"
width="50%"
/>
</div>
3. Under **Third Party AI Providers**, choose **Ollama**
4. Confirm the **Host URL** is `http://localhost:11434`, then click **OK**
5. Once connected, select a model under **Local models by Ollama**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/intellij-local-models.png"
alt="Zed star icon in bottom right corner"
width="50%"
/>
</div>
---
title: n8n
---
## Install
Install [n8n](https://docs.n8n.io/choose-n8n/).
## Using Ollama Locally
1. In the top right corner, click the dropdown and select **Create Credential**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/n8n-credential-creation.png"
alt="Create a n8n Credential"
width="75%"
/>
</div>
2. Under **Add new credential** select **Ollama**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/n8n-ollama-form.png"
alt="Select Ollama under Credential"
width="75%"
/>
</div>
3. Confirm **Base URL** is set to `http://localhost:11434` and click **Save**
<Note> If connecting to `http://localhost:11434` fails, use `http://127.0.0.1:11434`</Note>
4. When creating a new workflow, select **Add a first step**, then choose an **Ollama node**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/n8n-chat-node.png"
alt="Add a first step with Ollama node"
width="75%"
/>
</div>
5. Select your model of choice (e.g. `qwen3-coder`)
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/n8n-models.png"
alt="Set up Ollama credentials"
width="75%"
/>
</div>
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on **ollama.com**
2. In n8n, click **Create Credential** and select **Ollama**
3. Set the **API URL** to `https://ollama.com`
4. Enter your **API Key** and click **Save**
---
title: Roo Code
---
## Install
Install [Roo Code](https://marketplace.visualstudio.com/items?itemName=RooVeterinaryInc.roo-cline) from the VS Code Marketplace.
## Usage with Ollama
1. Open Roo Code in VS Code and click the **gear icon** on the top right corner of the Roo Code window to open **Provider Settings**
2. Set `API Provider` to `Ollama`
3. (Optional) Update `Base URL` if your Ollama instance is running remotely. The default is `http://localhost:11434`
4. Enter a valid `Model ID` (for example `qwen3` or `qwen3-coder:480b-cloud`)
5. Adjust the `Context Window` to at least 32K tokens for coding tasks
<Note>Coding tools require a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com
2. Enable `Use custom base URL` and set it to `https://ollama.com`
3. Enter your **Ollama API Key**
4. Select a model from the list
### Recommended Models
- `qwen3-coder:480b`
- `deepseek-v3.1:671b`