Commit 934dd9e1 (unverified), authored by Parth Sareen and committed by GitHub

docs: add reference to docs.ollama.com (#12800)

parent 1188f408
---
title: Importing a Model
---
## Table of Contents
- [Importing a Safetensors adapter](#Importing-a-fine-tuned-adapter-from-Safetensors-weights)
- [Importing a Safetensors model](#Importing-a-model-from-Safetensors-weights)
- [Importing a GGUF file](#Importing-a-GGUF-based-model-or-adapter)
- [Sharing models on ollama.com](#Sharing-your-model-on-ollamacom)
## Importing a fine tuned adapter from Safetensors weights
First, create a `Modelfile` with a `FROM` command pointing at the base model you used for fine tuning, and an `ADAPTER` command which points to the directory with your Safetensors adapter:
```dockerfile
FROM <base model name>
ADAPTER /path/to/safetensors/adapter/directory
```
Make sure that you use the same base model in the `FROM` command as you used to create the adapter; otherwise you will get erratic results. Most frameworks use different quantization methods, so it's best to use non-quantized (i.e. non-QLoRA) adapters. If your adapter is in the same directory as your `Modelfile`, use `ADAPTER .` to specify the adapter path.
Now run `ollama create` from the directory where the `Modelfile` was created:
```shell
ollama create my-model
```
Lastly, test the model:
```shell
ollama run my-model
```
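The `ADAPTER .` shorthand described above can be illustrated with a small shell sketch. The helper name `write_adapter_modelfile` is hypothetical, and `llama3.2` is only an example base model; substitute the base model your adapter was trained on. The script writes a `Modelfile` and does not call `ollama` itself.

```shell
#!/bin/sh
# Hypothetical helper: write a Modelfile for a Safetensors adapter.
# Usage: write_adapter_modelfile <base-model> <adapter-dir> <modelfile-dir>
write_adapter_modelfile() {
  base_model=$1
  adapter_abs=$(cd "$2" && pwd) || return 1
  out_abs=$(cd "$3" && pwd) || return 1

  # Use the short form when the adapter sits next to the Modelfile.
  if [ "$adapter_abs" = "$out_abs" ]; then
    adapter_ref=.
  else
    adapter_ref=$adapter_abs
  fi

  printf 'FROM %s\nADAPTER %s\n' "$base_model" "$adapter_ref" > "$out_abs/Modelfile"
}

# Example: the adapter and the Modelfile share one scratch directory.
workdir=$(mktemp -d)
write_adapter_modelfile llama3.2 "$workdir" "$workdir"
cat "$workdir/Modelfile"
```

Running `ollama create my-model` from that directory would then pick up the generated `Modelfile`.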
Ollama supports importing adapters based on several different model architectures including:
- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2);
- Mistral (including Mistral 1, Mistral 2, and Mixtral); and
- Gemma (including Gemma 1 and Gemma 2)
You can create the adapter using a fine tuning framework or tool which can output adapters in the Safetensors format, such as:
- Hugging Face [fine tuning framework](https://huggingface.co/docs/transformers/en/training)
- [Unsloth](https://github.com/unslothai/unsloth)
- [MLX](https://github.com/ml-explore/mlx)
## Importing a model from Safetensors weights
First, create a `Modelfile` with a `FROM` command which points to the directory containing your Safetensors weights:
```dockerfile
FROM /path/to/safetensors/directory
```
If you create the Modelfile in the same directory as the weights, you can use the command `FROM .`.
Now run the `ollama create` command from the directory where you created the `Modelfile`:
```shell
ollama create my-model
```
Lastly, test the model:
```shell
ollama run my-model
```
Ollama supports importing models for several different architectures including:
- Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2);
- Mistral (including Mistral 1, Mistral 2, and Mixtral);
- Gemma (including Gemma 1 and Gemma 2); and
- Phi3
This includes importing foundation models as well as any fine tuned models which have been _fused_ with a foundation model.
## Importing a GGUF based model or adapter
If you have a GGUF based model or adapter it is possible to import it into Ollama. You can obtain a GGUF model or adapter by:
- converting a Safetensors model with the `convert_hf_to_gguf.py` script from llama.cpp;
- converting a Safetensors adapter with the `convert_lora_to_gguf.py` script from llama.cpp; or
- downloading a model or adapter from a place such as Hugging Face
To import a GGUF model, create a `Modelfile` containing:
```dockerfile
FROM /path/to/file.gguf
```
For a GGUF adapter, create the `Modelfile` with:
```dockerfile
FROM <model name>
ADAPTER /path/to/file.gguf
```
When importing a GGUF adapter, it's important to use the same base model as the base model that the adapter was created with. You can use:
- a model from Ollama
- a GGUF file
- a Safetensors based model
Once you have created your `Modelfile`, use the `ollama create` command to build the model.
```shell
ollama create my-model
```
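Before pointing `FROM` or `ADAPTER` at a downloaded file, it can help to confirm that the file really is GGUF. GGUF files begin with the four ASCII bytes `GGUF`, so a minimal check only needs to read the header. The helper name `is_gguf` is hypothetical, and the sample file below merely mimics a header for demonstration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the 4-byte GGUF magic.
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Example with a throwaway file that only mimics the start of a GGUF header.
sample=$(mktemp)
printf 'GGUF\003\000\000\000' > "$sample"
if is_gguf "$sample"; then
  echo "looks like GGUF"
else
  echo "not a GGUF file"
fi
```

This only inspects the magic bytes; it does not validate the rest of the file.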
## Quantizing a Model
Quantizing a model allows you to run models faster and with less memory consumption but at reduced accuracy. This allows you to run a model on more modest hardware.
Ollama can quantize FP16 and FP32 based models into different quantization levels using the `-q/--quantize` flag with the `ollama create` command.
First, create a Modelfile with the FP16 or FP32 based model you wish to quantize.
```dockerfile
FROM /path/to/my/gemma/f16/model
```
Use `ollama create` to then create the quantized model.
```shell
$ ollama create --quantize q4_K_M mymodel
transferring model data
quantizing F16 model to Q4_K_M
creating new layer sha256:735e246cc1abfd06e9cdcf95504d6789a6cd1ad7577108a70d9902fef503c1bd
creating new layer sha256:0853f0ad24e5865173bbf9ffcc7b0f5d56b66fd690ab1009867e45e7d2c4db0f
writing manifest
success
```
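As a rough illustration of the memory savings, the sketch below estimates the size of the weights at a few precisions. It makes several assumptions: a hypothetical 7-billion-parameter model, 16 bits per weight for F16, and 8.5 and 4.5 bits per weight for `q8_0` and `q4_0` (these formats store 32 weights plus a 16-bit scale per block). It ignores metadata, tensors left unquantized, and runtime memory such as the context cache, and it does not cover the K-quant formats, which mix precisions per tensor.

```shell
#!/bin/sh
# Hypothetical helper: rough weight size in GB (10^9 bytes) for a model with
# <params-in-billions> parameters stored at <bits-per-weight>.
estimate_gb() {
  LC_ALL=C awk -v p="$1" -v bpw="$2" 'BEGIN { printf "%.2f\n", p * bpw / 8 }'
}

# Assumed 7B-parameter model at three precisions.
printf 'F16   %s GB\n' "$(estimate_gb 7 16)"
printf 'q8_0  %s GB\n' "$(estimate_gb 7 8.5)"
printf 'q4_0  %s GB\n' "$(estimate_gb 7 4.5)"
```

Under these assumptions the weights shrink from about 14 GB at F16 to roughly 4 GB at `q4_0`.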
### Supported Quantizations
- `q4_0`
- `q4_1`
- `q5_0`
- `q5_1`
- `q8_0`
#### K-means Quantizations
- `q3_K_S`
- `q3_K_M`
- `q3_K_L`
- `q4_K_S`
- `q4_K_M`
- `q5_K_S`
- `q5_K_M`
- `q6_K`
## Sharing your model on ollama.com
You can share any model you have created by pushing it to [ollama.com](https://ollama.com) so that other users can try it out.
First, use your browser to go to the [Ollama Sign-Up](https://ollama.com/signup) page. If you already have an account, you can skip this step.
<img src="images/signup.png" alt="Sign-Up" width="40%" />
The `Username` field will be used as part of your model's name (e.g. `jmorganca/mymodel`), so make sure you are comfortable with the username that you have selected.
Now that you have created an account and are signed in, go to the [Ollama Keys Settings](https://ollama.com/settings/keys) page.
Follow the directions on the page to determine where your Ollama Public Key is located.
<img src="images/ollama-keys.png" alt="Ollama Keys" width="80%" />
Click on the `Add Ollama Public Key` button, and copy and paste the contents of your Ollama Public Key into the text field.
To push a model to [ollama.com](https://ollama.com), first make sure that it is named correctly with your username. You may have to use the `ollama cp` command to copy your model to give it the correct name. Once you're happy with your model's name, use the `ollama push` command to push it to [ollama.com](https://ollama.com).
```shell
ollama cp mymodel myuser/mymodel
ollama push myuser/mymodel
```
Once your model has been pushed, other users can pull and run it by using the command:
```shell
ollama run myuser/mymodel
```
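The naming rule can be sketched as a small shell helper that decides whether a model still needs a username prefix before it is pushed. The function name `namespaced_name` is hypothetical, and `myuser` and `mymodel` are the placeholder names used above. The script only prints the commands; it does not run `ollama`.

```shell
#!/bin/sh
# Hypothetical helper: print the name a model needs before `ollama push`.
# Names that already contain a namespace (user/model) are left unchanged.
namespaced_name() {
  user=$1
  model=$2
  case $model in
    */*) printf '%s\n' "$model" ;;
    *)   printf '%s/%s\n' "$user" "$model" ;;
  esac
}

target=$(namespaced_name myuser mymodel)
echo "ollama cp mymodel $target"
echo "ollama push $target"
```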
---
title: Ollama's documentation
sidebarTitle: Welcome
---
<img src="/images/welcome.png" noZoom className="rounded-3xl" />
[Ollama](https://ollama.com) is the easiest way to get up and running with large language models such as gpt-oss, Gemma 3, DeepSeek-R1, Qwen3 and more.
<CardGroup cols={2}>
<Card title="Quickstart" icon="rocket" href="/quickstart">
Get up and running with your first model
</Card>
<Card
title="Download Ollama"
icon="download"
href="https://ollama.com/download"
>
Download Ollama on macOS, Windows or Linux
</Card>
<Card title="Cloud" icon="cloud" href="/cloud">
Ollama's cloud models offer larger models with better performance.
</Card>
<Card title="API reference" icon="terminal" href="/api">
View Ollama's API reference
</Card>
</CardGroup>
## Libraries
<CardGroup cols={2}>
<Card
title="Ollama's Python Library"
icon="python"
href="https://github.com/ollama/ollama-python"
>
The official library for using Ollama with Python
</Card>
<Card title="Ollama's JavaScript library" icon="js" href="https://github.com/ollama/ollama-js">
The official library for using Ollama with JavaScript or TypeScript.
</Card>
<Card title="Community libraries" icon="github" href="https://github.com/ollama/ollama?tab=readme-ov-file#libraries-1">
View a list of 20+ community-supported libraries for Ollama
</Card>
</CardGroup>
## Community
<CardGroup cols={2}>
<Card title="Discord" icon="discord" href="https://discord.gg/ollama">
Join our Discord community
</Card>
<Card title="Reddit" icon="reddit" href="https://reddit.com/r/ollama">
Join our Reddit community
</Card>
</CardGroup>
---
title: Cline
---
## Install
Install [Cline](https://docs.cline.bot/getting-started/installing-cline) in your IDE.
## Usage with Ollama
1. Open Cline settings > `API Configuration` and set `API Provider` to `Ollama`
2. Select a model under `Model` or type one (e.g. `qwen3`)
3. Update the context window to at least 32K tokens under `Context Window`
<Note>Coding tools require a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/cline-settings.png"
alt="Cline settings configuration showing API Provider set to Ollama"
width="50%"
/>
</div>
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com
2. Click on `Use custom base URL` and set it to `https://ollama.com`
3. Enter your **Ollama API Key**
4. Select a model from the list
### Recommended Models
- `qwen3-coder:480b`
- `deepseek-v3.1:671b`
---
title: Codex
---
## Install
Install the [Codex CLI](https://developers.openai.com/codex/cli/):
```shell
npm install -g @openai/codex
```
## Usage with Ollama
<Note>Codex requires a larger context window. It is recommended to use a context window of at least 32K tokens.</Note>
To use `codex` with Ollama, use the `--oss` flag:
```shell
codex --oss
```
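One possible way to meet the 32K recommendation is to build a model variant whose `Modelfile` sets `num_ctx`. The sketch below is an assumption about how you might do this rather than a required step: `qwen3` is only an example base model, `qwen3-32k` is a hypothetical name, and 32768 is 32 × 1024 tokens. The script writes the `Modelfile` to a scratch directory and leaves the `ollama create` call as a comment.

```shell
#!/bin/sh
# Write a Modelfile that derives a 32K-context variant from a base model.
# `qwen3` is an example base model; 32768 = 32 * 1024 tokens.
ctx_dir=$(mktemp -d)
cat > "$ctx_dir/Modelfile" <<'EOF'
FROM qwen3
PARAMETER num_ctx 32768
EOF
cat "$ctx_dir/Modelfile"

# With Ollama installed, the variant could then be built with:
#   ollama create qwen3-32k -f "$ctx_dir/Modelfile"
```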
### Changing Models
By default, codex will use the local `gpt-oss:20b` model. However, you can specify a different model with the `-m` flag:
```shell
codex --oss -m gpt-oss:120b
```
### Cloud Models
```shell
codex --oss -m gpt-oss:120b-cloud
```
## Connecting to ollama.com
Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
To use ollama.com directly, edit your `~/.codex/config.toml` file to point to ollama.com.
```toml
model = "gpt-oss:120b"
model_provider = "ollama"
[model_providers.ollama]
name = "Ollama"
base_url = "https://ollama.com/v1"
env_key = "OLLAMA_API_KEY"
```
Run `codex` in a new terminal to load the new settings.
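Because the configuration reads the key from the `OLLAMA_API_KEY` environment variable, a quick check before launching can catch a missing export. This is a minimal sketch; the helper name `require_api_key` is hypothetical and the key value below is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: fail with a message if OLLAMA_API_KEY is unset or empty.
require_api_key() {
  if [ -z "${OLLAMA_API_KEY:-}" ]; then
    echo "OLLAMA_API_KEY is not set" >&2
    return 1
  fi
}

# Example run in a subshell with a placeholder key.
(
  OLLAMA_API_KEY=placeholder-key
  require_api_key && echo "key found"
)
```

In practice you might run `require_api_key && codex` so that `codex` only starts when the key is available.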
---
title: Droid
---
## Install
Install the [Droid CLI](https://factory.ai/):
```bash
curl -fsSL https://app.factory.ai/cli | sh
```
<Note>Droid requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Usage with Ollama
Add a local configuration block to `~/.factory/config.json`:
```json
{
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama]",
"model": "qwen3-coder",
"base_url": "http://localhost:11434/v1/",
"api_key": "not-needed",
"provider": "generic-chat-completion-api",
"max_tokens": 32000
}
]
}
```
## Cloud Models
`qwen3-coder:480b-cloud` is the recommended model for use with Droid.
Add the cloud configuration block to `~/.factory/config.json`:
```json
{
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama Cloud]",
"model": "qwen3-coder:480b-cloud",
"base_url": "http://localhost:11434/v1/",
"api_key": "not-needed",
"provider": "generic-chat-completion-api",
"max_tokens": 128000
}
]
}
```
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
2. Add the cloud configuration block to `~/.factory/config.json`:
```json
{
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama Cloud]",
"model": "qwen3-coder:480b",
"base_url": "https://ollama.com/v1/",
"api_key": "OLLAMA_API_KEY",
"provider": "generic-chat-completion-api",
"max_tokens": 128000
}
]
}
```
Run `droid` in a new terminal to load the new settings.
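The three configuration blocks above differ only in display name, model, base URL, API key, and token limit. As a sketch, a shell function can print one `custom_models` entry from those five values. The function name `droid_model_block` is hypothetical, and it performs no JSON escaping, so it assumes the values contain no quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: print one entry for the "custom_models" array.
# Usage: droid_model_block <display-name> <model> <base-url> <api-key> <max-tokens>
droid_model_block() {
  cat <<EOF
{
  "model_display_name": "$1",
  "model": "$2",
  "base_url": "$3",
  "api_key": "$4",
  "provider": "generic-chat-completion-api",
  "max_tokens": $5
}
EOF
}

# Example: the local configuration shown earlier.
droid_model_block "qwen3-coder [Ollama]" qwen3-coder \
  http://localhost:11434/v1/ not-needed 32000
```

The printed object still needs to be placed inside the `custom_models` array in `~/.factory/config.json`.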
---
title: Goose
---
## Goose Desktop
Install [Goose](https://block.github.io/goose/docs/getting-started/installation/) Desktop.
### Usage with Ollama
1. In Goose, open **Settings** → **Configure Provider**.
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/goose-settings.png"
alt="Goose settings Panel"
width="75%"
/>
</div>
2. Find **Ollama**, click **Configure**
3. Confirm **API Host** is `http://localhost:11434` and click **Submit**
### Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on ollama.com and save it in your `.env`
2. In Goose, set **API Host** to `https://ollama.com`
## Goose CLI
Install [Goose](https://block.github.io/goose/docs/getting-started/installation/) CLI
### Usage with Ollama
1. Run `goose configure`
2. Select **Configure Providers** and select **Ollama**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/goose-cli.png"
alt="Goose CLI"
width="50%"
/>
</div>
3. Enter a model name (e.g. `qwen3`)
### Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on ollama.com and save it in your `.env`
2. Run `goose configure`
3. Select **Configure Providers** and select **Ollama**
4. Update **OLLAMA_HOST** to `https://ollama.com`
---
title: JetBrains
---
<Note>This example uses **IntelliJ**; the same steps apply to other JetBrains IDEs (e.g., PyCharm).</Note>
## Install
Install [IntelliJ](https://www.jetbrains.com/idea/).
## Usage with Ollama
<Note>
To use **Ollama**, you will need a [JetBrains AI Subscription](https://www.jetbrains.com/ai-ides/buy/?section=personal&billing=yearly).
</Note>
1. In IntelliJ, click the **chat icon** located in the right sidebar
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/intellij-chat-sidebar.png"
alt="IntelliJ sidebar chat"
width="50%"
/>
</div>
2. Select the **current model** in the sidebar, then click **Set up Local Models**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/intellij-current-model.png"
alt="IntelliJ current model selector"
width="50%"
/>
</div>
3. Under **Third Party AI Providers**, choose **Ollama**
4. Confirm the **Host URL** is `http://localhost:11434`, then click **Ok**
5. Once connected, select a model under **Local models by Ollama**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/intellij-local-models.png"
alt="IntelliJ local models by Ollama"
width="50%"
/>
</div>
---
title: n8n
---
## Install
Install [n8n](https://docs.n8n.io/choose-n8n/).
## Using Ollama Locally
1. In the top right corner, click the dropdown and select **Create Credential**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/n8n-credential-creation.png"
alt="Create a n8n Credential"
width="75%"
/>
</div>
2. Under **Add new credential** select **Ollama**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/n8n-ollama-form.png"
alt="Select Ollama under Credential"
width="75%"
/>
</div>
3. Confirm Base URL is set to `http://localhost:11434` and click **Save**
<Note> If connecting to `http://localhost:11434` fails, use `http://127.0.0.1:11434`</Note>
4. When creating a new workflow, select **Add a first step** and select an **Ollama node**
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/n8n-chat-node.png"
alt="Add a first step with Ollama node"
width="75%"
/>
</div>
5. Select your model of choice (e.g. `qwen3-coder`)
<div style={{ display: 'flex', justifyContent: 'center' }}>
<img
src="/images/n8n-models.png"
alt="Set up Ollama credentials"
width="75%"
/>
</div>
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on **ollama.com**.
2. In n8n, click **Create Credential** and select **Ollama**
3. Set the **API URL** to `https://ollama.com`
4. Enter your **API Key** and click **Save**
---
title: Roo Code
---
## Install
Install [Roo Code](https://marketplace.visualstudio.com/items?itemName=RooVeterinaryInc.roo-cline) from the VS Code Marketplace.
## Usage with Ollama
1. Open Roo Code in VS Code and click the **gear icon** on the top right corner of the Roo Code window to open **Provider Settings**
2. Set `API Provider` to `Ollama`
3. (Optional) Update `Base URL` if your Ollama instance is running remotely. The default is `http://localhost:11434`
4. Enter a valid `Model ID` (for example `qwen3` or `qwen3-coder:480b-cloud`)
5. Adjust the `Context Window` to at least 32K tokens for coding tasks
<Note>Coding tools require a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com
2. Enable `Use custom base URL` and set it to `https://ollama.com`
3. Enter your **Ollama API Key**
4. Select a model from the list
### Recommended Models
- `qwen3-coder:480b`
- `deepseek-v3.1:671b`