---
title: OpenAI compatibility
---

Ollama provides compatibility with parts of the [OpenAI API](https://platform.openai.com/docs/api-reference) to help connect existing applications to Ollama.

## Usage

### Simple `v1/chat/completions` example

<CodeGroup dropdown>

```python basic.py
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='gpt-oss:20b',
)
print(chat_completion.choices[0].message.content)
```

```javascript basic.js
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "http://localhost:11434/v1/",
  apiKey: "ollama", // required but ignored
});

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: "user", content: "Say this is a test" }],
  model: "gpt-oss:20b",
});

console.log(chatCompletion.choices[0].message.content);
```

```shell basic.sh
curl -X POST http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
  "model": "gpt-oss:20b",
  "messages": [{ "role": "user", "content": "Say this is a test" }]
}'
```

</CodeGroup>

### Simple `v1/responses` example

<CodeGroup dropdown>

```python responses.py
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

responses_result = client.responses.create(
    model='qwen3:8b',
    input='Write a short poem about the color blue',
)
print(responses_result.output_text)
```

```javascript responses.js
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "http://localhost:11434/v1/",
  apiKey: "ollama", // required but ignored
});

const responsesResult = await openai.responses.create({
  model: "qwen3:8b",
  input: "Write a short poem about the color blue",
});

console.log(responsesResult.output_text);
```

```shell responses.sh
curl -X POST http://localhost:11434/v1/responses \
-H "Content-Type: application/json" \
-d '{
  "model": "qwen3:8b",
  "input": "Write a short poem about the color blue"
}'
```

</CodeGroup>

### `v1/chat/completions` with vision example

<CodeGroup dropdown>

```python vision.py
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

response = client.chat.completions.create(
    model='qwen3-vl:8b',
    messages=[
        {
            'role': 'user',
            'content': [
                {'type': 'text', 'text': "What's in this image?"},
                {
                    'type': 'image_url',
                    'image_url': '',
                },
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)
```

```javascript vision.js
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "http://localhost:11434/v1/",
  apiKey: "ollama", // required but ignored
});

const response = await openai.chat.completions.create({
  model: "qwen3-vl:8b",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image_url",
          image_url:
            "",
        },
      ],
    },
  ],
});
console.log(response.choices[0].message.content);
```

```shell vision.sh
curl -X POST http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
  "model": "qwen3-vl:8b",
  "messages": [{ "role": "user", "content": [{"type": "text", "text": "What is this an image of?"}, {"type": "image_url", "image_url": ""}]}]
}'
```

</CodeGroup>
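
The vision examples above pass the image as a string in `image_url`, and the supported request fields for `/v1/chat/completions` list base64-encoded images as supported and image URLs as unsupported. A minimal sketch of building such a message from raw image bytes follows. It assumes the server accepts a `data:<mime type>;base64,...` data URL in `image_url`; the helper names `image_content_part` and `vision_message` are hypothetical and not part of any library.

```python
import base64


def image_content_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build an `image_url` content part from raw image bytes.

    Assumption: the server accepts a base64 data URL in `image_url`.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": f"data:{mime_type};base64,{encoded}",
    }


def vision_message(prompt: str, image_bytes: bytes) -> dict:
    """Combine a text part and an image part into one user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            image_content_part(image_bytes),
        ],
    }
```

The returned dictionary can be passed as the single entry of `messages` in the `client.chat.completions.create(...)` call shown above, with the image bytes read from a local file.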

## Endpoints

### `/v1/chat/completions`

#### Supported features

- [x] Chat completions
- [x] Streaming
- [x] JSON mode
- [x] Reproducible outputs
- [x] Vision
- [x] Tools
- [ ] Logprobs
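
Streaming is listed as supported above. As a rough illustration of what a client does with a streamed reply when it is not using the OpenAI SDK, the sketch below parses OpenAI-style server-sent event lines. It assumes the stream consists of `data: {...}` lines carrying `choices[].delta.content` and ends with `data: [DONE]`, as in the OpenAI API; the function name `iter_stream_text` and the sample lines are made up for the example and are not captured server output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        # A usage-only chunk may carry an empty `choices` list.
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration only.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello"
```

With the OpenAI SDK, passing `stream=True` to `client.chat.completions.create(...)` returns an iterator of already-parsed chunks, so this manual parsing is only needed for raw HTTP clients.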

#### Supported request fields

- [x] `model`
- [x] `messages`
  - [x] Text `content`
  - [x] Image `content`
    - [x] Base64 encoded image
    - [ ] Image URL
  - [x] Array of `content` parts
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `response_format`
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `stream_options`
  - [x] `include_usage`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [x] `tools`
- [ ] `tool_choice`
- [ ] `logit_bias`
- [ ] `user`
- [ ] `n`
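
The checklist above can be turned into a small request builder. The sketch below keeps the supported fields and drops the ones marked unsupported before the request is sent. How the server treats an unsupported field is not stated here, so filtering on the client is just one cautious option; `build_chat_request` and `UNSUPPORTED_CHAT_FIELDS` are hypothetical names, and the seed value is arbitrary.

```python
# Fields the list above marks as unsupported for /v1/chat/completions.
UNSUPPORTED_CHAT_FIELDS = {"tool_choice", "logit_bias", "user", "n"}


def build_chat_request(model: str, messages: list, **options) -> dict:
    """Return a request body, dropping fields marked unsupported above."""
    body = {"model": model, "messages": messages}
    for key, value in options.items():
        if key in UNSUPPORTED_CHAT_FIELDS or value is None:
            continue  # skip unsupported or unset options
        body[key] = value
    return body


request = build_chat_request(
    "gpt-oss:20b",
    [{"role": "user", "content": "Reply with a JSON object."}],
    response_format={"type": "json_object"},  # JSON mode
    seed=42,  # reproducible outputs (arbitrary value)
    stream=True,
    stream_options={"include_usage": True},
    n=2,  # dropped: marked unsupported above
)
```

The resulting dictionary can be sent as the JSON body of a `POST` to `/v1/chat/completions`, or unpacked into `client.chat.completions.create(**request)`.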

### `/v1/completions`

#### Supported features

- [x] Completions
- [x] Streaming
- [x] JSON mode
- [x] Reproducible outputs
- [ ] Logprobs

#### Supported request fields

- [x] `model`
- [x] `prompt`
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `stream_options`
  - [x] `include_usage`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [x] `suffix`
- [ ] `best_of`
- [ ] `echo`
- [ ] `logit_bias`
- [ ] `user`
- [ ] `n`

#### Notes

- `prompt` currently only accepts a string
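
Because `prompt` only accepts a string, code written for the OpenAI API that passes a list of prompts needs to send one request per prompt. A minimal sketch of that adaptation, using hypothetical helper names and the `llama3.2` model from the Models section purely as a placeholder:

```python
from typing import List


def build_completion_request(model: str, prompt: str, **options) -> dict:
    """Build a /v1/completions body for a single string prompt."""
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a single string")
    return {"model": model, "prompt": prompt, **options}


def completion_requests(model: str, prompts: List[str], **options) -> List[dict]:
    """Fan a list of prompts out into one request body per prompt."""
    return [build_completion_request(model, p, **options) for p in prompts]


bodies = completion_requests("llama3.2", ["Once upon a time", "In a galaxy"], max_tokens=32)
print(len(bodies))  # prints 2
```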

### `/v1/models`

#### Notes

- `created` corresponds to when the model was last modified
- `owned_by` corresponds to the Ollama username, defaulting to `"library"`
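
To make the two notes concrete, the sketch below reads `created` and `owned_by` from a model list. It assumes the response follows the OpenAI list shape (`{"object": "list", "data": [...]}`) and that `created` is a Unix timestamp in seconds; the sample payload is hypothetical and its values are illustrative, not real server output.

```python
from datetime import datetime, timezone

# Hypothetical response body in the OpenAI list shape.
sample_models = {
    "object": "list",
    "data": [
        {"id": "llama3.2", "object": "model", "created": 1700000000, "owned_by": "library"},
    ],
}


def summarize_models(payload: dict) -> list:
    """Return (id, owner, last-modified ISO timestamp) for each model."""
    rows = []
    for model in payload.get("data", []):
        # `created` is interpreted as the model's last-modified time.
        modified = datetime.fromtimestamp(model["created"], tz=timezone.utc)
        rows.append((model["id"], model["owned_by"], modified.isoformat()))
    return rows


for row in summarize_models(sample_models):
    print(row)
```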

### `/v1/models/{model}`

#### Notes

- `created` corresponds to when the model was last modified
- `owned_by` corresponds to the Ollama username, defaulting to `"library"`

### `/v1/embeddings`

#### Supported request fields

- [x] `model`
- [x] `input`
  - [x] string
  - [x] array of strings
  - [ ] array of tokens
  - [ ] array of token arrays
- [x] `encoding_format`
- [x] `dimensions`
- [ ] `user`
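
Since `input` accepts a string or an array of strings but not token arrays, a client can validate the input before sending it. The sketch below does that; the function name and the model name `my-embedding-model` are placeholders, and the `dimensions` value is arbitrary.

```python
from typing import List, Optional, Union


def build_embeddings_request(
    model: str,
    input_text: Union[str, List[str]],
    dimensions: Optional[int] = None,
) -> dict:
    """Build a /v1/embeddings body, rejecting token-array inputs."""
    is_string = isinstance(input_text, str)
    is_string_list = isinstance(input_text, list) and all(
        isinstance(item, str) for item in input_text
    )
    if not (is_string or is_string_list):
        raise TypeError("input must be a string or a list of strings")
    body = {"model": model, "input": input_text}
    if dimensions is not None:
        body["dimensions"] = dimensions
    return body


# Placeholder model name; substitute an embedding model you have pulled.
embed_request = build_embeddings_request(
    "my-embedding-model", ["why is the sky blue?", "why is the grass green?"], dimensions=256
)
```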

### `/v1/responses`

> Note: Added in Ollama v0.13.3

Ollama supports the [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses). Only stateless requests are supported: there is no `previous_response_id` or `conversation` support.

#### Supported features

- [x] Streaming
- [x] Tools (function calling)
- [x] Reasoning summaries (for thinking models)
- [ ] Stateful requests

#### Supported request fields

- [x] `model`
- [x] `input`
- [x] `instructions`
- [x] `tools`
- [x] `stream`
- [x] `temperature`
- [x] `top_p`
- [x] `max_output_tokens`
- [ ] `previous_response_id` (stateful v1/responses not supported)
- [ ] `conversation` (stateful v1/responses not supported)
- [ ] `truncation`
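
Without `previous_response_id` or `conversation`, a multi-turn exchange has to carry its own history and resend it on every request. The sketch below shows one way to do that on the client. It assumes `input` accepts a list of `role`/`content` messages, as in the OpenAI Responses API; the class `ResponsesHistory` is hypothetical.

```python
from typing import Optional


class ResponsesHistory:
    """Client-side conversation state for a stateless /v1/responses endpoint."""

    def __init__(self, model: str, instructions: Optional[str] = None):
        self.model = model
        self.instructions = instructions
        self.items = []  # full message history, oldest first

    def add_user(self, text: str) -> None:
        self.items.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        self.items.append({"role": "assistant", "content": text})

    def request_body(self) -> dict:
        """Build a request that resends the whole history as `input`."""
        body = {"model": self.model, "input": list(self.items)}
        if self.instructions:
            body["instructions"] = self.instructions
        return body


history = ResponsesHistory("qwen3:8b", instructions="Answer briefly.")
history.add_user("Write a short poem about the color blue")
history.add_assistant("(model reply goes here)")
history.add_user("Now make it rhyme")
body = history.request_body()
```

After each call, the reply text (for example `output_text` from the SDK result) would be appended with `add_assistant` before the next turn.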

## Models

Before using a model, pull it locally with `ollama pull`:

```shell
ollama pull llama3.2
```

### Default model names

For tooling that relies on default OpenAI model names such as `gpt-3.5-turbo`, use `ollama cp` to copy an existing model name to a temporary name:

```shell
ollama cp llama3.2 gpt-3.5-turbo
```

Afterwards, this new model name can be specified in the `model` field:

```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```

### Setting the context size

The OpenAI API has no way to set the context size for a model. If you need to change the context size, create a `Modelfile` that looks like this:

```
FROM <some model>
PARAMETER num_ctx <context size>
```

Use the `ollama create mymodel` command to create a new model with the updated context size. Call the API with the updated model name:

```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "mymodel",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
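
For scripted setups, the `Modelfile` from this section can be generated rather than written by hand. The sketch below renders the two lines shown earlier; `render_modelfile` is a hypothetical helper, and both the base model `llama3.2` and the context size `8192` are illustrative values.

```python
def render_modelfile(base_model: str, num_ctx: int) -> str:
    """Render a Modelfile that overrides the context size of a base model."""
    if num_ctx <= 0:
        raise ValueError("num_ctx must be a positive integer")
    return f"FROM {base_model}\nPARAMETER num_ctx {num_ctx}\n"


# Illustrative values only.
print(render_modelfile("llama3.2", 8192))
```

Saving the output to a file named `Modelfile` and running `ollama create mymodel -f Modelfile` produces the `mymodel` used in the request above.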