---
title: OpenAI compatibility
---

Ollama provides compatibility with parts of the [OpenAI API](https://platform.openai.com/docs/api-reference) to help connect existing applications to Ollama.

## Usage

### OpenAI Python library

```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required but ignored
    api_key='ollama',
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama3.2',
)

response = client.chat.completions.create(
    model="llava",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": "",
                },
            ],
        }
    ],
    max_tokens=300,
)

completion = client.completions.create(
    model="llama3.2",
    prompt="Say this is a test",
)

list_completion = client.models.list()

model = client.models.retrieve("llama3.2")

embeddings = client.embeddings.create(
    model="all-minilm",
    input=["why is the sky blue?", "why is the grass green?"],
)
```

#### Structured outputs

```python
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Define the schema for the response
class FriendInfo(BaseModel):
    name: str
    age: int
    is_available: bool

class FriendList(BaseModel):
    friends: list[FriendInfo]

try:
    completion = client.beta.chat.completions.parse(
        temperature=0,
        model="llama3.1:8b",
        messages=[
            {"role": "user", "content": "I have two friends. The first is Ollama 22 years old busy saving the world, and the second is Alonso 23 years old and wants to hang out. Return a list of friends in JSON format"}
        ],
        response_format=FriendList,
    )

    friends_response = completion.choices[0].message
    if friends_response.parsed:
        print(friends_response.parsed)
    elif friends_response.refusal:
        print(friends_response.refusal)
except Exception as e:
    print(f"Error: {e}")
```

### OpenAI JavaScript library

```javascript
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "http://localhost:11434/v1/",

  // required but ignored
  apiKey: "ollama",
});

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: "user", content: "Say this is a test" }],
  model: "llama3.2",
});

const response = await openai.chat.completions.create({
  model: "llava",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image_url",
          image_url:
            "",
        },
      ],
    },
  ],
});

const completion = await openai.completions.create({
  model: "llama3.2",
  prompt: "Say this is a test.",
});

const listCompletion = await openai.models.list();

const model = await openai.models.retrieve("llama3.2");

const embedding = await openai.embeddings.create({
  model: "all-minilm",
  input: ["why is the sky blue?", "why is the grass green?"],
});
```

### `curl`

```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llava",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
               "url": ""
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'

curl http://localhost:11434/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "prompt": "Say this is a test"
    }'

curl http://localhost:11434/v1/models

curl http://localhost:11434/v1/models/llama3.2

curl http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "all-minilm",
        "input": ["why is the sky blue?", "why is the grass green?"]
    }'
```

## Endpoints

### `/v1/chat/completions`

#### Supported features

- [x] Chat completions
- [x] Streaming
- [x] JSON mode
- [x] Reproducible outputs
- [x] Vision
- [x] Tools
- [ ] Logprobs
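
When `stream` is enabled, the response arrives incrementally rather than as one JSON body. The following is a minimal sketch of reassembling the text, assuming the stream follows the OpenAI server-sent-events wire format (`data: <json>` lines terminated by `data: [DONE]`). The helper name `collect_stream_text` and the sample lines are hypothetical and were not captured from a real server.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream_text(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join the content deltas of a streamed chat completion.

    Returns (full_text, usage), where usage is None unless a chunk carried it.
    """
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        # With stream_options.include_usage set, a chunk may carry token totals.
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                text_parts.append(content)
    return "".join(text_parts), usage


# Hypothetical sample stream, for illustration only.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    'data: {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}}',
    "data: [DONE]",
]

text, usage = collect_stream_text(sample_lines)
print(text)
print(usage)
```

In practice the OpenAI client libraries handle this parsing when `stream=True` is passed, so a hand-written parser like this is mainly useful for debugging raw HTTP traffic.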

#### Supported request fields

- [x] `model`
- [x] `messages`
  - [x] Text `content`
  - [x] Image `content`
    - [x] Base64 encoded image
    - [ ] Image URL
  - [x] Array of `content` parts
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `response_format`
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `stream_options`
  - [x] `include_usage`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [x] `tools`
- [ ] `tool_choice`
- [ ] `logit_bias`
- [ ] `user`
- [ ] `n`
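
Because image content must be supplied as Base64 rather than as a remote URL, a local image has to be encoded before it is sent. The sketch below shows one way to build such a message, assuming the server accepts a `data:` URL in the `image_url.url` field as in the `curl` example above. The helper names `image_content_part` and `vision_message` are hypothetical, and the bytes used here are a placeholder rather than a real image.

```python
import base64


def image_content_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes in an image_url content part (Base64 data URL)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(question: str, image_bytes: bytes) -> dict:
    """Build a user message holding a text part followed by an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            image_content_part(image_bytes),
        ],
    }


# Placeholder bytes; in real use, read them with open(path, "rb").read().
message = vision_message("What's in this image?", b"\x89PNG")
print(message["content"][1]["image_url"]["url"])
```

The resulting dictionary can be passed as one entry of `messages` in a chat completion request to a vision-capable model such as `llava`.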

### `/v1/completions`

#### Supported features

- [x] Completions
- [x] Streaming
- [x] JSON mode
- [x] Reproducible outputs
- [ ] Logprobs

#### Supported request fields

- [x] `model`
- [x] `prompt`
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `stream_options`
  - [x] `include_usage`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [x] `suffix`
- [ ] `best_of`
- [ ] `echo`
- [ ] `logit_bias`
- [ ] `user`
- [ ] `n`

#### Notes

- `prompt` currently only accepts a string
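
Code written against the OpenAI API may pass several prompts in one call. Since `prompt` accepts only a string here, one workaround is to issue one request per prompt. The sketch below only prepares the request bodies; `completion_payloads` is a hypothetical helper, and sending each body to `/v1/completions` is left to the caller.

```python
import json
from typing import Iterable, List


def completion_payloads(model: str, prompts: Iterable[str]) -> List[str]:
    """Return one JSON request body per prompt, each with a string prompt."""
    payloads = []
    for prompt in prompts:
        if not isinstance(prompt, str):
            raise TypeError("each prompt must be a string")
        payloads.append(json.dumps({"model": model, "prompt": prompt}))
    return payloads


bodies = completion_payloads(
    "llama3.2",
    ["Say this is a test", "Say this is another test"],
)
for body in bodies:
    print(body)
```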

### `/v1/models`

#### Notes

- `created` corresponds to when the model was last modified
- `owned_by` corresponds to the Ollama username, defaulting to `"library"`

### `/v1/models/{model}`

#### Notes

- `created` corresponds to when the model was last modified
- `owned_by` corresponds to the Ollama username, defaulting to `"library"`
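
To illustrate how these two fields might be read for either model endpoint, the sketch below parses a list response and converts `created` to a timestamp. It assumes the response follows the OpenAI list shape and that `created` is a Unix time in seconds. The sample values and the helper name `describe_models` are hypothetical.

```python
import json
from datetime import datetime, timezone

# Hypothetical response body, for illustration only.
sample_response = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "llama3.2", "object": "model", "created": 1700000000, "owned_by": "library"}
  ]
}
""")


def describe_models(response: dict) -> list:
    """Return (id, owner, last-modified ISO timestamp in UTC) for each model."""
    rows = []
    for entry in response["data"]:
        modified = datetime.fromtimestamp(entry["created"], tz=timezone.utc)
        rows.append((entry["id"], entry["owned_by"], modified.isoformat()))
    return rows


for row in describe_models(sample_response):
    print(row)
```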

### `/v1/embeddings`

#### Supported request fields

- [x] `model`
- [x] `input`
  - [x] string
  - [x] array of strings
  - [ ] array of tokens
  - [ ] array of token arrays
- [x] `encoding_format`
- [x] `dimensions`
- [ ] `user`
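
Since `input` accepts a string or an array of strings but not token arrays, it can help to validate the value before sending it. The following is a small sketch of such a check; `embeddings_payload` is a hypothetical helper that only builds the JSON body and does not contact the server.

```python
import json
from typing import List, Union


def embeddings_payload(model: str, text: Union[str, List[str]]) -> str:
    """Build a /v1/embeddings request body, rejecting token inputs."""
    is_string = isinstance(text, str)
    is_string_list = isinstance(text, list) and all(
        isinstance(item, str) for item in text
    )
    if not (is_string or is_string_list):
        raise TypeError("input must be a string or a list of strings")
    return json.dumps({"model": model, "input": text})


body = embeddings_payload(
    "all-minilm",
    ["why is the sky blue?", "why is the grass green?"],
)
print(body)
```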

## Models

Before using a model, pull it locally with `ollama pull`:

```shell
ollama pull llama3.2
```

### Default model names

For tooling that relies on default OpenAI model names such as `gpt-3.5-turbo`, use `ollama cp` to copy an existing model name to a temporary name:

```shell
ollama cp llama3.2 gpt-3.5-turbo
```

Afterwards, this new model name can be specified in the `model` field:

```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```

### Setting the context size

The OpenAI API does not have a way of setting the context size for a model. If you need to change the context size, create a `Modelfile` which looks like:

```
FROM <some model>
PARAMETER num_ctx <context size>
```

Use the `ollama create mymodel` command to create a new model with the updated context size. Call the API with the updated model name:

```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "mymodel",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
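
If several context sizes are needed, the `Modelfile` described above can be generated rather than written by hand. The sketch below renders the two-line file for a given base model and `num_ctx` value. The helper name `render_modelfile` and the values `llama3.2` and `8192` are illustrative assumptions, not recommendations.

```python
def render_modelfile(base_model: str, num_ctx: int) -> str:
    """Return Modelfile text that sets the context size for base_model."""
    if num_ctx <= 0:
        raise ValueError("num_ctx must be a positive integer")
    return f"FROM {base_model}\nPARAMETER num_ctx {num_ctx}\n"


modelfile_text = render_modelfile("llama3.2", 8192)
print(modelfile_text)
```

Writing this text to a file named `Modelfile` and running `ollama create mymodel -f Modelfile` should then produce a model that can be addressed by name through the `/v1` endpoints.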