# API

## Endpoints

- [Generate a completion](#generate-a-completion)
- [Generate a chat completion](#generate-a-chat-completion)
- [Create a Model](#create-a-model)
- [List Local Models](#list-local-models)
- [Show Model Information](#show-model-information)
- [Copy a Model](#copy-a-model)
- [Delete a Model](#delete-a-model)
- [Pull a Model](#pull-a-model)
- [Push a Model](#push-a-model)
- [Generate Embeddings](#generate-embeddings)

## Conventions

### Model names

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
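
A client that needs to normalize names can split on the first `:` and default the tag. A minimal sketch (an illustrative helper, not part of the API):

```python
# Illustrative helper (not part of the API): split "model:tag" into its
# parts, defaulting the tag to "latest" as described above.
def parse_model_name(name: str) -> tuple[str, str]:
    model, _, tag = name.partition(":")
    return model, tag or "latest"
```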

### Durations

All durations are returned in nanoseconds.

### Streaming responses

Certain endpoints stream responses as JSON objects.
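
Each streamed object arrives on its own line, so a client can parse the stream line by line. A sketch with canned data (the sample objects are illustrative):

```python
import json

# Streamed responses are newline-delimited JSON objects; each line can
# be parsed independently. The stream below is illustrative canned data.
stream = '{"response": "The", "done": false}\n{"response": " sky", "done": false}\n{"done": true}\n'

objects = [json.loads(line) for line in stream.splitlines() if line]
```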

## Generate a completion

```shell
POST /api/generate
```

Generate a response for a given prompt with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.

### Parameters

- `model`: (required) the [model name](#model-names)
- `prompt`: the prompt to generate a response for

Advanced parameters (optional):

- `format`: the format to return a response in. Currently the only accepted value is `json`
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `system`: system message (overrides what is defined in the `Modelfile`)
- `template`: the full prompt or prompt template (overrides what is defined in the `Modelfile`)
- `context`: the context parameter returned from a previous request to `/generate`; this can be used to keep a short conversational memory
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
- `raw`: if `true` no formatting will be applied to the prompt. You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API.

### JSON mode

Enable JSON mode by setting the `format` parameter to `json`. This will structure the response as valid JSON. See the JSON mode [example](#request-json-mode) below.

> Note: it's important to instruct the model to use JSON in the `prompt`. Otherwise, the model may generate large amounts of whitespace.

### Examples

#### Request

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
```

#### Response

A stream of JSON objects is returned:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T08:52:19.385406455-07:00",
  "response": "The",
  "done": false
}
```

The final response in the stream also includes additional data about the generation:

- `total_duration`: time spent generating the response
- `load_duration`: time spent in nanoseconds loading the model
- `sample_count`: number of samples generated
- `sample_duration`: time spent generating samples
- `prompt_eval_count`: number of tokens in the prompt
- `prompt_eval_duration`: time spent in nanoseconds evaluating the prompt
- `eval_count`: number of tokens in the response
- `eval_duration`: time in nanoseconds spent generating the response
- `context`: an encoding of the conversation used in this response; this can be sent in the next request to keep a conversational memory
- `response`: empty if the response was streamed; if not streamed, this will contain the full response

To calculate how fast the response is generated in tokens per second (token/s), divide `eval_count` by `eval_duration` and multiply by `10^9` (since `eval_duration` is in nanoseconds).
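
A small helper for that calculation might look like this (an illustrative sketch using the nanosecond units described above):

```python
# eval_count is a token count and eval_duration is in nanoseconds,
# so scale by 1e9 to get tokens per second.
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    return eval_count / eval_duration_ns * 1e9
```

With the final-response values shown here, `tokens_per_second(113, 1325948000)` comes out to roughly 85 token/s.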

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "response": "",
  "context": [1, 2, 3],
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 113,
  "eval_duration": 1325948000
}
```

#### Request (No streaming)

A response can be received in one reply when streaming is off.

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

#### Response

If `stream` is set to `false`, the response will be a single JSON object:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "response": "The sky is blue because it is the color of the sky.",
  "context": [1, 2, 3],
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 13,
  "eval_duration": 1325948000
}
```

#### Request (Raw Mode)

In some cases you may wish to bypass the templating system and provide a full prompt. In this case, you can use the `raw` parameter to disable formatting.

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "[INST] why is the sky blue? [/INST]",
  "raw": true,
  "stream": false
}'
```

#### Response

```json
{
  "model": "mistral",
  "created_at": "2023-11-03T15:36:02.583064Z",
  "response": " The sky appears blue because of a phenomenon called Rayleigh scattering.",
  "context": [1, 2, 3],
  "done": true,
  "total_duration": 14648695333,
  "load_duration": 3302671417,
  "prompt_eval_count": 14,
  "prompt_eval_duration": 286243000,
  "eval_count": 129,
  "eval_duration": 10931424000
}
```

#### Request (JSON mode)

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "What color is the sky at different times of the day? Respond using JSON",
  "format": "json",
  "stream": false
}'
```

#### Response

```json
{
  "model": "llama2",
  "created_at": "2023-11-09T21:07:55.186497Z",
  "response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n",
  "done": true,
  "total_duration": 4661289125,
  "load_duration": 1714434500,
  "prompt_eval_count": 36,
  "prompt_eval_duration": 264132000,
  "eval_count": 75,
  "eval_duration": 2112149000
}
```

The value of `response` will be a string containing JSON similar to:

```json
{
  "morning": {
    "color": "blue"
  },
  "noon": {
    "color": "blue-gray"
  },
  "afternoon": {
    "color": "warm gray"
  },
  "evening": {
    "color": "orange"
  }
}
```
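
Since `response` arrives as a string, a client parses it a second time to get structured data. A sketch with an abbreviated, illustrative version of the string above:

```python
import json

# In JSON mode the "response" field is itself JSON encoded as a string,
# so it must be parsed again client-side. Abbreviated illustrative value:
raw = '{"morning": {"color": "blue"}, "evening": {"color": "orange"}}'

data = json.loads(raw)
```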

#### Request (With options)

If you want to set custom options for the model at runtime rather than in the Modelfile, you can do so with the `options` parameter. This example sets every available option, but you can set any of them individually and omit the ones you do not want to override.

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false,
  "options": {
    "num_keep": 5,
    "seed": 42,
    "num_predict": 100,
    "top_k": 20,
    "top_p": 0.9,
    "tfs_z": 0.5,
    "typical_p": 0.7,
    "repeat_last_n": 33,
    "temperature": 0.8,
    "repeat_penalty": 1.2,
    "presence_penalty": 1.5,
    "frequency_penalty": 1.0,
    "mirostat": 1,
    "mirostat_tau": 0.8,
    "mirostat_eta": 0.6,
    "penalize_newline": true,
    "stop": ["\n", "user:"],
    "numa": false,
    "num_ctx": 1024,
    "num_batch": 2,
    "num_gqa": 1,
    "num_gpu": 1,
    "main_gpu": 0,
    "low_vram": false,
    "f16_kv": true,
    "logits_all": false,
    "vocab_only": false,
    "use_mmap": true,
    "use_mlock": false,
    "embedding_only": false,
    "rope_frequency_base": 1.1,
    "rope_frequency_scale": 0.8,
    "num_thread": 8
  }
}'
```

#### Response

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "response": "The sky is blue because it is the color of the sky.",
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 13,
  "eval_duration": 1325948000
}
```

## Generate a chat completion

```shell
POST /api/chat
```

Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.

### Parameters

- `model`: (required) the [model name](#model-names)
- `messages`: the messages of the chat, this can be used to keep a chat memory

  A message is an object with the following fields:
  - `role`: the role of the message, either `user`, `assistant` or `system`
  - `content`: the content of the message

Advanced parameters (optional):

- `format`: the format to return a response in. Currently the only accepted value is `json`
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `template`: the full prompt or prompt template (overrides what is defined in the `Modelfile`)
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects

### Examples

#### Request

Send a chat message with a streaming response.

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'
```

#### Response

A stream of JSON objects is returned:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T08:52:19.385406455-07:00",
  "message": {
    "role": "assistant",
    "content": "The"
  },
  "done": false
}
```

Final response:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 113,
  "eval_duration": 1325948000
}
```
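
A client typically concatenates the streamed `message.content` fragments and appends the result to its message history before the next request. A sketch with canned chunks (the fragment values are illustrative):

```python
# Reassemble a streamed assistant reply and append it to the chat
# history. The chunks below are illustrative canned stream objects.
chunks = [
    {"message": {"role": "assistant", "content": "The"}, "done": False},
    {"message": {"role": "assistant", "content": " sky"}, "done": False},
    {"done": True},
]

messages = [{"role": "user", "content": "why is the sky blue?"}]
reply = "".join(c["message"]["content"] for c in chunks if not c["done"])
messages.append({"role": "assistant", "content": reply})
```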

#### Request (With History)

Send a chat message with a conversation history.

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    },
    {
      "role": "assistant",
      "content": "due to rayleigh scattering."
    },
    {
      "role": "user",
      "content": "how is that different than mie scattering?"
    }
  ]
}'
```

#### Response

A stream of JSON objects is returned:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T08:52:19.385406455-07:00",
  "message": {
    "role": "assistant",
    "content": "The"
  },
  "done": false
}
```

Final response:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 113,
  "eval_duration": 1325948000
}
```

## Create a Model

```shell
POST /api/create
```

Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `modelfile` to the content of the Modelfile rather than just set `path`. This is a requirement for remote create. Remote model creation should also create any file blobs referenced by fields such as `FROM` and `ADAPTER` explicitly with the server using [Create a Blob](#create-a-blob), and set those fields to the path indicated in the response.
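
One way a client might build the request body with an inline Modelfile (a sketch; the name and Modelfile content mirror the example below):

```python
import json

# Send the Modelfile content inline via the "modelfile" field rather
# than a "path"; json.dumps handles escaping the embedded newlines.
modelfile = "FROM llama2\nSYSTEM You are mario from Super Mario Bros."

body = json.dumps({"name": "mario", "modelfile": modelfile})
```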

### Parameters

- `name`: name of the model to create
- `modelfile` (optional): contents of the Modelfile
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects
- `path` (optional): path to the Modelfile

### Examples

#### Request

```shell
curl http://localhost:11434/api/create -d '{
  "name": "mario",
  "modelfile": "FROM llama2\nSYSTEM You are mario from Super Mario Bros."
}'
```

#### Response

A stream of JSON objects is returned. When finished, `status` is `success`.

```json
{
  "status": "parsing modelfile"
}
```

### Check if a Blob Exists

```shell
HEAD /api/blobs/:digest
```

Check if a blob is known to the server.

#### Query Parameters

- `digest`: the SHA256 digest of the blob

#### Examples

##### Request

```shell
curl -I http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
```

##### Response

Return 200 OK if the blob exists, 404 Not Found if it does not.

### Create a Blob

```shell
POST /api/blobs/:digest
```

Create a blob from a file. Returns the server file path.

#### Query Parameters

- `digest`: the expected SHA256 digest of the file

#### Examples

##### Request

```shell
curl -T model.bin -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
```

##### Response

Return 201 Created if the blob was successfully created.
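
A client computes the `:digest` path component from the file contents before uploading. A sketch with inline contents (the bytes stand in for a real model file):

```python
import hashlib

# The digest in the URL is "sha256:" plus the hex SHA256 of the file
# being uploaded. Illustrative inline contents stand in for model.bin.
contents = b"hello"

digest = "sha256:" + hashlib.sha256(contents).hexdigest()
```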

## List Local Models

```shell
GET /api/tags
```

List models that are available locally.

### Examples

#### Request

```shell
curl http://localhost:11434/api/tags
```

#### Response

A single JSON object will be returned.

```json
{
  "models": [
    {
      "name": "llama2",
      "modified_at": "2023-08-02T17:02:23.713454393-07:00",
      "size": 3791730596
    },
    {
      "name": "llama2:13b",
      "modified_at": "2023-08-08T12:08:38.093596297-07:00",
      "size": 7323310500
    }
  ]
}
```
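
The `size` field is in bytes; a client displaying the list might convert it for readability, for example (an illustrative helper):

```python
# "size" is reported in bytes; format it as gigabytes for display.
def human_size(num_bytes: int) -> str:
    return f"{num_bytes / 1e9:.1f} GB"
```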

## Show Model Information

```shell
POST /api/show
```

Show details about a model including modelfile, template, parameters, license, and system message.

### Parameters

- `name`: name of the model to show

### Examples

#### Request

```shell
curl http://localhost:11434/api/show -d '{
  "name": "llama2"
}'
```

#### Response

```json
{
  "license": "<contents of license block>",
  "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llama2:latest\n\nFROM /Users/username/.ollama/models/blobs/sha256:8daa9615cce30c259a9555b1cc250d461d1bc69980a274b44d7eda0be78076d8\nTEMPLATE \"\"\"[INST] {{ if and .First .System }}<<SYS>>{{ .System }}<</SYS>>\n\n{{ end }}{{ .Prompt }} [/INST] \"\"\"\nSYSTEM \"\"\"\"\"\"\nPARAMETER stop [INST]\nPARAMETER stop [/INST]\nPARAMETER stop <<SYS>>\nPARAMETER stop <</SYS>>\n",
  "parameters": "stop                           [INST]\nstop                           [/INST]\nstop                           <<SYS>>\nstop                           <</SYS>>",
  "template": "[INST] {{ if and .First .System }}<<SYS>>{{ .System }}<</SYS>>\n\n{{ end }}{{ .Prompt }} [/INST] "
}
```

## Copy a Model

```shell
POST /api/copy
```

Copy a model. Creates a model with another name from an existing model.

### Examples

#### Request

```shell
curl http://localhost:11434/api/copy -d '{
  "source": "llama2",
  "destination": "llama2-backup"
}'
```

#### Response

The only response is a 200 OK if successful.

## Delete a Model

```shell
DELETE /api/delete
```

Delete a model and its data.

### Parameters

- `name`: model name to delete

### Examples

#### Request

```shell
curl -X DELETE http://localhost:11434/api/delete -d '{
  "name": "llama2:13b"
}'
```

#### Response

If successful, the only response is a 200 OK.

## Pull a Model

```shell
POST /api/pull
```

Download a model from the ollama library. Cancelled pulls are resumed from where they left off, and multiple calls will share the same download progress.

### Parameters

- `name`: name of the model to pull
- `insecure`: (optional) allow insecure connections to the library. Only use this if you are pulling from your own library during development.
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects

### Examples

#### Request

```shell
curl http://localhost:11434/api/pull -d '{
  "name": "llama2"
}'
```

#### Response

If `stream` is not specified, or set to `true`, a stream of JSON objects is returned:

The first object is the manifest:

```json
{
  "status": "pulling manifest"
}
```

Then there is a series of downloading responses. Until a download is completed, the `completed` key may not be included. The number of files to be downloaded depends on the number of layers specified in the manifest.

```json
{
  "status": "downloading digestname",
  "digest": "digestname",
  "total": 2142590208,
  "completed": 241970
}
```
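
Progress for a layer can be derived from `completed` and `total`; since `completed` may be absent early on, a client should treat it as zero. An illustrative sketch:

```python
# Derive percent complete for one download object; "completed" may be
# missing before any bytes have arrived, so default it to 0.
def percent_done(obj: dict) -> float:
    return 100 * obj.get("completed", 0) / obj["total"]
```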

After all the files are downloaded, the final responses are:

```json
{
    "status": "verifying sha256 digest"
}
{
    "status": "writing manifest"
}
{
    "status": "removing any unused layers"
}
{
    "status": "success"
}
```

If `stream` is set to `false`, then the response is a single JSON object:

```json
{
  "status": "success"
}
```

## Push a Model

```shell
POST /api/push
```

Upload a model to a model library. Requires registering for ollama.ai and adding a public key first.

### Parameters

- `name`: name of the model to push in the form of `<namespace>/<model>:<tag>`
- `insecure`: (optional) allow insecure connections to the library. Only use this if you are pushing to your library during development.
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects

### Examples

#### Request

```shell
curl http://localhost:11434/api/push -d '{
  "name": "mattw/pygmalion:latest"
}'
```

#### Response

If `stream` is not specified, or set to `true`, a stream of JSON objects is returned:

```json
{ "status": "retrieving manifest" }
```

and then:

```json
{
  "status": "starting upload",
  "digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab",
  "total": 1928429856
}
```

Then there is a series of uploading responses:

```json
{
  "status": "starting upload",
  "digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab",
  "total": 1928429856
}
```

Finally, when the upload is complete:

```json
{"status":"pushing manifest"}
{"status":"success"}
```

If `stream` is set to `false`, then the response is a single JSON object:

```json
{ "status": "success" }
```

## Generate Embeddings

```shell
POST /api/embeddings
```

Generate embeddings from a model.

### Parameters

- `model`: name of model to generate embeddings from
- `prompt`: text to generate embeddings for

Advanced parameters:

- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`

### Examples

#### Request

```shell
curl http://localhost:11434/api/embeddings -d '{
  "model": "llama2",
  "prompt": "Here is an article about llamas..."
}'
```

#### Response

```json
{
  "embedding": [
    0.5670403838157654, 0.009260174818336964, 0.23178744316101074, -0.2916173040866852, -0.8924556970596313,
    0.8785552978515625, -0.34576427936553955, 0.5742510557174683, -0.04222835972905159, -0.137906014919281
  ]
}
```
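
Embeddings are plain vectors of floats; a common client-side use is comparing two of them with cosine similarity (an illustrative helper, not part of the API):

```python
import math

# Cosine similarity between two embedding vectors of equal length.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```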