# API

## Endpoints

- [Generate a completion](#generate-a-completion)
- [Send Chat Messages](#send-chat-messages)
- [Create a Model](#create-a-model)
- [List Local Models](#list-local-models)
- [Show Model Information](#show-model-information)
- [Copy a Model](#copy-a-model)
- [Delete a Model](#delete-a-model)
- [Pull a Model](#pull-a-model)
- [Push a Model](#push-a-model)
- [Generate Embeddings](#generate-embeddings)

## Conventions

### Model names

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
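
A minimal sketch of this convention in shell (illustrative only; the `name` value is a placeholder, not part of the API):

```shell
# The tag defaults to "latest" when a name has no ":tag" suffix
name="llama2"
model="${name%%:*}"
tag="${name#*:}"
if [ "$tag" = "$name" ]; then tag="latest"; fi
echo "$model:$tag"   # prints "llama2:latest"
```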
### Durations

All durations are returned in nanoseconds.
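
For example, a `total_duration` of `5589157167` nanoseconds can be converted to seconds by dividing by 1e9 (a quick sketch using `awk`):

```shell
# Convert a nanosecond duration to seconds
echo 5589157167 | awk '{ printf "%.2f s\n", $1 / 1e9 }'   # prints "5.59 s"
```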
### Streaming responses

Certain endpoints stream responses as JSON objects.
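
Each streamed object arrives on its own line of the response body, so a client can read line by line and stop at the object with `"done": true`. A sketch with hard-coded sample lines (no server required):

```shell
# Simulated streamed body: one JSON object per line; the final object has "done": true
stream='{"response":"The","done":false}
{"response":" sky","done":false}
{"response":"","done":true}'

printf '%s\n' "$stream" | while read -r line; do
  case "$line" in
    *'"done":true'*) echo "stream finished" ;;  # final object: stop reading
  esac
done
```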
## Generate a completion

```shell
POST /api/generate
```

Generate a response for a given prompt with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
### Parameters

- `model`: (required) the [model name](#model-names)
- `prompt`: the prompt to generate a response for

Advanced parameters (optional):

- `format`: the format to return a response in. Currently the only accepted value is `json`
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `system`: system prompt (overrides what is defined in the `Modelfile`)
- `template`: the full prompt or prompt template (overrides what is defined in the `Modelfile`)
- `context`: the context parameter returned from a previous request to `/generate`; this can be used to keep a short conversational memory
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
- `raw`: if `true`, no formatting will be applied to the prompt. You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API.
### JSON mode

Enable JSON mode by setting the `format` parameter to `json`. This will structure the response as valid JSON. See the JSON mode [example](#request-json-mode) below.

> Note: it's important to instruct the model to use JSON in the `prompt`. Otherwise, the model may generate large amounts of whitespace.
### Examples

#### Request (Prompt)

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
```
#### Response

A stream of JSON objects is returned:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T08:52:19.385406455-07:00",
  "response": "The",
  "done": false
}
```

The final response in the stream also includes additional data about the generation:

- `total_duration`: time spent generating the response
- `load_duration`: time spent in nanoseconds loading the model
- `sample_count`: number of samples generated
- `sample_duration`: time spent generating samples
- `prompt_eval_count`: number of tokens in the prompt
- `prompt_eval_duration`: time spent in nanoseconds evaluating the prompt
- `eval_count`: number of tokens in the response
- `eval_duration`: time in nanoseconds spent generating the response
- `context`: an encoding of the conversation used in this response; this can be sent in the next request to keep a conversational memory
- `response`: empty if the response was streamed; if not streamed, this will contain the full response

To calculate how fast the response is generated in tokens per second (token/s), divide `eval_count` by `eval_duration`, then multiply by `10^9` to convert from tokens per nanosecond to tokens per second.
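
Using the `eval_count` and `eval_duration` values from the example response below (remembering that durations are in nanoseconds), a quick sketch of the calculation:

```shell
# 113 tokens over 1325948000 ns ≈ 85.2 token/s
echo "113 1325948000" | awk '{ printf "%.1f token/s\n", $1 / ($2 / 1e9) }'
```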

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "response": "",
  "context": [1, 2, 3],
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 113,
  "eval_duration": 1325948000
}
```

#### Request (No streaming)

A response can be received in one reply when streaming is off.

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

#### Response

If `stream` is set to `false`, the response will be a single JSON object:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "response": "The sky is blue because it is the color of the sky.",
  "context": [1, 2, 3],
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 13,
  "eval_duration": 1325948000
}
```

#### Request (Raw Mode)

In some cases you may wish to bypass the templating system and provide a full prompt. In this case, you can use the `raw` parameter to disable formatting.

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "[INST] why is the sky blue? [/INST]",
  "raw": true,
  "stream": false
}'
```

#### Response

```json
{
  "model": "mistral",
  "created_at": "2023-11-03T15:36:02.583064Z",
  "response": " The sky appears blue because of a phenomenon called Rayleigh scattering.",
  "context": [1, 2, 3],
  "done": true,
  "total_duration": 14648695333,
  "load_duration": 3302671417,
  "prompt_eval_count": 14,
  "prompt_eval_duration": 286243000,
  "eval_count": 129,
  "eval_duration": 10931424000
}
```

#### Request (JSON mode)

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "What color is the sky at different times of the day? Respond using JSON",
  "format": "json",
  "stream": false
}'
```

#### Response

```json
{
  "model": "llama2",
  "created_at": "2023-11-09T21:07:55.186497Z",
  "response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n",
  "done": true,
  "total_duration": 4661289125,
  "load_duration": 1714434500,
  "prompt_eval_count": 36,
  "prompt_eval_duration": 264132000,
  "eval_count": 75,
  "eval_duration": 2112149000
}
```

The value of `response` will be a string containing JSON similar to:

```json
{
  "morning": {
    "color": "blue"
  },
  "noon": {
    "color": "blue-gray"
  },
  "afternoon": {
    "color": "warm gray"
  },
  "evening": {
    "color": "orange"
  }
}
```

#### Request (With options)

If you want to set custom options for the model at runtime rather than in the Modelfile, you can do so with the `options` parameter. This example sets every available option, but you can set any of them individually and omit the ones you do not want to override.

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false,
  "options": {
    "num_keep": 5,
    "seed": 42,
    "num_predict": 100,
    "top_k": 20,
    "top_p": 0.9,
    "tfs_z": 0.5,
    "typical_p": 0.7,
    "repeat_last_n": 33,
    "temperature": 0.8,
    "repeat_penalty": 1.2,
    "presence_penalty": 1.5,
    "frequency_penalty": 1.0,
    "mirostat": 1,
    "mirostat_tau": 0.8,
    "mirostat_eta": 0.6,
    "penalize_newline": true,
    "stop": ["\n", "user:"],
    "numa": false,
    "num_ctx": 4,
    "num_batch": 2,
    "num_gqa": 1,
    "num_gpu": 1,
    "main_gpu": 0,
    "low_vram": false,
    "f16_kv": true,
    "logits_all": false,
    "vocab_only": false,
    "use_mmap": true,
    "use_mlock": false,
    "embedding_only": false,
    "rope_frequency_base": 1.1,
    "rope_frequency_scale": 0.8,
    "num_thread": 8
    }
}'
```

#### Response

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "response": "The sky is blue because it is the color of the sky.",
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 13,
  "eval_duration": 1325948000
}
```

## Send Chat Messages (coming in 0.1.14)

```shell
POST /api/chat
```

Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.

### Parameters

- `model`: (required) the [model name](#model-names)
- `messages`: the messages of the chat; this can be used to keep a chat memory

Advanced parameters (optional):

- `format`: the format to return a response in. Currently the only accepted value is `json`
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `template`: the full prompt or prompt template (overrides what is defined in the `Modelfile`)
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects

### Examples

#### Request

Send a chat message with a streaming response.

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'
```

#### Response

A stream of JSON objects is returned:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T08:52:19.385406455-07:00",
  "message": {
    "role": "assistant",
    "content": "The"
  },
  "done": false
}
```

Final response:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 113,
  "eval_duration": 1325948000
}
```

#### Request (With History)

Send a chat message with a conversation history.

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    },
    {
      "role": "assistant",
      "content": "due to rayleigh scattering."
    },
    {
      "role": "user",
      "content": "how is that different than mie scattering?"
    }
  ]
}'
```

#### Response

A stream of JSON objects is returned:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T08:52:19.385406455-07:00",
  "message": {
    "role": "assistant",
    "content": "The"
  },
  "done": false
}
```

Final response:

```json
{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "done": true,
  "total_duration": 5589157167,
  "load_duration": 3013701500,
  "sample_count": 114,
  "sample_duration": 81442000,
  "prompt_eval_count": 46,
  "prompt_eval_duration": 1160282000,
  "eval_count": 113,
  "eval_duration": 1325948000
}
```

## Create a Model

```shell
POST /api/create
```

Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `modelfile` to the content of the Modelfile rather than just setting `path`. This is a requirement for remote create. Remote model creation should also explicitly create any file blobs referenced by fields such as `FROM` and `ADAPTER` using [Create a Blob](#create-a-blob), setting those fields to the path indicated in the response.

### Parameters

- `name`: name of the model to create
- `modelfile` (optional): contents of the Modelfile
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects
- `path` (optional): path to the Modelfile
### Examples

#### Request

```shell
curl http://localhost:11434/api/create -d '{
  "name": "mario",
  "modelfile": "FROM llama2\nSYSTEM You are mario from Super Mario Bros."
}'
```

#### Response

A stream of JSON objects. When finished, `status` is `success`.

```json
{
  "status": "parsing modelfile"
}
```

### Check if a Blob Exists

```shell
HEAD /api/blobs/:digest
```

Check if a blob is known to the server.

#### Query Parameters

- `digest`: the SHA256 digest of the blob

#### Examples

##### Request

```shell
curl -I http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
```

##### Response

Return 200 OK if the blob exists, 404 Not Found if it does not.

### Create a Blob

```shell
POST /api/blobs/:digest
```

Create a blob from a file. Returns the server file path.

#### Query Parameters

- `digest`: the expected SHA256 digest of the file

#### Examples
##### Request

```shell
curl -T model.bin -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
```

##### Response

Return 201 Created if the blob was successfully created.
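
The digest in the URL can be computed locally before uploading; a sketch using `sha256sum` (the `model.bin` file and its contents here are just placeholders):

```shell
# Compute the SHA256 digest of a local file and build the blob path
printf 'example weights' > model.bin
digest=$(sha256sum model.bin | cut -d ' ' -f 1)
echo "/api/blobs/sha256:$digest"
```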

## List Local Models

```shell
GET /api/tags
```

List models that are available locally.

### Examples

#### Request

```shell
curl http://localhost:11434/api/tags
```
#### Response

A single JSON object will be returned.

```json
{
  "models": [
    {
      "name": "llama2",
      "modified_at": "2023-08-02T17:02:23.713454393-07:00",
      "size": 3791730596
    },
    {
      "name": "llama2:13b",
      "modified_at": "2023-08-08T12:08:38.093596297-07:00",
      "size": 7323310500
    }
  ]
}
```

## Show Model Information

```shell
POST /api/show
```

Show details about a model including modelfile, template, parameters, license, and system prompt.

### Parameters

- `name`: name of the model to show
### Examples

#### Request
```shell
curl http://localhost:11434/api/show -d '{
  "name": "llama2"
}'
```
#### Response

```json
{
  "license": "<contents of license block>",
  "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llama2:latest\n\nFROM /Users/username/.ollama/models/blobs/sha256:8daa9615cce30c259a9555b1cc250d461d1bc69980a274b44d7eda0be78076d8\nTEMPLATE \"\"\"[INST] {{ if and .First .System }}<<SYS>>{{ .System }}<</SYS>>\n\n{{ end }}{{ .Prompt }} [/INST] \"\"\"\nSYSTEM \"\"\"\"\"\"\nPARAMETER stop [INST]\nPARAMETER stop [/INST]\nPARAMETER stop <<SYS>>\nPARAMETER stop <</SYS>>\n",
  "parameters": "stop                           [INST]\nstop                           [/INST]\nstop                           <<SYS>>\nstop                           <</SYS>>",
  "template": "[INST] {{ if and .First .System }}<<SYS>>{{ .System }}<</SYS>>\n\n{{ end }}{{ .Prompt }} [/INST] "
}
```

## Copy a Model

```shell
POST /api/copy
```

Copy a model. Creates a model with another name from an existing model.

### Examples

#### Request
```shell
curl http://localhost:11434/api/copy -d '{
  "source": "llama2",
  "destination": "llama2-backup"
}'
```

#### Response

The only response is a 200 OK if successful.

## Delete a Model

```shell
DELETE /api/delete
```

Delete a model and its data.

### Parameters

- `name`: model name to delete
### Examples

#### Request

```shell
curl -X DELETE http://localhost:11434/api/delete -d '{
  "name": "llama2:13b"
}'
```

#### Response

If successful, the only response is a 200 OK.

## Pull a Model

```shell
POST /api/pull
```

Download a model from the ollama library. Cancelled pulls are resumed from where they left off, and multiple calls will share the same download progress.
### Parameters

- `name`: name of the model to pull
- `insecure`: (optional) allow insecure connections to the library. Only use this if you are pulling from your own library during development.
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects
### Examples

#### Request

```shell
curl http://localhost:11434/api/pull -d '{
  "name": "llama2"
}'
```

#### Response
If `stream` is not specified, or set to `true`, a stream of JSON objects is returned:

The first object is the manifest:

```json
{
  "status": "pulling manifest"
}
```

Then there is a series of downloading responses. Until any part of the download has completed, the `completed` key may not be included. The number of files to be downloaded depends on the number of layers specified in the manifest.

```json
{
  "status": "downloading digestname",
  "digest": "digestname",
  "total": 2142590208,
  "completed": 241970
}
```
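
The `completed` and `total` fields can be combined into a progress percentage; a sketch with the example values above:

```shell
# 241970 of 2142590208 bytes ≈ 0.01% downloaded
echo "241970 2142590208" | awk '{ printf "%.2f%%\n", 100 * $1 / $2 }'
```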

After all the files are downloaded, the final responses are:

```json
{
    "status": "verifying sha256 digest"
}
{
    "status": "writing manifest"
}
{
    "status": "removing any unused layers"
}
{
    "status": "success"
}
```

If `stream` is set to `false`, then the response is a single JSON object:

```json
{
  "status": "success"
}
```

## Push a Model

```shell
POST /api/push
```

Upload a model to a model library. Requires registering for ollama.ai and adding a public key first.

### Parameters

- `name`: name of the model to push in the form of `<namespace>/<model>:<tag>`
- `insecure`: (optional) allow insecure connections to the library. Only use this if you are pushing to your library during development.
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects
### Examples

#### Request

```shell
curl http://localhost:11434/api/push -d '{
  "name": "mattw/pygmalion:latest"
}'
```

#### Response

If `stream` is not specified, or set to `true`, a stream of JSON objects is returned:

```json
{ "status": "retrieving manifest" }
```
and then:

```json
{
  "status": "starting upload",
  "digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab",
  "total": 1928429856
}
```

Then there is a series of uploading responses:

```json
{
  "status": "starting upload",
  "digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab",
  "total": 1928429856
}
```

Finally, when the upload is complete:

```json
{"status":"pushing manifest"}
{"status":"success"}
```

If `stream` is set to `false`, then the response is a single JSON object:

```json
{ "status": "success" }
```

## Generate Embeddings

```shell
POST /api/embeddings
```

Generate embeddings from a model.

### Parameters

- `model`: name of model to generate embeddings from
- `prompt`: text to generate embeddings for

Advanced parameters:

- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`

### Examples

#### Request
```shell
curl http://localhost:11434/api/embeddings -d '{
  "model": "llama2",
  "prompt": "Here is an article about llamas..."
}'
```

#### Response

```json
{
  "embedding": [
    0.5670403838157654, 0.009260174818336964, 0.23178744316101074, -0.2916173040866852, -0.8924556970596313,
    0.8785552978515625, -0.34576427936553955, 0.5742510557174683, -0.04222835972905159, -0.137906014919281
  ]
}
```