# Ollama Model File

A model file is the blueprint to create and share models with Ollama.

## Format

The format of the Modelfile:

```modelfile
# comment
INSTRUCTION arguments
```

| Instruction               | Description                                              |
|-------------------------- |--------------------------------------------------------- |
| `FROM`<br>(required)      | Defines the base model to be used when creating a model  |
| `PARAMETER`<br>(optional) | Sets the parameters for how the model will be run        |
| `TEMPLATE`<br>(optional)  | Sets the prompt template to be used when the model is run |
| `SYSTEM`<br>(optional)    | Sets the system message that will be placed into the template |
| `LICENSE`<br>(optional)   | Specifies the license of the model; multiple `LICENSE` instructions are additive |
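
For example, a license could be supplied as a multi-line argument. This is a sketch that assumes the same triple-quote syntax used by `TEMPLATE` and `SYSTEM` below; substitute the license text that applies to your model:

```modelfile
LICENSE """
<license text>
"""
```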

## Examples

An example of a Modelfile creating a Mario blueprint:

```modelfile
FROM llama2
PARAMETER temperature 1

TEMPLATE """
System: {{ .System }}
User: {{ .Prompt }}
Assistant:
"""

SYSTEM You are Mario from Super Mario Bros, acting as an assistant.
```

To use this:

1. Save it as a file (e.g. `Modelfile`)
2. `ollama create NAME -f <location of the file, e.g. ./Modelfile>`
3. `ollama run NAME`
4. Start using the model!
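
For example, if the Modelfile above is saved as `./Modelfile`, the model can be created and run under a name of your choosing (`mario` here is just an illustration):

```
ollama create mario -f ./Modelfile
ollama run mario
```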

## FROM (Required)

The `FROM` instruction defines the base model to be used when creating a model.

```modelfile
FROM <model name>:<tag>
```

### Build from llama2

```modelfile
FROM llama2:latest
```

A list of available base models:
<https://github.com/jmorganca/ollama#model-library>

### Build from a bin file

```modelfile
FROM ./ollama-model.bin
```

## PARAMETER (Optional)

The `PARAMETER` instruction defines a parameter that can be set when the model is run.

```modelfile
PARAMETER <parameter> <parametervalue>
```

### Valid Parameters and Values

| Parameter     | Description                                                                                                                                                                                                                                              | Value Type | Example Usage     |
|---------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-------------------|
| NumCtx        | Sets the size of the prompt context window used by the model. (Default: 2048)                                                                                                                                                                            | int        | NumCtx 4096       |
| temperature   | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)                                                                                                                                      | float      | temperature 0.7   |
| TopK          | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)                                                                         | int        | TopK 40           |
| TopP          | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)                                                                  | float      | TopP 0.9          |
| NumGPU        | The number of GPUs to use. On macOS it defaults to 1 to enable Metal support, 0 to disable.                                                                                                                                                              | int        | NumGPU 1          |
| RepeatLastN   | Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = ctx-size)                                                                                                                                                      | int        | RepeatLastN 64    |
| RepeatPenalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)                                                                      | float      | RepeatPenalty 1.1 |
| TFSZ          | Tail-free sampling is used to reduce the impact of less probable tokens on the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (Default: 1)                                                  | float      | TFSZ 1            |
| Mirostat      | Enable Mirostat sampling for controlling perplexity. (Default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)                                                                                                                                          | int        | Mirostat 0        |
| MirostatTau   | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)                                                                                                          | float      | MirostatTau 5.0   |
| MirostatEta   | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)                         | float      | MirostatEta 0.1   |
| NumThread     | Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores). | int        | NumThread 8       |
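
As a sketch of how these combine, a Modelfile might set several parameters at once (the values below are illustrative, not recommendations):

```modelfile
FROM llama2
# Lower temperature for more focused, less creative answers
PARAMETER temperature 0.5
# Sample from a smaller, more conservative pool of candidate tokens
PARAMETER TopK 20
PARAMETER TopP 0.5
# Enlarge the prompt context window
PARAMETER NumCtx 4096
```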

## TEMPLATE

`TEMPLATE` is a set of instructions to an LLM to cause the model to return the desired response(s). A prompt typically has three components: a system message, the user's input, and the model's response.

```modelfile
TEMPLATE """
### System:
{{ .System }}

### Instruction:
{{ .Prompt }}

### Response:
"""

SYSTEM """
You are a content marketer who needs to come up with a short but succinct tweet. Make sure to include the appropriate hashtags and links. Sometimes when appropriate, describe a meme that can be included as well. All answers should be in the form of a tweet, which has a max size of 280 characters. Every instruction will be the topic to create a tweet about.
"""

```

## SYSTEM

The `SYSTEM` instruction sets the system message that is substituted for the `{{ .System }}` variable in the template, as in the examples above.
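
A minimal sketch of the syntax, following the pattern of the examples above:

```modelfile
SYSTEM """
<system message>
"""
```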

## Notes

- The **Modelfile is not case sensitive**. In the examples, we use uppercase instructions to make them easier to distinguish from arguments.
- Instructions can be in any order. In the examples, the `FROM` instruction comes first to keep the file easy to read.