<div align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" height="200px" srcset="https://github.com/jmorganca/ollama/assets/3325447/56ea1849-1284-4645-8970-956de6e51c3c">
    <img alt="logo" height="200px" src="https://github.com/jmorganca/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7">
  </picture>
</div>

# Ollama

[![Discord](https://dcbadge.vercel.app/api/server/ollama?style=flat&compact=true)](https://discord.gg/ollama)

Run, create, and share large language models (LLMs).

> Note: Ollama is in early preview. Please report any issues you find.

## Download

- [Download](https://ollama.ai/download) for macOS
- Download for Windows and Linux (coming soon)
- Build [from source](#building)

## Quickstart

To run and chat with [Llama 2](https://ai.meta.com/llama), the new model by Meta:

```
ollama run llama2
```

## Model library

Ollama supports a list of open-source models available on [ollama.ai/library](https://ollama.ai/library 'ollama model library').

Here are some example open-source models that can be downloaded:

| Model                    | Parameters | Size  | Download                        |
| ------------------------ | ---------- | ----- | ------------------------------- |
| Llama2                   | 7B         | 3.8GB | `ollama pull llama2`            |
| Llama2 13B               | 13B        | 7.3GB | `ollama pull llama2:13b`        |
| Llama2 70B               | 70B        | 39GB  | `ollama pull llama2:70b`        |
| Llama2 Uncensored        | 7B         | 3.8GB | `ollama pull llama2-uncensored` |
| Code Llama               | 7B         | 3.8GB | `ollama pull codellama`         |
| Orca Mini                | 3B         | 1.9GB | `ollama pull orca-mini`         |
| Vicuna                   | 7B         | 3.8GB | `ollama pull vicuna`            |
| Nous-Hermes              | 7B         | 3.8GB | `ollama pull nous-hermes`       |
| Nous-Hermes 13B          | 13B        | 7.3GB | `ollama pull nous-hermes:13b`   |
| Wizard Vicuna Uncensored | 13B        | 7.3GB | `ollama pull wizard-vicuna`     |

> Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.

## Examples

### Pull a public model

```
ollama pull llama2
```

> This command can also be used to update a local model. Only the diff will be pulled.

### Run a model interactively

```
ollama run llama2
>>> hi
Hello! How can I help you today?
```

For multiline input, you can wrap text with `"""`:

```
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
```

### Run a model non-interactively

```
$ ollama run llama2 'tell me a joke'
 Sure! Here's a quick one:
 Why did the scarecrow win an award? Because he was outstanding in his field!
```

```
$ cat <<EOF >prompts.txt
tell me a joke about llamas
tell me another one
EOF
$ ollama run llama2 <prompts.txt
>>> tell me a joke about llamas
 Why did the llama refuse to play hide-and-seek?
 nobody likes to be hided!

>>> tell me another one
 Sure, here's another one:

Why did the llama go to the bar?
To have a hay-often good time!
```

### Run a model on contents of a text file

```
$ ollama run llama2 "summarize this file:" "$(cat README.md)"
 Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
```

### Customize a model

Pull a base model:

```
ollama pull llama2
```

Create a `Modelfile`:

```
FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```

For more examples, see the [examples](./examples) directory. For more information on creating a Modelfile, see the [Modelfile](./docs/modelfile.md) documentation.

### Listing local models

```
ollama list
```

### Removing local models

```
ollama rm llama2
```

## Model packages

### Overview

Ollama bundles model weights, configurations, and data into a single package, defined by a [Modelfile](./docs/modelfile.md).

<picture>
  <source media="(prefers-color-scheme: dark)" height="480" srcset="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
  <img alt="logo" height="480" src="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
</picture>

## Building

Install `cmake` and `go`:

```
brew install cmake
brew install go
```

Then generate dependencies and build:

```
go generate ./...
go build .
```

Next, start the server:

```
./ollama serve
```

Finally, in a separate shell, run a model:

```
./ollama run llama2
```

## REST API

> See the [API documentation](./docs/api.md) for all endpoints.

Ollama has an API for running and managing models. For example, to generate text from a model:

```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt":"Why is the sky blue?"
}'
```
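
The generate endpoint streams its output as it is produced. Below is a minimal sketch of consuming that stream from Go; it assumes the server returns newline-delimited JSON objects, each carrying a partial `response` string and a final `done` flag (see the [API documentation](./docs/api.md) for the exact schema):

```
// A minimal streaming client for the generate endpoint.
// Assumption: the response is newline-delimited JSON objects,
// each with a partial "response" string and a final "done" flag.
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	payload := []byte(`{"model": "llama2", "prompt": "Why is the sky blue?"}`)

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Read the stream line by line, printing each partial response as it arrives.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var chunk struct {
			Response string `json:"response"`
			Done     bool   `json:"done"`
		}
		if err := json.Unmarshal(scanner.Bytes(), &chunk); err != nil {
			panic(err)
		}
		fmt.Print(chunk.Response)
		if chunk.Done {
			break
		}
	}
	fmt.Println()
}
```

With `ollama serve` running, this should print the model's answer incrementally rather than waiting for the full completion.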

## Community Projects using Ollama

| Project                                                                    | Description                                                                                                                                                  |
| -------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| [LangChain][1] and [LangChain.js][2]                                       | Ollama integrations for LangChain and LangChain.js, with a question-answering [example][3].                                                                  |
| [Continue](https://github.com/continuedev/continue)                        | Embeds Ollama inside Visual Studio Code. The extension lets you highlight code to add to the prompt, ask questions in the sidebar, and generate code inline. |
| [LiteLLM](https://github.com/BerriAI/litellm)                              | Lightweight Python package to simplify LLM API calls.                                                                                                        |
| [Discord AI Bot](https://github.com/mekb-turtle/discord-ai-bot)            | Interact with Ollama as a chatbot on Discord.                                                                                                                |
| [Raycast Ollama](https://github.com/MassimilianoPasquini97/raycast_ollama) | Raycast extension for running local llama inference with Ollama.                                                                                             |
| [Simple HTML UI](https://github.com/rtcfirefly/ollama-ui)                  | Simple HTML UI for Ollama; a Chrome extension is also available.                                                                                             |
| [Ollama-GUI](https://github.com/ollama-interface/Ollama-Gui?tab=readme-ov-file)                  | 🖥️ Mac Chat Interface ⚡️                                                                                                                           |
| [Emacs client](https://github.com/zweifisch/ollama)                        | An Ollama client for Emacs.                                                                                                                                  |

[1]: https://python.langchain.com/docs/integrations/llms/ollama
[2]: https://js.langchain.com/docs/modules/model_io/models/llms/integrations/ollama
[3]: https://js.langchain.com/docs/use_cases/question_answering/local_retrieval_qa