<div align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" height="200px" srcset="https://github.com/jmorganca/ollama/assets/3325447/56ea1849-1284-4645-8970-956de6e51c3c">
    <img alt="logo" height="200px" src="https://github.com/jmorganca/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7">
  </picture>
</div>

# Ollama

[![Discord](https://dcbadge.vercel.app/api/server/ollama?style=flat&compact=true)](https://discord.gg/ollama)

Run, create, and share large language models (LLMs).

> Note: Ollama is in early preview. Please report any issues you find.

## Download

- [Download](https://ollama.ai/download) for macOS
- Download for Windows and Linux (coming soon)
- Build [from source](#building)

## Quickstart

To run and chat with [Llama 2](https://ai.meta.com/llama), the new model by Meta:

```
ollama run llama2
```

## Model library

Ollama supports a list of open-source models available on [ollama.ai/library](https://ollama.ai/library 'ollama model library').

Here are some example open-source models that can be downloaded:

| Model                    | Parameters | Size  | Download                        |
| ------------------------ | ---------- | ----- | ------------------------------- |
| Llama2                   | 7B         | 3.8GB | `ollama pull llama2`            |
| Llama2 13B               | 13B        | 7.3GB | `ollama pull llama2:13b`        |
| Llama2 70B               | 70B        | 39GB  | `ollama pull llama2:70b`        |
| Llama2 Uncensored        | 7B         | 3.8GB | `ollama pull llama2-uncensored` |
| Code Llama               | 7B         | 3.8GB | `ollama pull codellama`         |
| Orca Mini                | 3B         | 1.9GB | `ollama pull orca-mini`         |
| Vicuna                   | 7B         | 3.8GB | `ollama pull vicuna`            |
| Nous-Hermes              | 7B         | 3.8GB | `ollama pull nous-hermes`       |
| Nous-Hermes 13B          | 13B        | 7.3GB | `ollama pull nous-hermes:13b`   |
| Wizard Vicuna Uncensored | 13B        | 7.3GB | `ollama pull wizard-vicuna`     |

> Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.

## Examples

### Pull a public model

```
ollama pull llama2
```

> This command can also be used to update a local model. Only the changed layers will be pulled.

### Run a model

```
ollama run llama2
>>> hi
Hello! How can I help you today?
```

For multiline input, you can wrap text with `"""`:

```
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
```

### Customize a model

Pull a base model:

```
ollama pull llama2
```

Create a `Modelfile`:

```
FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```

For more examples, see the [examples](./examples) directory. For more information on creating a Modelfile, see the [Modelfile](./docs/modelfile.md) documentation.

### Listing local models

```
ollama list
```

### Removing local models

```
ollama rm llama2
```

## Model packages

### Overview

Ollama bundles model weights, configurations, and data into a single package, defined by a [Modelfile](./docs/modelfile.md).

<picture>
  <source media="(prefers-color-scheme: dark)" height="480" srcset="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
  <img alt="logo" height="480" src="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
</picture>

## Building

Install `cmake`:

```
brew install cmake
```

Then generate dependencies and build:

```
go generate ./...
go build .
```

Next, start the server:

```
./ollama serve
```

Finally, in a separate shell, run a model:

```
./ollama run llama2
```

## REST API

> See the [API documentation](./docs/api.md) for all endpoints.

Ollama has an API for running and managing models. For example, to generate text from a model:

```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
```
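
The generate endpoint streams its output as newline-delimited JSON objects, each carrying a piece of the reply. Below is a minimal sketch of consuming that stream from Python; it assumes the `requests` package is installed and that each streamed object includes a `response` field with a text chunk plus a final object with `done` set to true (see the [API documentation](./docs/api.md) for the exact schema).

```
# Minimal sketch: stream a completion from a local Ollama server.
# Assumes `pip install requests` and `ollama serve` running on the default port 11434.
# Field names ("response", "done") are taken from the API docs; verify against ./docs/api.md.
import json

import requests


def generate(prompt, model="llama2"):
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt},
        stream=True,
    )
    resp.raise_for_status()

    chunks = []
    for line in resp.iter_lines():
        if not line:
            continue
        data = json.loads(line)
        chunks.append(data.get("response", ""))
        if data.get("done"):
            break
    return "".join(chunks)


if __name__ == "__main__":
    print(generate("Why is the sky blue?"))
```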

## Tools using Ollama

- [LangChain](https://python.langchain.com/docs/integrations/llms/ollama) and [LangChain.js](https://js.langchain.com/docs/modules/model_io/models/llms/integrations/ollama) with a question-answering [example](https://js.langchain.com/docs/use_cases/question_answering/local_retrieval_qa) (see the sketch after this list).
- [Continue](https://github.com/continuedev/continue) - embeds Ollama inside Visual Studio Code. The extension lets you highlight code to add to the prompt, ask questions in the sidebar, and generate code inline.
- [LiteLLM](https://github.com/BerriAI/litellm) - a lightweight Python package to simplify LLM API calls.
- [Discord AI Bot](https://github.com/mekb-turtle/discord-ai-bot) - interact with Ollama as a chatbot on Discord.
- [Raycast Ollama](https://github.com/MassimilianoPasquini97/raycast_ollama) - Raycast extension for local llama inference with Ollama.
- [Simple HTML UI for Ollama](https://github.com/rtcfirefly/ollama-ui)
- [Emacs client](https://github.com/zweifisch/ollama) for Ollama
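
For the LangChain entry above, here is a minimal sketch of pointing the integration at a local Ollama server. It assumes the `langchain` package is installed and exposes an `Ollama` LLM class as described in the linked documentation; the exact import path and parameters may differ between LangChain versions, so treat it as illustrative rather than canonical.

```
# Minimal sketch, assuming LangChain's Ollama LLM class from the docs linked above.
# Requires `pip install langchain` and a running Ollama server on the default port.
from langchain.llms import Ollama

llm = Ollama(base_url="http://localhost:11434", model="llama2")
print(llm("Why is the sky blue?"))
```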