<div align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" height="200px" srcset="https://github.com/jmorganca/ollama/assets/3325447/56ea1849-1284-4645-8970-956de6e51c3c">
    <img alt="logo" height="200px" src="https://github.com/jmorganca/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7">
  </picture>
</div>

# Ollama

[![Discord](https://dcbadge.vercel.app/api/server/ollama?style=flat&compact=true)](https://discord.gg/ollama)

Run, create, and share large language models (LLMs).

> Note: Ollama is in early preview. Please report any issues you find.

## Download

- [Download](https://ollama.ai/download) for macOS
- Download for Windows and Linux (coming soon)
- Build [from source](#building)

## Quickstart

To run and chat with [Llama 2](https://ai.meta.com/llama), the new model by Meta:

```
ollama run llama2
```

## Model library

Ollama supports a list of open-source models available on [ollama.ai/library](https://ollama.ai/library 'ollama model library').

Here are some example open-source models that can be downloaded:

| Model                    | Parameters | Size  | Download                        |
| ------------------------ | ---------- | ----- | ------------------------------- |
| Llama2                   | 7B         | 3.8GB | `ollama pull llama2`            |
| Llama2 13B               | 13B        | 7.3GB | `ollama pull llama2:13b`        |
| Llama2 70B               | 70B        | 39GB  | `ollama pull llama2:70b`        |
| Llama2 Uncensored        | 7B         | 3.8GB | `ollama pull llama2-uncensored` |
| Code Llama               | 7B         | 3.8GB | `ollama pull codellama`         |
| Orca Mini                | 3B         | 1.9GB | `ollama pull orca-mini`         |
| Vicuna                   | 7B         | 3.8GB | `ollama pull vicuna`            |
| Nous-Hermes              | 7B         | 3.8GB | `ollama pull nous-hermes`       |
| Nous-Hermes 13B          | 13B        | 7.3GB | `ollama pull nous-hermes:13b`   |
| Wizard Vicuna Uncensored | 13B        | 7.3GB | `ollama pull wizard-vicuna`     |

> Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.

## Examples

### Run a model

```
ollama run llama2
>>> hi
Hello! How can I help you today?
```

For multiline input, you can wrap text with `"""`:

```
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
```

### Create a custom model

Pull a base model:

```
ollama pull llama2
```

> To update a model to the latest version, run `ollama pull llama2` again. The model will be updated if a newer version is available.

Create a `Modelfile`:

```
FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```

For more examples, see the [examples](./examples) directory. For more information on creating a Modelfile, see the [Modelfile](./docs/modelfile.md) documentation.

### Pull a model from the registry

```
ollama pull orca-mini
```

### Listing local models

```
ollama list
```

## Model packages

### Overview

Ollama bundles model weights, configuration, and data into a single package, defined by a [Modelfile](./docs/modelfile.md).

<picture>
  <source media="(prefers-color-scheme: dark)" height="480" srcset="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
  <img alt="logo" height="480" src="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
</picture>

## Building

Install `cmake` and `go`:

```
brew install cmake go
```

Then generate dependencies and build:

```
go generate ./...
go build .
```

Next, start the server:

```
./ollama serve
```

Finally, run a model (in another shell):

```
./ollama run llama2
```

## REST API

> See the [API documentation](./docs/api.md) for all endpoints.

Ollama has an API for running and managing models. For example, to generate text from a model:

```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
```
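
To call the same endpoint from code, here is a minimal Python sketch of the request above. It assumes the server is listening on the default port and that `/api/generate` streams newline-delimited JSON objects with a `response` field (see the API documentation); adjust the parsing if your build returns a different format.

```
# Minimal sketch (assumption: /api/generate streams newline-delimited JSON
# objects, each with a "response" fragment, until a final object with "done").
import json

import requests

with requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Why is the sky blue?"},
    stream=True,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        # Print each generated fragment as it arrives.
        print(chunk.get("response", ""), end="", flush=True)
```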

## Tools using Ollama

- [LangChain](https://python.langchain.com/docs/integrations/llms/ollama) and [LangChain.js](https://js.langchain.com/docs/modules/model_io/models/llms/integrations/ollama) with a question-answering [example](https://js.langchain.com/docs/use_cases/question_answering/local_retrieval_qa) (see the Python sketch after this list).
- [Continue](https://github.com/continuedev/continue) - embeds Ollama inside Visual Studio Code. The extension lets you highlight code to add to the prompt, ask questions in the sidebar, and generate code inline.
- [LiteLLM](https://github.com/BerriAI/litellm) - a lightweight Python package to simplify LLM API calls.
- [Discord AI Bot](https://github.com/mekb-turtle/discord-ai-bot) - interact with Ollama as a chatbot on Discord.
- [Raycast Ollama](https://github.com/MassimilianoPasquini97/raycast_ollama) - Raycast extension to use Ollama for local LLM inference.
- [Simple HTML UI for Ollama](https://github.com/rtcfirefly/ollama-ui)
- [Emacs client](https://github.com/zweifisch/ollama) for Ollama
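
As referenced in the LangChain entry above, here is a minimal Python sketch of querying a local Ollama server through LangChain. The `Ollama` import path and constructor arguments follow the linked LangChain docs at the time of writing and may differ in newer releases; a running `ollama serve` on the default port is assumed.

```
# Minimal sketch: query a local Ollama server via LangChain's Ollama LLM class.
# Assumes `pip install langchain` and a running `ollama serve` on port 11434.
from langchain.llms import Ollama

llm = Ollama(base_url="http://localhost:11434", model="llama2")
print(llm("Why is the sky blue?"))
```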