<div align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" height="200px" srcset="https://github.com/jmorganca/ollama/assets/3325447/56ea1849-1284-4645-8970-956de6e51c3c">
    <img alt="logo" height="200px" src="https://github.com/jmorganca/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7">
  </picture>
</div>

# Ollama

[![Discord](https://dcbadge.vercel.app/api/server/ollama?style=flat&compact=true)](https://discord.gg/ollama)

Run, create, and share large language models (LLMs).

> Note: Ollama is in early preview. Please report any issues you find.

## Download

- [Download](https://ollama.ai/download) for macOS
- Download for Windows and Linux (coming soon)
- Build [from source](#building)

## Quickstart

To run and chat with [Llama 2](https://ai.meta.com/llama), the new model by Meta:

```
ollama run llama2
```

## Model library

`ollama` includes a library of open-source models:

| Model                    | Parameters | Size  | Download                        |
| ------------------------ | ---------- | ----- | ------------------------------- |
| Llama2                   | 7B         | 3.8GB | `ollama pull llama2`            |
| Llama2 13B               | 13B        | 7.3GB | `ollama pull llama2:13b`        |
| Llama2 70B               | 70B        | 39GB  | `ollama pull llama2:70b`        |
| Llama2 Uncensored        | 7B         | 3.8GB | `ollama pull llama2-uncensored` |
| Orca Mini                | 3B         | 1.9GB | `ollama pull orca`              |
| Vicuna                   | 7B         | 3.8GB | `ollama pull vicuna`            |
| Nous-Hermes              | 13B        | 7.3GB | `ollama pull nous-hermes`       |
| Wizard Vicuna Uncensored | 13B        | 7.3GB | `ollama pull wizard-vicuna`     |

> Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
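
The RAM guidance above can be turned into a quick check before pulling a model. The following is a minimal sketch: `min_ram_gb` and `has_enough_ram` are hypothetical helpers, not part of the `ollama` CLI, and they only encode the three sizes the note mentions.

```shell
#!/bin/sh
# Hypothetical helper (not an ollama command): map a parameter count to the
# minimum RAM, in GB, suggested by the note above.
min_ram_gb() {
  case "$1" in
    3B)  echo 8 ;;
    7B)  echo 16 ;;
    13B) echo 32 ;;
    *)   echo "no guidance for $1" >&2; return 1 ;;
  esac
}

# Succeeds when the available RAM meets the suggestion.
# $1 = parameter count (e.g. 7B), $2 = available RAM in GB
has_enough_ram() {
  needed=$(min_ram_gb "$1") || return 2
  [ "$2" -ge "$needed" ]
}

if has_enough_ram 7B 16; then
  echo "16 GB meets the suggestion for 7B models"
fi
```

Sizes the note does not cover, such as 70B, return an error instead of a guess.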

## Examples

### Run a model

```
ollama run llama2
>>> hi
Hello! How can I help you today?
```

For multiline input, wrap the text in `"""`:

```
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
```

### Create a custom model

Pull a base model:

```
ollama pull llama2
```

> To update a model to the latest version, run `ollama pull llama2` again. The model will be updated (if necessary).

Create a `Modelfile`:

```
FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```

For more examples, see the [examples](./examples) directory.

For more information on creating a Modelfile, see the [Modelfile](./docs/modelfile.md) documentation.
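
To try several personas, it can help to generate the Modelfile from a script instead of editing it by hand. The sketch below uses a hypothetical `write_modelfile` helper (not an `ollama` command) and only the three instructions that appear in this README's example: `FROM`, `PARAMETER temperature`, and `SYSTEM`. It assumes the system prompt does not itself contain `"""`.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal Modelfile.
# $1 = output path, $2 = base model, $3 = temperature, $4 = system prompt
write_modelfile() {
  cat > "$1" <<EOF
FROM $2
PARAMETER temperature $3
SYSTEM """
$4
"""
EOF
}

# Recreate the Mario example in a temporary file and print it.
modelfile=$(mktemp)
write_modelfile "$modelfile" llama2 1 \
  "You are Mario from Super Mario Bros. Answer as Mario, the assistant, only."
cat "$modelfile"

# With Ollama installed, the commands shown earlier apply unchanged:
#   ollama create mario -f "$modelfile"
#   ollama run mario
```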

### Pull a model from the registry

```
ollama pull orca
```

### List local models

```
ollama list
```

## Model packages

### Overview

Ollama bundles model weights, configuration, and data into a single package, defined by a [Modelfile](./docs/modelfile.md).

<picture>
  <source media="(prefers-color-scheme: dark)" height="480" srcset="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
  <img alt="logo" height="480" src="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
</picture>

## Building

```
go build .
```

To run it, first start the server:

```
./ollama serve &
```

Finally, run a model!

```
./ollama run llama2
```
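
Because the server is started in the background, a script that runs a model right away may race with server startup. One possible way to handle this is a small retry loop. `wait_until` and `server_is_up` are hypothetical helpers, and the probe assumes the server answers HTTP on `localhost:11434`, the address used in the REST API example in this README.

```shell
#!/bin/sh
# Hypothetical helper: retry a probe command until it succeeds.
# $1 = max attempts, $2 = seconds between attempts, rest = probe command
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Assumed probe: any HTTP response on the port counts as "up".
server_is_up() {
  curl -s -o /dev/null http://localhost:11434/
}

# Usage, after building as described above:
#   ./ollama serve &
#   wait_until 30 1 server_is_up && ./ollama run llama2
```

Keeping the probe as a separate argument makes it easy to swap in a different readiness check.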

## REST API

> See the [API documentation](./docs/api.md) for all endpoints.

Ollama has an API for running and managing models. For example, to generate text from a model:

```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt":"Why is the sky blue?"
}'
```
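
When the prompt comes from a variable, the JSON body has to be built with some care about quoting. The following is a rough sketch with hypothetical helper names (`json_escape`, `generate_payload`, `generate`). It escapes only backslashes and double quotes, so prompts containing newlines or other control characters would need a real JSON encoder.

```shell
#!/bin/sh
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
# Newlines, tabs, and other control characters are NOT handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body for /api/generate.
# $1 = model name, $2 = prompt
generate_payload() {
  printf '{"model":"%s","prompt":"%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the request. This needs a running server, so it is only defined here.
generate() {
  curl -s -X POST http://localhost:11434/api/generate \
    -d "$(generate_payload "$1" "$2")"
}

# Print the body that would be sent for the example above.
generate_payload llama2 "Why is the sky blue?"
echo
```

With the server running, `generate llama2 "Why is the sky blue?"` sends the same request as the `curl` command above.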

## Tools using Ollama

- [LangChain](https://python.langchain.com/docs/integrations/llms/ollama) and [LangChain.js](https://js.langchain.com/docs/modules/model_io/models/llms/integrations/ollama) with a question-answering [example](https://js.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).
- [Continue](https://github.com/continuedev/continue) - embeds Ollama inside Visual Studio Code. The extension lets you highlight code to add to the prompt, ask questions in the sidebar, and generate code inline.
- [Discord AI Bot](https://github.com/mekb-turtle/discord-ai-bot) - interact with Ollama as a chatbot on Discord.
- [Raycast Ollama](https://github.com/MassimilianoPasquini97/raycast_ollama) - a Raycast extension for local Llama inference with Ollama.
- [Simple HTML UI for Ollama](https://github.com/rtcfirefly/ollama-ui)