<div align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" height="200px" srcset="https://github.com/jmorganca/ollama/assets/3325447/56ea1849-1284-4645-8970-956de6e51c3c">
    <img alt="logo" height="200px" src="https://github.com/jmorganca/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7">
  </picture>
</div>

# Ollama

[![Discord](https://dcbadge.vercel.app/api/server/ollama?style=flat&compact=true)](https://discord.gg/ollama)

> [!NOTE]
> Ollama is in early preview. Please report any issues you find.

Run, create, and share large language models (LLMs).

## Download

- [Download](https://ollama.ai/download) for macOS
- Download for Windows and Linux (coming soon)
- Build [from source](#building)

## Quickstart

To run and chat with [Llama 2](https://ai.meta.com/llama), the new model by Meta:

```
ollama run llama2
```

## Model library

`ollama` includes a library of open-source models:

| Model                    | Parameters | Size  | Download                        |
| ------------------------ | ---------- | ----- | ------------------------------- |
| Llama2                   | 7B         | 3.8GB | `ollama pull llama2`            |
| Llama2 13B               | 13B        | 7.3GB | `ollama pull llama2:13b`        |
| Llama2 70B               | 70B        | 39GB  | `ollama pull llama2:70b`        |
| Llama2 Uncensored        | 7B         | 3.8GB | `ollama pull llama2-uncensored` |
| Orca Mini                | 3B         | 1.9GB | `ollama pull orca`              |
| Vicuna                   | 7B         | 3.8GB | `ollama pull vicuna`            |
| Nous-Hermes              | 13B        | 7.3GB | `ollama pull nous-hermes`       |
| Wizard Vicuna Uncensored | 13B        | 7.3GB | `ollama pull wizard-vicuna`     |

> Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
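
The download sizes in the table roughly track parameter count. As a rule of thumb (our assumption, not an official figure), these quantized models weigh in at around 4 to 5 bits per parameter, which is a quick way to sanity-check disk and memory needs:

```python
def approx_size_gb(params_billion: float, bits_per_param: float = 4.5) -> float:
    """Rough download size for a quantized model (rule of thumb, not exact)."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

print(round(approx_size_gb(7), 1))   # close to the 3.8GB listed for 7B models
print(round(approx_size_gb(13), 1))  # close to the 7.3GB listed for 13B models
```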

## Examples

### Run a model

```
ollama run llama2
>>> hi
Hello! How can I help you today?
```

### Create a custom model

Pull a base model:

```
ollama pull llama2
```

> To update a model to the latest version, run `ollama pull llama2` again. The model will be updated (if necessary).

Create a `Modelfile`:

```
FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```

For more examples, see the [examples](./examples) directory.

For more information on creating a Modelfile, see the [Modelfile](./docs/modelfile.md) documentation.

### Pull a model from the registry

```
ollama pull orca
```

### Listing local models

```
ollama list
```

## Model packages

### Overview

Ollama bundles model weights, configuration, and data into a single package, defined by a [Modelfile](./docs/modelfile.md).

<picture>
  <source media="(prefers-color-scheme: dark)" height="480" srcset="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
  <img alt="logo" height="480" src="https://github.com/jmorganca/ollama/assets/251292/2fd96b5f-191b-45c1-9668-941cfad4eb70">
</picture>

## Building

```
go build .
```

To run it, start the server:

```
./ollama serve &
```

Finally, run a model!

```
./ollama run llama2
```

## REST API

> See the [API documentation](./docs/api.md) for all endpoints.

Ollama has an API for running and managing models. For example, to generate text from a model:

```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt":"Why is the sky blue?"
}'
```
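
The generate endpoint streams its reply as newline-delimited JSON objects, each carrying a fragment of the response (see the API documentation linked above for the exact fields). A minimal sketch of collecting such a stream into plain text; the sample lines below are illustrative stand-ins for real server output:

```python
import json

# Illustrative stand-in for the newline-delimited JSON that
# /api/generate streams back (token values here are made up).
stream = b'''{"model":"llama2","response":"The","done":false}
{"model":"llama2","response":" sky","done":false}
{"model":"llama2","response":" is blue.","done":true}'''

def collect(raw: bytes) -> str:
    """Concatenate the "response" fragments from a streamed reply."""
    parts = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

print(collect(stream))  # The sky is blue.
```

The same loop applies when reading the HTTP response body line by line with any client library.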

## Tools using Ollama

- [LangChain](https://js.langchain.com/docs/use_cases/question_answering/local_retrieval_qa) integration - set up all-local, JS-based retrieval + QA over docs in 5 minutes.
- [Continue](https://github.com/continuedev/continue) - embeds Ollama inside Visual Studio Code. The extension lets you highlight code to add to the prompt, ask questions in the sidebar, and generate code inline.
- [Discord AI Bot](https://github.com/mekb-turtle/discord-ai-bot) - interact with Ollama as a chatbot on Discord.
- [Raycast Ollama](https://github.com/MassimilianoPasquini97/raycast_ollama) - Raycast extension to use Ollama for local llama inference on Raycast.
- [Simple HTML UI for Ollama](https://github.com/rtcfirefly/ollama-ui)