# Development

Install required tools:

- cmake version 3.24 or higher
- go version 1.22 or higher
- gcc version 11.4.0 or higher
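You can verify the versions of the tools on your `PATH` before building, for example:

```bash
cmake --version
go version
gcc --version
```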

### MacOS

```bash
brew install go cmake gcc
```

Optionally enable debugging and more verbose logging:

```bash
# At build time
export CGO_CFLAGS="-g"

# At runtime
export OLLAMA_DEBUG=1
```

Get the required libraries and build the native LLM code:

```bash
go generate ./...
```

Then build ollama:

```bash
go build .
```

Now you can run `ollama`:

```bash
./ollama
```

### Linux

#### Linux CUDA (NVIDIA)

_Your operating system distribution may already have packages for NVIDIA CUDA. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!_

Install `cmake` and `golang` as well as [NVIDIA CUDA](https://developer.nvidia.com/cuda-downloads)
development and runtime packages.

Typically the build scripts will auto-detect CUDA. However, if your Linux distro or installation approach uses unusual paths, you can point the build at the right location by setting the environment variable `CUDA_LIB_DIR` to the directory containing the CUDA shared libraries and `CUDACXX` to the location of the nvcc compiler. You can also customize the set of target CUDA architectures by setting `CMAKE_CUDA_ARCHITECTURES` (e.g. "50;60;70").
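
For example, to point the build at a CUDA install in a non-default location and limit the target architectures, you might export something like the following before generating (the paths below are illustrative, not required values):

```bash
# Adjust these paths to match your CUDA installation
export CUDA_LIB_DIR=/usr/local/cuda/lib64
export CUDACXX=/usr/local/cuda/bin/nvcc
export CMAKE_CUDA_ARCHITECTURES="50;60;70"
```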

Then generate dependencies:

```
go generate ./...
```

Then build the binary:

```
go build .
```

#### Linux ROCm (AMD)

_Your operating system distribution may already have packages for AMD ROCm and CLBlast. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!_

Install [CLBlast](https://github.com/CNugteren/CLBlast/blob/master/doc/installation.md) and [ROCm](https://rocm.docs.amd.com/en/latest/) development packages first, as well as `cmake` and `golang`.

Typically the build scripts will auto-detect ROCm. However, if your Linux distro or installation approach uses unusual paths, you can point the build at the right location by setting the environment variable `ROCM_PATH` to the location of the ROCm install (typically `/opt/rocm`) and `CLBlast_DIR` to the location of the CLBlast install (typically `/usr/lib/cmake/CLBlast`). You can also customize the AMD GPU targets by setting `AMDGPU_TARGETS` (e.g. `AMDGPU_TARGETS="gfx1101;gfx1102"`).
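
For example, to build against a ROCm install in the typical location with a narrowed GPU target list, you might export something like the following before generating (adjust the paths to your installation):

```bash
# Typical locations - adjust to match your ROCm and CLBlast installs
export ROCM_PATH=/opt/rocm
export CLBlast_DIR=/usr/lib/cmake/CLBlast
export AMDGPU_TARGETS="gfx1101;gfx1102"
```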

Then generate dependencies:

```
go generate ./...
```

Then build the binary:

```
go build .
```

ROCm requires elevated privileges to access the GPU at runtime. On most distros you can add your user account to the `render` group, or run as root.
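
On most distros this is a one-time change, for example (you may need to log out and back in for the new group membership to take effect):

```bash
# Add the current user to the render group so Ollama can access the GPU
sudo usermod -a -G render $USER
```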

#### Advanced CPU Settings

By default, running `go generate ./...` will compile a few different variations of the LLM library based on common CPU families and vector math capabilities, including a lowest-common-denominator build which should run on almost any 64 bit CPU, albeit somewhat slowly. At runtime, Ollama will auto-detect the optimal variation to load. If you would like a CPU build customized for your processor, you can set `OLLAMA_CUSTOM_CPU_DEFS` to the llama.cpp flags you would like to use. For example, to compile an optimized binary for an Intel i9-9880H, you might use:

```
OLLAMA_CUSTOM_CPU_DEFS="-DLLAMA_AVX=on -DLLAMA_AVX2=on -DLLAMA_F16C=on -DLLAMA_FMA=on" go generate ./...
go build .
```

#### Containerized Linux Build

If you have Docker available, you can build Linux binaries with `./scripts/build_linux.sh`, which has the CUDA and ROCm dependencies included. The resulting binary is placed in `./dist`.
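
For example, from the repository root:

```bash
./scripts/build_linux.sh
ls ./dist
```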

### Windows

Note: The Windows build for Ollama is still under development.

Install required tools:

- MSVC toolchain - C/C++ and cmake as minimal requirements
- Go version 1.22 or higher
- MinGW (pick one variant) with GCC.
  - [MinGW-w64](https://www.mingw-w64.org/)
  - [MSYS2](https://www.msys2.org/)

Then build ollama:

```powershell
$env:CGO_ENABLED="1"
go generate ./...
go build .
```

#### Windows CUDA (NVIDIA)

In addition to the common Windows development tools described above, install CUDA after installing MSVC.

- [NVIDIA CUDA](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html)

#### Windows ROCm (AMD Radeon)

In addition to the common Windows development tools described above, install AMD's HIP package after installing MSVC.

- [AMD HIP](https://www.amd.com/en/developer/resources/rocm-hub/hip-sdk.html)
- [Strawberry Perl](https://strawberryperl.com/)

Lastly, add `ninja.exe` included with MSVC to the system path (e.g. `C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja`).