# Development

Install required tools:

- cmake version 3.24 or higher
- go version 1.22 or higher
- gcc version 11.4.0 or higher

```bash
brew install go cmake gcc
```

Optionally enable debugging and more verbose logging:

```bash
# At build time
export CGO_CFLAGS="-g"

# At runtime
export OLLAMA_DEBUG=1
```

Get the required libraries and build the native LLM code:

```bash
go run build.go
```

Now you can run `ollama`:

```bash
./ollama
```

### Rebuilding the native code

If at any point you need to rebuild the native code, you can run the
`build.go` script again using the `-f` flag to force a rebuild, and,
optionally, the `-d` flag to skip building the Go binary:

```bash
go run build.go -f -d
```

### Linux

#### Linux CUDA (NVIDIA)

_Your operating system distribution may already have packages for NVIDIA CUDA. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!_

Install `cmake` and `golang` as well as [NVIDIA CUDA](https://developer.nvidia.com/cuda-downloads)
development and runtime packages.

Typically the build scripts will auto-detect CUDA. However, if your Linux distro
or installation approach uses unusual paths, you can specify the location by
setting the environment variable `CUDA_LIB_DIR` to the location of the shared
libraries, and `CUDACXX` to the location of the nvcc compiler. You can customize
the set of target CUDA architectures by setting `CMAKE_CUDA_ARCHITECTURES` (e.g. "50;60;70")
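As a sketch, the overrides might look like the following. The variable names are the ones above; the paths are illustrative, assuming a default toolkit layout under `/usr/local/cuda`, and should be adjusted to match your install:

```shell
# Illustrative paths -- adjust to match your CUDA installation
export CUDA_LIB_DIR=/usr/local/cuda/lib64
export CUDACXX=/usr/local/cuda/bin/nvcc
# Optionally limit the target architectures to shorten build time
export CMAKE_CUDA_ARCHITECTURES="50;60;70"
```

With these exported, the build command below picks up the custom locations.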

Then build the binary:

```bash
go run build.go
```

#### Linux ROCm (AMD)

_Your operating system distribution may already have packages for AMD ROCm and CLBlast. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!_

Install [CLBlast](https://github.com/CNugteren/CLBlast/blob/master/doc/installation.md) and [ROCm](https://rocm.docs.amd.com/en/latest/) development packages first, as well as `cmake` and `golang`.

Typically the build scripts will auto-detect ROCm. However, if your Linux distro
or installation approach uses unusual paths, you can specify the location by
setting the environment variable `ROCM_PATH` to the location of the ROCm
install (typically `/opt/rocm`), and `CLBlast_DIR` to the location of the
CLBlast install (typically `/usr/lib/cmake/CLBlast`). You can also customize
the AMD GPU targets by setting `AMDGPU_TARGETS` (e.g. `AMDGPU_TARGETS="gfx1101;gfx1102"`)
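As a sketch, using the typical locations mentioned above (adjust the paths to your system):

```shell
# Illustrative paths -- adjust to match your ROCm/CLBlast installation
export ROCM_PATH=/opt/rocm
export CLBlast_DIR=/usr/lib/cmake/CLBlast
# Optionally limit the AMD GPU targets
export AMDGPU_TARGETS="gfx1101;gfx1102"
```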

Then build the binary:

```bash
go run build.go
```

ROCm requires elevated privileges to access the GPU at runtime. On most distros you can add your user account to the `render` group, or run as root.
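On distros that use the `render` group, adding your user typically looks like the following sketch (group names vary by distro, and you must log out and back in for the change to take effect):

```shell
# Add the current user to the render group to allow GPU access
sudo usermod -aG render $USER
```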

#### Advanced CPU Settings

By default, running `go run build.go` will compile a few different variations
of the LLM library based on common CPU families and vector math capabilities,
including a lowest-common-denominator build that should run, somewhat slowly, on
almost any 64-bit CPU. At runtime, Ollama will auto-detect the optimal variation
to load. If you would like a CPU-based build customized for your processor, you
can set `OLLAMA_CUSTOM_CPU_DEFS` to the llama.cpp flags you would like to use.
For example, to compile an optimized binary for an Intel i9-9880H, you might use:

```bash
OLLAMA_CUSTOM_CPU_DEFS="-DLLAMA_AVX=on -DLLAMA_AVX2=on -DLLAMA_F16C=on -DLLAMA_FMA=on" go run build.go
```

#### Containerized Linux Build

If you have Docker available, you can build Linux binaries with `./scripts/build_linux.sh`, which has the CUDA and ROCm dependencies included. The resulting binary is placed in `./dist`.

### Windows

Note: The Windows build for Ollama is still under development.

Install required tools:

- MSVC toolchain - C/C++ and cmake as minimal requirements
- Go version 1.22 or higher
- MinGW (pick one variant) with GCC.
  - [MinGW-w64](https://www.mingw-w64.org/)
  - [MSYS2](https://www.msys2.org/)

Then build the binary:

```powershell
$env:CGO_ENABLED="1"
go run build.go
```

#### Windows CUDA (NVIDIA)

In addition to the common Windows development tools described above, install CUDA after installing MSVC.

- [NVIDIA CUDA](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html)

#### Windows ROCm (AMD Radeon)

In addition to the common Windows development tools described above, install AMD's HIP package after installing MSVC.

- [AMD HIP](https://www.amd.com/en/developer/resources/rocm-hub/hip-sdk.html)
- [Strawberry Perl](https://strawberryperl.com/)

Lastly, add `ninja.exe` included with MSVC to the system path (e.g. `C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja`).
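For the current PowerShell session, that might look like the following sketch (the path is the example above; it varies with your Visual Studio version and edition):

```powershell
# Example path from a VS 2019 Community install -- adjust for your setup
$env:PATH += ";C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja"
```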