# <div align="center"><strong>Ollama</strong></div>

## Introduction

Ollama makes it quick to deploy mainstream models.

## Installation

### 1. Build and install from source

#### Environment setup

##### Docker

```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.4.1-ubuntu22.04-dtk25.04-py3.10-fixpy
# Replace {project_dir} with the absolute path to the project, and {container_name} / {image_id} with your own values
docker run -i -t -d --device=/dev/kfd --privileged --network=host --device=/dev/dri --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v {project_dir}:/home -v /opt/hyhal:/opt/hyhal:ro --group-add video --shm-size 16G --name {container_name} {image_id}
```
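Once the container is running, it can help to confirm that the device nodes passed through by the `--device` flags are actually visible inside it. The following is a minimal sketch, not part of the original project; `check_devices` is a hypothetical helper name introduced here for illustration.

```bash
# Hypothetical helper: report which of the given device nodes are missing.
# Returns 0 if every path exists, 1 otherwise.
check_devices() {
    missing=0
    for dev in "$@"; do
        if [ ! -e "$dev" ]; then
            echo "missing: $dev" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Inside the container, check the two nodes named in the docker run command.
check_devices /dev/kfd /dev/dri || echo "GPU device nodes are not visible here"
```

If either node is reported missing, the `docker run` flags are the first thing to re-check.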

1. Download the source code

```bash
git clone -b 0.6.3 http://developer.sourcefind.cn/codes/OpenDAS/ollama.git --depth=1
cd ollama
```

#### Build

##### Install Go

```bash
wget https://golang.google.cn/dl/go1.24.1.linux-amd64.tar.gz
tar -C /usr/local -xzf go1.24.1.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
# Optional: switch the Go module proxy to speed up downloads
go env -w GOPROXY=https://goproxy.cn,direct
```
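The `export PATH=...` line above only affects the current shell, and re-running it appends the same directory again. As a sketch of one way to keep the setting idempotent (for example when placing it in a shell startup file), the hypothetical helper `path_append` below adds a directory only if it is not already on `PATH`. The helper name is an assumption introduced for illustration.

```bash
# Hypothetical helper: append a directory to PATH only if it is not already present.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;     # not present yet, append it
    esac
}

path_append /usr/local/go/bin
export PATH

# If the toolchain has been unpacked, this prints the installed Go version.
if command -v go >/dev/null 2>&1; then
    go version
fi
```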

##### Run the build

```bash
export LIBRARY_PATH=/opt/dtk/lib:$LIBRARY_PATH
cmake -B build
cmake --build build
go build .
```

## Running

```bash
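# Set to your device's gfx version, e.g. Z100L (gfx906) -> 9.0.6; K100 (gfx926) -> 9.2.6; K100AI (gfx928) -> 9.2.8
export HSA_OVERRIDE_GFX_VERSION={gfx_version}
# List all device IDs (0,1,2,3,4,5,6,...) or only the ones you want to use
export ROCR_VISIBLE_DEVICES={device_ids}
# Start the server on the selected devices
go run . serve
# New: alternatively, start the server with flash attention and KV cache quantization enabled
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q4_0 go run . serve
# In a second terminal, run a model
go run . run llama3.1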
```
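The three examples above follow the same pattern: the digits after `gfx` become the dotted version. A minimal sketch of that conversion is shown below. `gfx_to_version` is a hypothetical helper, and it assumes the target ends in exactly three decimal digits, as in the examples listed in this README; anything else is rejected rather than guessed.

```bash
# Hypothetical helper: turn a gfx target such as "gfx906" into "9.0.6".
# Assumption: the part after "gfx" is exactly three decimal digits.
gfx_to_version() {
    suffix="${1#gfx}"
    case "$suffix" in
        [0-9][0-9][0-9]) ;;
        *) echo "unsupported gfx target: $1" >&2; return 1 ;;
    esac
    # Insert a dot between each of the three digits.
    echo "$suffix" | sed 's/\(.\)\(.\)\(.\)/\1.\2.\3/'
}

# Example: K100AI is listed above as gfx928.
HSA_OVERRIDE_GFX_VERSION="$(gfx_to_version gfx928)"
export HSA_OVERRIDE_GFX_VERSION
echo "HSA_OVERRIDE_GFX_VERSION=$HSA_OVERRIDE_GFX_VERSION"
```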

## Inference with the deepseek-r1 model

```
# Set to your device's gfx version, e.g. Z100L (gfx906) -> 9.0.6; K100 (gfx926) -> 9.2.6; K100AI (gfx928) -> 9.2.8
export HSA_OVERRIDE_GFX_VERSION={gfx_version}
go run . serve
# In a second terminal, run the model
go run . run deepseek-r1:671b
```

For more usage options, see the [original project](https://github.com/ollama/ollama).

Note: before each run, check that the environment variable `HSA_OVERRIDE_GFX_VERSION` is set correctly.
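One way to make this check routine is a small pre-flight function run before starting the server. The sketch below is an illustration only: `check_gfx_override` is a hypothetical name, and it verifies just that the variable is set and has the dotted `major.minor.patch` shape. It cannot tell whether the value matches the installed device.

```bash
# Hypothetical pre-flight check: succeed only if HSA_OVERRIDE_GFX_VERSION
# is set and looks like three dot-separated numbers (e.g. 9.2.8).
check_gfx_override() {
    if [ -z "${HSA_OVERRIDE_GFX_VERSION:-}" ]; then
        echo "HSA_OVERRIDE_GFX_VERSION is not set" >&2
        return 1
    fi
    if ! printf '%s\n' "$HSA_OVERRIDE_GFX_VERSION" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
        echo "HSA_OVERRIDE_GFX_VERSION has an unexpected value: $HSA_OVERRIDE_GFX_VERSION" >&2
        return 1
    fi
}

# Typical use (left commented out so the snippet has no side effects):
# check_gfx_override && go run . serve
```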

## References

* https://github.com/ollama/ollama