# <div align="center"><strong>Ollama</strong></div>

## Introduction

Ollama is a frontend inference framework for large language models that uses llama.cpp as its backend; it lets you deploy mainstream models quickly.

## Installation

Supported components:
+ Python 3.10
+ CMake 3.29
+ gcc 7.3.1
+ Go

### 1. Install via Dockerfile (recommended)

This repository is intended only as a reference for code changes; do not use it as-is. The official release 0.1.43 has been tested and works.
A tested `Dockerfile` (for Z100) is also provided in `ollama_build.zip`. To target another platform, modify the `Dockerfile` as described below, chiefly `AMDGPU_TARGETS=<your device model>` (e.g. gfx906, gfx928) and `HSA_OVERRIDE_GFX_VERSION=<matching version>` (gfx906 corresponds to 9.0.6; gfx928 to 9.2.8).
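The target-to-version mapping above can be sketched as a small helper. This is only a sketch assuming the digit-splitting rule implied by the two examples given; targets with hex suffixes such as gfx90a may need special handling.

```python
def hsa_override_for(target: str) -> str:
    """Derive an HSA_OVERRIDE_GFX_VERSION value from an AMDGPU target name.

    Assumes the pattern shown above (gfx906 -> 9.0.6, gfx928 -> 9.2.8):
    the characters after "gfx" become dot-separated fields.
    """
    suffix = target[3:] if target.startswith("gfx") else target
    return ".".join(suffix)
```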

Download `ollama_build.zip` from this repository, modify the Dockerfile inside it as described above, and then run the relevant `docker build` command (consult the Docker documentation for the exact invocation).

If you encounter an error in GPU-count detection, see https://developer.hpccube.com/codes/OpenDAS/ollama/-/issues/1 ; you can also apply that fix in advance.

#### Supported Dockerfiles

|Version|Archive|Tested model|
|:---:|:---:|:---:|
|0.1.43|ollama_build.zip|qwen2:7b-instruct-fp16|
|0.3.5 (recommended)|ollama_035.zip|llama3.1:8b-instruct-q8_0|

### 2. Install by building from source (<= 0.3.5)

#### Environment setup

##### Docker

    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310
    
    docker run --shm-size 30g --network=host --name=ollama --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to the project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

1. Download the source

    git clone https://github.com/ollama/ollama.git  
    cd ollama
    git submodule init
    git submodule update

Note: the commands above fetch the latest version; other releases are listed on the [official releases page](https://github.com/ollama/ollama/releases).

2. Modify the source

- Skip the git repository check and the linked-library check; this can be changed in `llm/generate/gen_linux.sh`
- In the `.cu` files under `llm/llama.cpp`, add `__launch_bounds__(1024)` to the relevant kernel functions
- Comment out everything related to downloading `torch` in `llm/llama.cpp/requirements`
- Change the `DriverVersionFile` and `ROCmLibGlobs` values in `gpu/amd_linux.go`; see the corresponding files in this repository for details
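The "comment out torch" step in the list above amounts to disabling requirement lines; it can be sketched as a small helper (hypothetical function name, and the exact requirement specifiers vary between ollama versions):

```python
def comment_out_torch(lines):
    """Comment out requirement lines that mention torch.

    A sketch of the modification step above; in practice you edit the
    requirements files under llm/llama.cpp by hand or with a script.
    """
    out = []
    for line in lines:
        if "torch" in line and not line.lstrip().startswith("#"):
            out.append("# " + line)  # disable the torch download
        else:
            out.append(line)
    return out
```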

3. Install dependencies

    cd llm/llama.cpp
    pip install -r requirements.txt

#### Build

##### Environment variables
    export AMDGPU_TARGETS=<your device model, e.g. gfx906, gfx928>
    export HSA_OVERRIDE_GFX_VERSION=<matching version: gfx906 corresponds to 9.0.6, gfx928 to 9.2.8>
    export HIP_PATH=/opt/dtk/hip
    export ROCM_PATH=/opt/dtk
    export CMAKE_PREFIX_PATH=/lib/cmake/amd_comgr/:$CMAKE_PREFIX_PATH
    export CMAKE_PREFIX_PATH=/opt/dtk/lib64/cmake/amd_comgr:$CMAKE_PREFIX_PATH
    export LIBRARY_PATH=/opt/dtk/llvm/lib/clang/15.0.0/lib/linux/:$LIBRARY_PATH
    export HIP_VISIBLE_DEVICES=<device IDs to use, e.g. 0,1,2,3,... for all devices or 0,1 for a subset>

Note: this only works on dtk 24.04 and later; other versions require corresponding changes. For more device models, see the [supported GPU list](https://salsa.debian.org/rocm-team/community/team-project/-/wikis/supported-gpu-list).

Install Go

https://golang.google.cn/dl/

    # replace [go-xxxx.tar.gz] below with the archive you downloaded
    tar -C /usr/local -xzf [go-xxxx.tar.gz]
    export PATH=$PATH:/usr/local/go/bin
    
    # switch the Go module proxy to speed up downloads (optional)
    go env -w GO111MODULE=on
    go env -w GOPROXY=https://goproxy.cn,direct

Install CMake (version 3.29 is known to work)

https://cmake.org/download/

After extracting it, set:

    export PATH=/path/to/cmake/bin:$PATH

Upgrade gcc (if needed; gcc 7.3.1 compiles the project successfully)

    yum install centos-release-scl
    yum install devtoolset-9-gcc*
    scl enable devtoolset-9 bash
    
    # if another gcc version is installed, remove it first
    rpm -q gcc
    rpm -e <output of the previous command>  # all of its dependents must be removed as well

##### Run the build

    cd llm/generate && bash gen_linux.sh
    cd ../.. && go build

## Verification

    ./ollama serve &
    export HIP_VISIBLE_DEVICES=0  # choose an available device; the output of the previous command lists them
    ./ollama run qwen2:7b-instruct-fp16

For more usage options, see the [upstream project](https://github.com/ollama/ollama).
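Beyond the CLI, the server started by `./ollama serve` also exposes a REST API (on localhost:11434 by default). A minimal sketch of calling it, using the model name from the example above:

```python
import json
import urllib.request

# Build a request for Ollama's /api/generate endpoint. Sending it
# requires the server from `./ollama serve` to be running; swap in
# whichever model you actually pulled.
payload = json.dumps({
    "model": "qwen2:7b-instruct-fp16",
    "prompt": "Why is the sky blue?",
    "stream": False,  # return a single JSON object instead of a stream
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is up:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```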

Note: before each run, check that the environment variable `HSA_OVERRIDE_GFX_VERSION` is set correctly.

## References

* https://github.com/ollama/ollama
* https://github.com/ggerganov/llama.cpp