# <div align="center"><strong>vLLM</strong></div>
## Introduction
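vLLM is a fast and easy-to-use library for LLM inference and serving. It uses PagedAttention to manage KV cache memory efficiently, applies continuous batching to incoming requests, and supports many Hugging Face models, such as LLaMA & LLaMA-2, Qwen, and ChatGLM2 & ChatGLM3.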

## Installation
vLLM supports the following Python versions:
+ Python 3.8.
+ Python 3.9.
+ Python 3.10.
+ Python 3.11.
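
Before building, it can help to confirm that the active interpreter is one of the versions listed above. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical and not part of vLLM, and the sketch assumes the interpreter is available as `python3`.

```shell
# Hypothetical helper: succeeds only for the Python versions listed above.
is_supported_python() {
  case "$1" in
    3.8|3.9|3.10|3.11) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the active interpreter, if python3 is on the PATH.
if command -v python3 >/dev/null 2>&1; then
  py_version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  if is_supported_python "$py_version"; then
    echo "Python $py_version is in the supported list"
  else
    echo "Python $py_version is not in the supported list"
  fi
fi
```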

### Installing by building from source

#### Preparing the build environment
There are two ways to prepare the environment:

1. Use a 光源 PyTorch base image. Images can be downloaded from [https://sourcefind.cn/#/image/dcu/pytorch](https://sourcefind.cn/#/image/dcu/pytorch); choose the image version that matches your PyTorch, Python, DTK, and operating system.

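2. Use an existing Python environment and install PyTorch into it. PyTorch whl packages are available at [https://cancon.hpccube.com:65024/4/main/pytorch/dtk23.10](https://cancon.hpccube.com:65024/4/main/pytorch/dtk23.10); download the whl package that matches your Python and DTK versions, then install it as follows:
```shell
pip install torch*  # the downloaded torch whl package
pip install setuptools wheel
```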

#### Building and installing from source
```shell
git clone https://developer.hpccube.com/codes/aicomponent/vllm # switch to the branch you need
```

- There are two ways to build from source (run these inside the vllm directory):
```shell
# 1. Build a whl package and install it
python setup.py bdist_wheel
cd dist
pip install vllm*

# 2. Build and install directly from source
python3 setup.py install
```

#### Notes
+ If `pip install` downloads are too slow, add a mirror index: `-i https://pypi.tuna.tsinghua.edu.cn/simple/`
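
As an illustration of the mirror option, the sketch below wraps `pip install` so that every call goes through the Tsinghua index mentioned above. The function name `pip_install_mirror` is a hypothetical convenience and not part of pip or vLLM; the example only defines the function and does not install anything by itself.

```shell
# Hypothetical wrapper: run pip install against the mirror index.
pip_install_mirror() {
  pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ "$@"
}

# Example usage (uncomment to run):
# pip_install_mirror setuptools wheel
```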

## Verification
- Run `python -c "import vllm; print(vllm.__version__)"` to query the installed version. The version number tracks the official vLLM release, for example 0.3.3.
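
The check above can also be wrapped so that a failed import produces a clear message instead of a Python traceback. This is a minimal sketch: the function name `check_vllm` is hypothetical, and it assumes the interpreter that vLLM was installed into is available as `python`.

```shell
# Hypothetical helper: print the installed vLLM version, or report a failed import.
check_vllm() {
  if version="$(python -c 'import vllm; print(vllm.__version__)' 2>/dev/null)"; then
    echo "vLLM $version is installed"
  else
    echo "vLLM could not be imported; check the build and install steps" >&2
    return 1
  fi
}

# Example usage (uncomment to run):
# check_vllm
```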

## Known Issue
-

## References
- [README_ORIGIN](README_ORIGIN.md)
- [https://github.com/vllm-project/vllm](https://github.com/vllm-project/vllm)