# <div align="center"><strong>AutoAWQ</strong></div>
## Introduction
AutoAWQ is a third-party package for 4-bit quantization. Compared with FP16, AutoAWQ can speed up models by up to 3x while reducing memory requirements by 3x. AutoAWQ implements the Activation-aware Weight Quantization (AWQ) algorithm for quantizing LLMs, and was created as an improvement on MIT's [original work](https://github.com/mit-han-lab/llm-awq).
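As a sketch of how a model is typically quantized with AutoAWQ (the model path and output directory below are illustrative, and actually running the quantization requires `autoawq`, `transformers`, and a supported accelerator):

```python
# Typical 4-bit AWQ settings: zero-point quantization, weight groups of 128.
QUANT_CONFIG = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

def quantize_model(model_path: str, out_dir: str) -> None:
    """Quantize a Hugging Face model with AWQ and save the 4-bit weights."""
    # Imported lazily: requires the `autoawq` and `transformers` packages.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # Calibrate, quantize the weights to 4 bits, then persist the result.
    model.quantize(tokenizer, quant_config=QUANT_CONFIG)
    model.save_quantized(out_dir)
    tokenizer.save_pretrained(out_dir)
```

For example, `quantize_model("facebook/opt-125m", "opt-125m-awq")` would write a 4-bit copy of OPT-125m to `opt-125m-awq/`.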
## Installation
### Installing from Source
#### Build Environment Setup
Pull the image from the SourceFind registry and start a Docker container:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310

# <Image ID>: replace with the ID of the image pulled above
# <Host Path>: path on the host machine
# <Container Path>: mount path inside the container
docker run -it --name mydocker --shm-size=1024G -v /opt/hyhal:/opt/hyhal:ro --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v <Host Path>:<Container Path> <Image ID> /bin/bash
```

Notes:
1. The `-v /opt/hyhal:/opt/hyhal` mount in the `docker run` command above must not be omitted.
2. If `pip install` downloads are slow, add a mirror: `-i https://pypi.tuna.tsinghua.edu.cn/simple/`

#### Building and Installing from Source
- Download the code: check out the branch that matches your needs.
- Two build methods are available (run them from the AutoAWQ directory):
```
# Install base dependencies:
pip install -r requirements.txt

# 1. Install directly from source
pip3 install -e .

# 2. Build a wheel package and install it
pip3 install wheel
python3 setup.py bdist_wheel
cd dist && pip3 install autoawq*
```
## Supported Models
| Models   |            Sizes            |
| :------: | :-------------------------: |
| LLaMA-2  | 7B/13B/70B                  |
| LLaMA    | 7B/13B/30B/65B              |
| Mistral  | 7B                          |
| Vicuna   | 7B/13B                      |
| MPT      | 7B/30B                      |
| Falcon   | 7B/40B                      |
| OPT      | 125m/1.3B/2.7B/6.7B/13B/30B |
| Bloom    | 560m/3B/7B                  |
| Aquila   | 7B                          |
| Aquila2  | 7B/34B                      |
| Yi       | 6B/34B                      |
| Qwen     | 1.8B/7B/14B/72B             |
| BigCode  | 1B/7B/15B                   |
| GPT NeoX | 20B                         |
| GPT-J    | 6B                          |
| LLaVa    | 7B/13B                      |
| Mixtral  | 8x7B                        |
| Baichuan | 7B/13B                      |
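
Any of the models above, once quantized, can be loaded back for inference from its quantized directory. A minimal sketch (the directory name, prompt, and `fuse_layers` choice are illustrative, and running it requires `autoawq`, `transformers`, and a supported accelerator):

```python
def generate(quant_dir: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Load an AWQ-quantized model and generate a completion."""
    # Imported lazily: requires the `autoawq` and `transformers` packages.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    # fuse_layers=True enables fused kernels for faster decoding.
    model = AutoAWQForCausalLM.from_quantized(quant_dir, fuse_layers=True)
    tokenizer = AutoTokenizer.from_pretrained(quant_dir, trust_remote_code=True)

    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

For example, `generate("opt-125m-awq", "Hello, my name is")` would decode a short continuation using the 4-bit weights saved earlier.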