# <div align="center"><strong>AutoAWQ</strong></div>
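## Introduction
AutoAWQ is a third-party package for 4-bit quantization. Compared with FP16, AutoAWQ can speed up models by 3x and reduce memory requirements by 3x. AutoAWQ implements the Activation-aware Weight Quantization (AWQ) algorithm for quantizing LLMs. AutoAWQ was created from, and improves upon, the [original work](https://github.com/mit-han-lab/llm-awq) from MIT.
## Installation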
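
To make the 4-bit idea concrete, here is a minimal sketch of plain group-wise round-to-nearest quantization in pure Python. It is not AutoAWQ's implementation: the activation-aware scaling that defines AWQ is left out, and the function names, the group size of 128, and the 16-bit scale / 4-bit zero point layout are assumptions chosen for illustration.

```python
def quantize_group(weights, n_bits=4):
    """Asymmetric round-to-nearest quantization of one group of floats.

    Illustrative only: real AWQ additionally rescales weights using
    activation statistics before this step.
    """
    q_max = 2 ** n_bits - 1                      # 15 for 4-bit codes
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / q_max
    if scale == 0:                               # constant group: avoid dividing by zero
        scale = 1.0
    zero = round(-w_min / scale)                 # integer zero point
    codes = [min(q_max, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate float weights."""
    return [(c - zero) * scale for c in codes]


def bits_per_weight(group_size=128, n_bits=4, scale_bits=16, zero_bits=4):
    """Average storage per weight, counting per-group scale and zero point.

    group_size, scale_bits and zero_bits are assumed values for this sketch.
    """
    return n_bits + (scale_bits + zero_bits) / group_size


if __name__ == "__main__":
    group = [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0]
    codes, scale, zero = quantize_group(group)
    print("codes:", codes)
    print("restored:", dequantize_group(codes, scale, zero))
    print("FP16 / 4-bit weight storage ratio:", 16 / bits_per_weight())
```

The storage ratio printed here covers weights only, so it is an upper bound on the saving; the 3x figure quoted above is the end-to-end number.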
### Install from source
#### Prepare the build environment
Pull the image from the SourceFind (光源) registry and start a Docker container:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310

# <Image ID>: replace with the ID of the Docker image pulled above
# <Host Path>: path on the host
# <Container Path>: path the host path is mapped to inside the container
docker run -it --name mydocker --shm-size=1024G -v /opt/hyhal:/opt/hyhal:ro --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v <Host Path>:<Container Path> <Image ID> /bin/bash                  
```
Notes:
1. When starting the container, the `-v /opt/hyhal:/opt/hyhal` volume mount must not be omitted.
2. If `pip install` downloads are slow, add a mirror: `-i https://pypi.tuna.tsinghua.edu.cn/simple/`
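#### Build and install from source
- Download the code
Download the branch that matches your needs.
- Two ways to build from source are provided (run them from the AutoAWQ directory):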
```
# Install the base dependencies:
pip install -r requirements.txt  

# 1. Build and install from source (editable install)
pip3 install -e .

# 2. Build a wheel (.whl) package, then install it from dist/
python3 setup.py bdist_wheel
cd dist && pip3 install autoawq*
```
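
After either install path, a quick check that the package landed in the active environment can save a debugging round trip. The sketch below uses only the Python standard library. The distribution name `autoawq` follows the `autoawq*` wheel glob above, while the import name `awq` is an assumption carried over from upstream AutoAWQ; the helper names are hypothetical.

```python
import importlib.util
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "autoawq" and "awq" are assumed names; adjust them if this fork differs.
    print("autoawq version:", installed_version("autoawq"))
    print("awq importable:", is_importable("awq"))
```

A `None` version or `False` here usually means the install ran against a different Python environment than the one running the check.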
## Supported models
| Models   |            Sizes            |
| :------: | :-------------------------: |
| LLaMA-2  | 7B/13B/70B                  |
| LLaMA    | 7B/13B/30B/65B              |
| Mistral  | 7B                          |
| Vicuna   | 7B/13B                      |
| MPT      | 7B/30B                      |
| Falcon   | 7B/40B                      |
| OPT      | 125m/1.3B/2.7B/6.7B/13B/30B |
| Bloom    | 560m/3B/7B                  |
| GPTJ     | 6.7B                        |
| Aquila   | 7B                          |
| Aquila2  | 7B/34B                      |
| Yi       | 6B/34B                      |
| Qwen     | 1.8B/7B/14B/72B             |
| BigCode  | 1B/7B/15B                   |
| GPT NeoX | 20B                         |
| GPT-J    | 6B                          |
| LLaVa    | 7B/13B                      |
| Mixtral  | 8x7B                        |
| Baichuan | 7B/13B                      |
| QWen     | 1.8B/7B/14B/72B             |