"vscode:/vscode.git/clone" did not exist on "08e62070f6dee22a8984f700147f581b8bb49fc5"
README.md 4.54 KB
Newer Older
1
2
3
4
5
6
7
8
# Yi-1.5-6B

## Paper

`Yi: Open Foundation Models by 01.AI`

- https://arxiv.org/abs/2403.04652

## Model Architecture

The Yi models use a modified decoder-only Transformer architecture based on the LLaMA implementation. The main changes include:

Attention mechanism:
    Yi introduces grouped-query attention (GQA) in both the 6B and 34B models to reduce training and inference cost, with no performance degradation observed.

Activation function:
    SwiGLU is used as the post-attention layer; the activation size is adjusted to stay consistent with existing models and to compensate for the parameter reduction introduced by GQA.

Position embedding and long context:
    RoPE is adopted, with its base frequency adjusted to support context windows of up to 200K tokens. Through continued pretraining and lightweight finetuning, the model reaches near-perfect long-context retrieval performance, indicating an inherent ability to model long-range dependencies.
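
As a quick way to see how these choices show up in practice, the sketch below (an illustrative example, assuming the Hugging Face `transformers` checkpoint of `01-ai/Yi-1.5-6B-Chat` and LLaMA-style config field names) reads the model config and prints the fields corresponding to GQA, SwiGLU, and RoPE.

```
# Sketch: inspect the architecture-related fields of the Yi-1.5 config.
# Field names follow the LLaMA-style config in Hugging Face transformers and
# may vary slightly across transformers versions.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("01-ai/Yi-1.5-6B-Chat")

print("attention heads:      ", config.num_attention_heads)
print("key/value heads (GQA):", config.num_key_value_heads)    # fewer than attention heads => grouped-query attention
print("activation:           ", config.hidden_act)             # "silu", used inside the SwiGLU feed-forward block
print("rope_theta:           ", config.rope_theta)             # RoPE base frequency
print("max positions:        ", config.max_position_embeddings)
```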

## Algorithm Principles

Yi-1.5 is a decoder-only Transformer model using the SwiGLU activation function, GQA, RoPE, and related techniques. It is an upgraded version of Yi: it undergoes continued pretraining on 500B (500 billion) tokens of a high-quality corpus and is then finetuned on 3 million diverse finetuning samples. Compared with Yi, Yi-1.5 is stronger at coding, math, reasoning, and instruction following, while retaining its excellent language understanding, commonsense reasoning, and reading comprehension.
<div align=center>
    <img src="./doc/model_accuracy.png"/>
</div>

## Environment Setup

### Docker (Method 1)

The Docker image can be pulled from [光源](https://www.sourcefind.cn/#/service-details); the pull address and usage steps are given below:

```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-py3.10-dtk24.04.3-ubuntu20.04
docker run -it --shm-size=1024G -v <Host Path>:<Container Path> -v /opt/hyhal:/opt/hyhal --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name Yi-1.5  <your IMAGE ID> bash  # replace <your IMAGE ID> with the ID of the image pulled above
cd /home/Yi-1.5-pytorch
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/  --trusted-host mirrors.aliyun.com

pip uninstall vllm
```
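
Once inside the container, a quick sanity check (an optional sketch, not part of the original repository) can confirm that the PyTorch build sees the DCU devices; on DCU/DTK the ROCm (HIP) build of PyTorch exposes devices through the `cuda` API.

```
# Sketch: verify that PyTorch inside the container can see the DCU devices.
import torch

print("torch version:", torch.__version__)
print("hip version:  ", getattr(torch.version, "hip", None))  # non-empty on ROCm/DTK builds
print("devices found:", torch.cuda.device_count())            # DCU devices are exposed via the cuda API
```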

### Dockerfile (Method 2)

The Dockerfile can be used as follows:

```
docker build  -t yi-1.5-df:latest .
docker run -it --shm-size=1024G -v <Host Path>:<Container Path> -v /opt/hyhal:/opt/hyhal --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name yi-1.5  yi-1.5-df  bash
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/  --trusted-host mirrors.aliyun.com

pip uninstall vllm
```

### Anaconda (Method 3)

Detailed steps for local configuration and compilation are provided below.

The special deep learning libraries required by this project's DCU cards can be downloaded and installed from the [光合](https://developer.hpccube.com/tool/) developer community.

```
DTK driver: dtk24.04.3
python: python3.10
torch: 2.1.0
torchvision: 0.16.0
deepspeed: 0.12.3
bitsandbytes: 0.42.0
triton: 2.1.0
```

`Tips: the versions of the DTK driver, python, torch, and the other DCU-related tools listed above must correspond to each other exactly`

Other non-deep-learning libraries are installed according to requirements.txt:

```
pip install -r requirements.txt
```
`Tips: transformers>=4.36.2,<4.39`
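
A small sanity check (an optional sketch, not part of the repository) for confirming that the installed packages match the versions and the transformers constraint listed above:

```
# Sketch: print the installed versions of the key packages listed above.
from importlib.metadata import version, PackageNotFoundError

for pkg in ["torch", "torchvision", "deepspeed", "bitsandbytes", "triton", "transformers"]:
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```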

## Dataset

The alpaca_en dataset is used; it is already included in the data directory as alpaca_en_demo.json.
A mini dataset for trial training is provided in the project. The training data directory structure is shown below; a complete dataset for regular training should be prepared following the same structure:

```
 ── data
    ├── alpaca_en_demo.json
    └── alpaca_zh_demo.json
```
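
Each record in these files follows the Alpaca-style instruction-tuning layout (instruction / input / output). The sketch below shows the expected shape of one record; the field values are illustrative placeholders, not taken from the actual file.

```
# Sketch: the Alpaca-style record layout expected in data/alpaca_en_demo.json.
# The values below are placeholders for illustration only.
import json

example_record = {
    "instruction": "Explain what grouped-query attention is.",  # task prompt
    "input": "",                                                 # optional extra context, often empty
    "output": "Grouped-query attention shares key/value heads across query groups ...",  # reference answer
}

# The dataset file itself is a JSON list of such records.
print(json.dumps([example_record], ensure_ascii=False, indent=2))
```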

## Model Download
Pretrained weights download center: [huggingface](https://huggingface.co/01-ai/Yi-1.5-6B-Chat)

Fast download channel: [AiModels](http://113.200.138.88:18080/aimodels/01-ai/Yi-1.5-6B-Chat)
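
When downloading from the Hugging Face hub, a sketch using `huggingface_hub` (an optional alternative; the local directory name is only an example) looks like this:

```
# Sketch: download the Yi-1.5-6B-Chat weights from the Hugging Face hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="01-ai/Yi-1.5-6B-Chat",
    local_dir="./Yi-1.5-6B-Chat",  # example target directory; adjust to your storage path
)
print("weights downloaded to:", local_dir)
```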

The model directory structure is as follows:

<div align=center>
    <img src="./doc/model.png"/>
</div>

## Training

Modify the model path and dataset path in the training scripts according to your actual paths.
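
Before launching training, a quick check (an optional sketch; the paths below are hypothetical examples, and the actual variable names live in single_node.sh / multi_node.sh) that the configured paths exist:

```
# Sketch: sanity-check the model and dataset paths configured in the finetune scripts.
# The paths below are hypothetical examples; replace them with the values you set.
from pathlib import Path

model_path = Path("/path/to/Yi-1.5-6B-Chat")      # local model weights directory
data_path = Path("data/alpaca_en_demo.json")      # dataset file from the data directory

for p in (model_path, data_path):
    print(p, "exists" if p.exists() else "MISSING")
```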

### Single Node, Single Card

```
cd finetune
sh single_node.sh
```
### Single Node, Multiple Cards

```
cd finetune
sh multi_node.sh
```


## Inference

```
cd inference
python 6B_single_dcu.py
```
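
The script above performs single-card chat inference; a minimal standalone sketch of the same flow with `transformers` (illustrative only, not the contents of 6B_single_dcu.py) looks roughly like this:

```
# Sketch: minimal chat inference with Yi-1.5-6B-Chat via transformers.
# Illustrative only; 6B_single_dcu.py in this repository is the reference script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "01-ai/Yi-1.5-6B-Chat"  # or a local directory with the downloaded weights

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

messages = [{"role": "user", "content": "Give a one-sentence summary of the Yi-1.5 model."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```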

## Result

Accelerator cards used: 2x DCU-K100-64G

<div align=center>
    <img src="./result/inf_result.png"/>
    </div>

### Accuracy

Test data: alpaca_en_demo.json; accelerator: K100-64G, trained on 2 cards.

The table is filled in according to the test results:

| device | train_loss | 
| :------: | :------: | 
| DCU-K100 | 0.7647 | 
| GPU-A800 | 0.7651 | 


## Application Scenarios
### Algorithm Category

Conversational question answering

### Key Application Industries

`Research, education, government, finance`

## Source Repository and Issue Feedback

- https://developer.hpccube.com/codes/modelzoo/yi_1.5_6b_pytorch

## References

- https://github.com/hiyouga/LLaMA-Factory/tree/main
- https://github.com/01-ai/Yi-1.5
- https://huggingface.co/01-ai