# Transformer-XL
## Paper
`Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context`

- https://arxiv.org/abs/1901.02860
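## Model Architecture
Transformer-XL is an improved Transformer model designed to handle longer text sequences. It introduces a **recurrence mechanism** that **processes very long sequences in segments** and uses **cross-segment attention** to capture long-range dependencies.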

![img](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/模型结构.png)
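
The segment-wise processing described above can be illustrated with a minimal pure-Python sketch. It is not code from this repository: `segment_contexts`, `seg_len`, and `mem_len` are hypothetical names, and the cache here holds raw tokens, whereas the real model caches hidden states from the previous segment.

```python
from typing import List, Sequence, Tuple


def segment_contexts(
    tokens: Sequence[int], seg_len: int, mem_len: int
) -> List[Tuple[List[int], List[int]]]:
    """Split `tokens` into segments and pair each one with its cached memory.

    Returns a list of (memory, segment) pairs. The memory holds the last
    `mem_len` items seen before the segment. In the real model it would hold
    cached hidden states rather than raw tokens (assumption for illustration).
    """
    if seg_len <= 0 or mem_len < 0:
        raise ValueError("seg_len must be positive and mem_len non-negative")
    pairs: List[Tuple[List[int], List[int]]] = []
    memory: List[int] = []
    for start in range(0, len(tokens), seg_len):
        segment = list(tokens[start:start + seg_len])
        # The current segment may attend to itself plus the cached memory.
        pairs.append((list(memory), segment))
        # Update the cache: keep only the most recent `mem_len` items.
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []
    return pairs


if __name__ == "__main__":
    for memory, segment in segment_contexts(list(range(10)), seg_len=4, mem_len=4):
        print("memory:", memory, "segment:", segment)
```

Each segment sees its own tokens plus the cached memory, so information can flow across segment boundaries without reprocessing earlier text.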
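## Algorithm
Transformer-XL builds largely on the vanilla Transformer (Al-Rfou et al.) but introduces two techniques, a **recurrence mechanism** and **relative positional encoding**, to overcome the vanilla Transformer's shortcomings. The figures below compare how the two models work.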

Vanilla Transformer (training and evaluation):

![](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/transformer的训练与评估.png)

Transformer-XL (training and evaluation):

![img](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/xl的训练与评估.png)
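
For reference, the recurrence mechanism and the relative positional encoding can be written as follows. The notation follows the Transformer-XL paper linked above, not this repository's code. Here $h_{\tau}^{n}$ is the hidden state of layer $n$ for segment $\tau$, $\mathrm{SG}(\cdot)$ is a stop-gradient, $\circ$ is concatenation along the sequence dimension, $R_{i-j}$ is a sinusoidal encoding of the relative distance, and $u$, $v$ are learnable vectors.

```math
\begin{aligned}
\tilde{h}_{\tau+1}^{n-1} &= \left[\mathrm{SG}\!\left(h_{\tau}^{n-1}\right) \circ h_{\tau+1}^{n-1}\right] \\
q_{\tau+1}^{n},\; k_{\tau+1}^{n},\; v_{\tau+1}^{n} &= h_{\tau+1}^{n-1} W_q^{\top},\; \tilde{h}_{\tau+1}^{n-1} W_k^{\top},\; \tilde{h}_{\tau+1}^{n-1} W_v^{\top} \\
h_{\tau+1}^{n} &= \text{Transformer-Layer}\!\left(q_{\tau+1}^{n}, k_{\tau+1}^{n}, v_{\tau+1}^{n}\right) \\
A_{i,j}^{\mathrm{rel}} &= E_{x_i}^{\top} W_q^{\top} W_{k,E} E_{x_j}
 + E_{x_i}^{\top} W_q^{\top} W_{k,R} R_{i-j}
 + u^{\top} W_{k,E} E_{x_j}
 + v^{\top} W_{k,R} R_{i-j}
\end{aligned}
```

The queries come from the current segment only, while the keys and values also cover the cached previous segment. This is what lets attention reach beyond the segment boundary.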
## Environment Setup
### Docker (Option 1)
Pull the Docker image from [SourceFind (光源)](https://sourcefind.cn/#/main-page) and start a container as follows:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10 

docker run -it --network=host --name=transformer-XL --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 -v /opt/hyhal:/opt/hyhal:ro -v /usr/local/hyhal:/usr/local/hyhal:ro  image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10 /bin/bash
pip install -r requirements.txt
```
### Dockerfile (Option 2)

Build the image from the provided Dockerfile and start a container as follows:

```
# Docker image names must be lowercase, so the image is tagged transformer-xl
docker build --no-cache -t transformer-xl:latest .
docker run -dit --network=host --name=transformer-XL --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 -v /opt/hyhal:/opt/hyhal:ro -v /usr/local/hyhal:/usr/local/hyhal:ro transformer-xl:latest
docker exec -it transformer-XL /bin/bash
pip install -r requirements.txt
```

### Anaconda (Option 3)

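The steps for configuring and building the environment locally are given below.

The DCU-specific deep learning libraries required by this project can be downloaded from the [光合 developer community](https://developer.hpccube.com/tool/) and installed. The versions used here are: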
```
DTK driver: dtk24.04.1
python: python3.10
apex==1.1.0+das1.1.gitf477a3a.abi1.dtk2404.torch2.1.0
torch==2.1.0+das1.1.git3ac1bdd.abi1.dtk2404
```
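`Tips: the versions of the DTK driver, Python, and the other DCU-related tools listed above must match each other exactly.`

Install the remaining non-deep-learning libraries from requirements.txt: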
```
pip install -r requirements.txt
```
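
Because the versions must match exactly, it can help to compare what is installed against the pinned versions. The following is a minimal sketch, not part of this repository. It assumes the wheels register themselves under the distribution names `torch` and `apex`. `EXPECTED`, `installed_version`, and `find_mismatches` are hypothetical names.

```python
from importlib import metadata
from typing import Dict, Optional

# Pinned versions copied from the version list in this section.
# Assumption: the wheels are registered as "torch" and "apex".
EXPECTED = {
    "torch": "2.1.0+das1.1.git3ac1bdd.abi1.dtk2404",
    "apex": "1.1.0+das1.1.gitf477a3a.abi1.dtk2404.torch2.1.0",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each package whose installed version differs to the version found."""
    found = {name: installed_version(name) for name in expected}
    return {name: ver for name, ver in found.items() if ver != expected[name]}


if __name__ == "__main__":
    mismatches = find_mismatches(EXPECTED)
    for name, found in mismatches.items():
        print(f"{name}: expected {EXPECTED[name]}, found {found}")
    if not mismatches:
        print("All pinned packages match.")
```

A package that is not installed is reported with `None` as the version found.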
## Dataset
`enwik8`
- [Fast download mirror](http://113.200.138.88:18080/aidatasets/project-dependency/enwik8)
- http://mattmahoney.net/dc/enwik8.zip

Download and run the data preprocessing script as follows:
```
wget https://raw.githubusercontent.com/salesforce/awd-lstm-lm/master/data/enwik8/prep_enwik8.py

python3 prep_enwik8.py
```
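
To make the preprocessing step less opaque, here is a hedged sketch of the kind of split such a script performs. It assumes the script keeps the last 5,000,000 bytes for testing and the 5,000,000 bytes before that for validation, then writes each byte as its integer value. Check `prep_enwik8.py` itself for the authoritative behaviour. `split_enwik8` and `tokenize` are hypothetical names.

```python
from typing import Dict


def split_enwik8(data: bytes, num_test_bytes: int = 5_000_000) -> Dict[str, bytes]:
    """Split raw bytes into train/valid/test, taking valid and test from the tail."""
    if num_test_bytes <= 0:
        raise ValueError("num_test_bytes must be positive")
    if len(data) <= 2 * num_test_bytes:
        raise ValueError("data is too short for the requested split")
    return {
        "train": data[: -2 * num_test_bytes],
        "valid": data[-2 * num_test_bytes : -num_test_bytes],
        "test": data[-num_test_bytes:],
    }


def tokenize(part: bytes) -> str:
    """Write each byte as its integer value, keeping newlines as line breaks."""
    return " ".join(str(b) if b != ord("\n") else "\n" for b in part)


if __name__ == "__main__":
    toy = bytes(range(97, 107))  # ten bytes standing in for the real file
    for name, part in split_enwik8(toy, num_test_bytes=2).items():
        print(name, len(part), repr(tokenize(part)))
```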
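The repository ships a mini dataset for trial training. The training data directory is laid out as follows; prepare the full dataset for regular training with the same layout:
```
data
├── train.txt
├── valid.txt
└── test.txt
```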
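
Before launching a long training run, the layout can be checked with a few lines of Python. This is a minimal sketch that assumes the three split files are named `train.txt`, `valid.txt`, and `test.txt`, as the upstream Transformer-XL data loader expects. `REQUIRED_FILES` and `missing_files` are hypothetical names.

```python
from pathlib import Path
from typing import List

# Assumed file names for the three dataset splits.
REQUIRED_FILES = ("train.txt", "valid.txt", "test.txt")


def missing_files(data_dir: str) -> List[str]:
    """Return the required split files that are absent from `data_dir`."""
    root = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("data")
    print("missing:", missing if missing else "none")
```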
## Training
### Single node, single card
```
cd pytorch
sh run_enwik8_base.sh train
```
### Single node, multiple cards
```
sh run_enwik8_base_dp.sh train
```

## Inference
```
sh run_enwik8_base.sh eval --work_dir <path-to-model>
```
## Result
![result](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/result.png)

### Accuracy
Test data: [test data](http://mattmahoney.net/dc/enwik8.zip). Accelerator card used: Z100L.

Test results:

| transformer-XL | loss | bpc |
| :------: | :------: | :------: |
| enwik8 | 0.9 | 1.292 |
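
The two columns are related by a change of logarithm base. The sketch below assumes the reported loss is the average cross-entropy in nats per character, in which case bits per character (bpc) is the loss divided by ln 2. `loss_to_bpc` and `bpc_to_loss` are hypothetical helper names.

```python
import math


def loss_to_bpc(loss_nats: float) -> float:
    """Convert average cross-entropy in nats per character to bits per character."""
    return loss_nats / math.log(2)


def bpc_to_loss(bpc: float) -> float:
    """Convert bits per character back to cross-entropy in nats per character."""
    return bpc * math.log(2)


if __name__ == "__main__":
    print(f"loss 0.9 -> bpc {loss_to_bpc(0.9):.4f}")
    print(f"bpc 1.292 -> loss {bpc_to_loss(1.292):.4f}")
```

Under this assumption, a bpc of 1.292 corresponds to a loss of about 0.8955, which rounds to the 0.9 shown in the table. A loss of exactly 0.9 would give a bpc of about 1.298.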
## Application Scenarios
### Algorithm Category

`Language translation`

### Key Application Industries
`Scientific research`, `Design`, `Finance`

## Source Repository and Issue Reporting
- https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch
## References
- https://github.com/kimiyoung/transformer-xl