# TRANSFORMER-XL
## Paper
`Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context`

- https://arxiv.org/abs/1901.02860
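## Model Architecture
Transformer-XL is an improved Transformer model designed to handle longer text sequences. It introduces an **extension mechanism** that **processes very long sequences in segments** and then uses **cross-segment attention** to capture long-range dependencies.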

![img](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/模型结构.png)
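## Algorithm
Transformer-XL builds largely on the vanilla Transformer (Al-Rfou et al.) but introduces two techniques, a **recurrence mechanism** and **relative positional encoding**, to overcome the vanilla Transformer's shortcomings. The figures below compare how the two models are trained and evaluated.

Vanilla Transformer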

![](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/transformer的训练与评估.png)

Transformer-XL

![img](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/xl的训练与评估.png)
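
The recurrence and relative-position ideas above can be illustrated with a small bookkeeping sketch. It is not the model code from this repository: it tracks token ids instead of cached hidden states, and `split_segments`, `visible_context`, and `relative_distances` are hypothetical helpers introduced only for illustration. With `mem_len=0` each segment sees nothing but itself, which corresponds to the vanilla Transformer; with `mem_len > 0` each segment can also attend to cached positions from earlier segments.

```python
from typing import List, Tuple


def split_segments(tokens: List[int], seg_len: int) -> List[List[int]]:
    """Cut a long sequence into consecutive fixed-length segments."""
    return [tokens[i:i + seg_len] for i in range(0, len(tokens), seg_len)]


def visible_context(
    tokens: List[int], seg_len: int, mem_len: int
) -> List[Tuple[List[int], List[int]]]:
    """For each segment, return (memory, segment) as seen by attention.

    The memory holds the last `mem_len` positions that came before the segment.
    In the real model these would be cached hidden states that receive no
    gradient; token ids are used here so the bookkeeping is easy to inspect.
    """
    memory: List[int] = []
    contexts = []
    for segment in split_segments(tokens, seg_len):
        contexts.append((list(memory), segment))
        # Slide the cache forward, keeping only the most recent positions.
        # The explicit guard matters: a slice of [-0:] would keep everything.
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []
    return contexts


def relative_distances(mem_size: int, seg_size: int) -> List[List[int]]:
    """Distances i - j from each query i in the segment to every key j it may see.

    Keys span memory + segment, and causal masking hides keys after the query,
    so every distance is non-negative.
    """
    rows = []
    for q in range(seg_size):
        q_pos = mem_size + q  # query position inside the memory + segment window
        rows.append([q_pos - k for k in range(q_pos + 1)])
    return rows


if __name__ == "__main__":
    for memory, segment in visible_context(list(range(10)), seg_len=4, mem_len=4):
        print("memory:", memory, "segment:", segment)
    print(relative_distances(mem_size=2, seg_size=2))
```

Because attention depends on the distance `i - j` rather than on absolute positions, the same cached states can be reused as the window slides forward without their positions becoming ambiguous.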
## Environment Setup
### Docker (Option 1)
The Docker image can be pulled from [SourceFind (光源)](https://www.sourcefind.cn/#/service-details) and started as follows:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-22.10-py37-latest

docker run -it --network=host --name=transformer-XL --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=32G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1  image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-22.10-py37-latest
pip install -r requirements.txt
```
### Dockerfile (Option 2)

Build the image from the provided Dockerfile and start a container:

```
docker build --no-cache -t transformer-xl:latest .
docker run -dit --network=host --name=transformer-XL --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 transformer-xl:latest
docker exec -it transformer-XL /bin/bash
pip install -r requirements.txt
```

### Anaconda (Option 3)

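Follow the steps below to configure the environment locally.

The DCU-specific deep learning libraries required by this project can be downloaded from the [光合](https://developer.hpccube.com/tool/) developer community. The required versions are: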
```
DTK driver: dtk22.10
python: python3.7
apex==0.1+gitdb7007a.dtk2210
torch==1.10.0a0+git2040069.dtk2210
```
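`Tips: the versions of the DTK driver, Python, and the other DCU-related tools listed above must match each other exactly.`

Install the remaining, non-deep-learning dependencies from requirements.txt: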
```
pip install -r requirements.txt
```
## Dataset
`enwik8`

- http://mattmahoney.net/dc/enwik8.zip

Download and run the data preprocessing script as follows:
```
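# prep_enwik8.py reads enwik8.zip from the current directory, so fetch it first
wget http://mattmahoney.net/dc/enwik8.zip
wget https://raw.githubusercontent.com/salesforce/awd-lstm-lm/master/data/enwik8/prep_enwik8.py

python3 prep_enwik8.py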
```
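The project already includes a mini dataset for trial training runs. The training data directory is structured as follows; prepare the full dataset for regular training with the same layout: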
```
data
├── train.txt
├── valid.txt
└── test.txt
```
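
When preparing the full dataset, the customary enwik8 split uses the first 90M bytes for training and the last 10M bytes for validation and test (5M each). The sketch below shows that split on raw bytes. `split_enwik8` and `num_heldout` are hypothetical names, and the code only illustrates the partitioning, not the tokenized text format that the preprocessing script writes.

```python
from typing import Dict


def split_enwik8(data: bytes, num_heldout: int = 5_000_000) -> Dict[str, bytes]:
    """Split raw enwik8 bytes into train / valid / test parts.

    The last 2 * num_heldout bytes are held out: the first half of them for
    validation and the second half for testing. Everything before that is
    training data.
    """
    if num_heldout <= 0 or len(data) <= 2 * num_heldout:
        raise ValueError("data is too short for the requested held-out size")
    return {
        "train": data[: -2 * num_heldout],
        "valid": data[-2 * num_heldout : -num_heldout],
        "test": data[-num_heldout:],
    }


if __name__ == "__main__":
    # Tiny stand-in for the 100M-byte enwik8 file.
    parts = split_enwik8(bytes(range(20)), num_heldout=5)
    for name, part in parts.items():
        print(name, len(part))
```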
## Training
### Single Node, Single Card
```
cd pytorch
sh run_enwik8_base.sh train
```
### Single Node, Multiple Cards
```
sh run_enwik8_base_dp.sh train
```

## Inference
```
sh run_enwik8_base.sh eval --work_dir /path/to/model
```
## Results
![result](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/rusult.png)

### Accuracy
Test data: [enwik8](http://mattmahoney.net/dc/enwik8.zip); accelerator used: Z100L.

Measured results:

| transformer-XL | loss | bpc |
| :------: | :------: | :------: |
| enwik8 | 0.9 | 1.292 |
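
The two metrics in the table are related by a change of logarithm base. Assuming `loss` is the mean cross-entropy per character in nats (the usual output of a PyTorch cross-entropy loss) and bpc stands for bits per character, bpc is the loss divided by ln 2. The helper names below are illustrative and not part of this repository.

```python
import math


def nats_to_bpc(loss_nats: float) -> float:
    """Convert mean cross-entropy in nats per character to bits per character."""
    return loss_nats / math.log(2)


def bpc_to_nats(bpc: float) -> float:
    """Inverse conversion: bits per character back to nats per character."""
    return bpc * math.log(2)


if __name__ == "__main__":
    print(f"loss 0.9 nats -> {nats_to_bpc(0.9):.3f} bpc")
    print(f"1.292 bpc -> {bpc_to_nats(1.292):.4f} nats")
```

Under this assumption, the reported 1.292 bpc corresponds to a loss of roughly 0.896, which is consistent with the table's 0.9 if the loss was rounded to one decimal place.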
## Application Scenarios
### Algorithm Category

`Language translation`

### Key Application Industries
`Scientific research`

## Source Repository and Issue Reporting
- https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch
## References
- https://github.com/kimiyoung/transformer-xl