# TRANSFORMER-XL
## Paper
`Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context`

- https://arxiv.org/abs/1901.02860
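## Model Architecture
Transformer-XL is an improved Transformer model designed to handle longer text sequences. It introduces an **extended-context mechanism**: a very long sequence is **processed in segments**, and **attention across segments** then captures long-range dependencies.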

![img](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/模型结构.png)
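The segment-wise processing described above can be illustrated with a minimal Python sketch. It is not taken from this repository: `iter_segments`, `seg_len`, and `mem_len` are hypothetical names, and the memory here holds token ids only for readability, whereas an actual Transformer-XL caches hidden states from the previous segment.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_segments(
    tokens: Sequence[int], seg_len: int, mem_len: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (memory, segment) pairs over a long token sequence.

    `memory` stands in for the cached context of the preceding tokens.
    This is an illustrative assumption: a real model caches hidden states.
    """
    if seg_len <= 0 or mem_len < 0:
        raise ValueError("seg_len must be positive and mem_len non-negative")
    memory: List[int] = []
    for start in range(0, len(tokens), seg_len):
        segment = list(tokens[start:start + seg_len])
        yield memory, segment
        # Keep only the most recent mem_len items as context for the next segment.
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []


if __name__ == "__main__":
    for memory, segment in iter_segments(list(range(10)), seg_len=4, mem_len=3):
        print("memory:", memory, "segment:", segment)
```

Each segment after the first sees the tail of what came before it, so context is carried across segment boundaries instead of being reset.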
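## Algorithm
Transformer-XL builds heavily on the vanilla Transformer (Al-Rfou et al.) but introduces two new techniques, a **recurrence mechanism** and **relative positional encoding**, to overcome the vanilla Transformer's shortcomings. The figures below compare how the two models work.

Transformer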

![](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/transformer的训练与评估.png)

Transformer-XL

![img](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/xl的训练与评估.png)
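With relative positional encoding, attention depends on how far a key lies behind a query rather than on either token's absolute position, which is what allows cached context from an earlier segment to be reused. The sketch below only computes that distance matrix for one segment plus its memory; `relative_distances`, `seg_len`, and `mem_len` are hypothetical names, and the code is an illustration rather than this repository's implementation.

```python
from typing import List, Optional


def relative_distances(seg_len: int, mem_len: int) -> List[List[Optional[int]]]:
    """Return a seg_len x (mem_len + seg_len) matrix of query-to-key distances.

    Entry [i][j] is how many positions key j lies behind query i, or None
    when key j lies in the future and would be masked out.
    """
    rows: List[List[Optional[int]]] = []
    for i in range(seg_len):
        query_pos = mem_len + i  # queries come only from the current segment
        row: List[Optional[int]] = []
        for key_pos in range(mem_len + seg_len):
            dist = query_pos - key_pos
            row.append(dist if dist >= 0 else None)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in relative_distances(seg_len=3, mem_len=2):
        print(row)
```

For `seg_len=3` and `mem_len=2`, the first query row is `[2, 1, 0, None, None]`. The matrix is the same for every segment of the sequence, since it depends only on distances and not on where the segment starts.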
## Environment Setup
### Docker (Option 1)
The Docker image can be pulled from [SourceFind (光源)](https://www.sourcefind.cn/#/service-details). The image address and usage steps are:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-22.10-py37-latest

docker run -it --network=host --name=transformer-XL --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=32G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1  image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-22.10-py37-latest
```
### Dockerfile (Option 2)

Build and run the image from the provided Dockerfile:

```
# Build the image (Docker image names must be lowercase)
docker build --no-cache -t transformer-xl:latest .
# Start a container from the image built above
docker run -dit --network=host --name=transformer-XL --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 transformer-xl:latest
# Open a shell in the container, then install the Python dependencies inside it
docker exec -it transformer-XL /bin/bash
pip install -r requirements.txt
```

### Anaconda (Option 3)

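The steps for configuring the environment locally are as follows.

The DCU-specific deep learning libraries required by this project can be downloaded and installed from the [光合 developer community](https://developer.hpccube.com/tool/).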
```
DTK driver: dtk22.10
python: python3.7
apex==0.1+gitdb7007a.dtk2210
torch==1.10.0a0+git2040069.dtk2210
```
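`Tips: the versions of the DTK driver, Python, and the other DCU-related tools listed above must match one another exactly.`

Install the remaining non-deep-learning libraries from requirements.txt: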
```
pip install -r requirements.txt
```
## Dataset
`enwik8`

- http://mattmahoney.net/dc/enwik8.zip

Download and run the data preprocessing script:
```
wget https://raw.githubusercontent.com/salesforce/awd-lstm-lm/master/data/enwik8/prep_enwik8.py

python3 prep_enwik8.py
```
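The project includes a mini dataset for trial training runs. The training data directory is laid out as follows; prepare the full dataset for regular training with the same layout:
```
data
├── train.txt
├── valid.txt
└── test.txt
```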
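Before starting a long training run, it can help to confirm that the data directory matches this layout. The following is a minimal sketch, not part of this repository: `missing_files` and `EXPECTED_FILES` are hypothetical names, and it assumes the validation split is named `valid.txt`.

```python
from pathlib import Path
from typing import List

# Assumed file names for the three splits; adjust if your layout differs.
EXPECTED_FILES = ("train.txt", "valid.txt", "test.txt")


def missing_files(data_dir: str) -> List[str]:
    """Return the expected split files that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("data")
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Running the script from the project root reports any split file that still needs to be prepared.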
## Training
### Single Node, Single Card
```
sh run_enwik8_base.sh train
```
### Single Node, Multiple Cards
```
sh run_enwik8_base_dp.sh train
```

## Inference
```
sh run_enwik8_base.sh eval --work_dir /path/to/model
```
## Results
![result](https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/rusult.png)

### Accuracy
Test data: [test data](http://mattmahoney.net/dc/enwik8.zip). Accelerator card used: Z100L.

Test results:

| Model | Dataset | loss | bpc |
| :------: | :------: | :------: | :------: |
| transformer-XL | enwik8 | 0.9 | 1.292 |
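
The two metrics in the table are related by a change of logarithm base. As a minimal sketch, assuming the reported loss is the mean cross-entropy per character in nats (an assumption, not stated in this README), bits per character (bpc) is the loss divided by ln 2. The function names are hypothetical.

```python
import math


def loss_to_bpc(loss_nats: float) -> float:
    """Convert mean cross-entropy in nats to bits per character."""
    return loss_nats / math.log(2)


def bpc_to_loss(bpc: float) -> float:
    """Convert bits per character back to mean cross-entropy in nats."""
    return bpc * math.log(2)


if __name__ == "__main__":
    # 1.292 bpc is the value reported in the table above.
    print(f"loss for 1.292 bpc: {bpc_to_loss(1.292):.4f} nats")
```

Under that assumption, 1.292 bpc corresponds to roughly 0.896 nats, which agrees with the rounded loss of 0.9 in the table.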
## Application Scenarios
### Algorithm Category

`Language translation`

### Key Application Industries
`Scientific research`

## Source Repository and Issue Feedback
- https://developer.hpccube.com/codes/modelzoo/transformer-XL-pytorch
## References
- https://github.com/kimiyoung/transformer-xl