# Transformer-XL
## Paper
`Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context`

- https://arxiv.org/abs/1901.02860
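## Model Architecture
Transformer-XL is an improved Transformer model designed to handle longer text sequences. It introduces an **extension mechanism** that **processes very long sequences in segments** and then uses **cross-segment attention** to capture long-range dependencies.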

![img](https://developer.sourcefind.cn/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/模型结构.png)
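## Algorithm
Transformer-XL builds heavily on the vanilla Transformer (Al-Rfou et al.) but introduces two new techniques, a **recurrence mechanism** and **relative positional encoding**, to overcome the vanilla Transformer's shortcomings. The figures below compare how the two models work.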

Vanilla Transformer

![](https://developer.sourcefind.cn/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/transformer的训练与评估.png)

Transformer-XL

![img](https://developer.sourcefind.cn/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/xl的训练与评估.png)
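
The recurrence idea can be illustrated with a small, framework-free sketch. It is a simplification under stated assumptions: token ids stand in for the per-layer hidden states that the real model caches, and the names `split_into_segments`, `update_memory`, `visible_context`, `seg_len`, and `mem_len` are hypothetical and chosen for illustration only. Relative positional encoding is not covered here.

```python
from typing import List, Sequence, Tuple


def split_into_segments(tokens: Sequence[int], seg_len: int) -> List[List[int]]:
    """Cut a long token sequence into consecutive segments of at most seg_len items."""
    return [list(tokens[i:i + seg_len]) for i in range(0, len(tokens), seg_len)]


def update_memory(memory: List[int], segment: List[int], mem_len: int) -> List[int]:
    """Append the current segment to the cache and keep only the newest mem_len items."""
    if mem_len <= 0:
        return []  # no cache: behaves like a fixed-context model
    return (memory + segment)[-mem_len:]


def visible_context(
    tokens: Sequence[int], seg_len: int, mem_len: int
) -> List[Tuple[List[int], List[int]]]:
    """Return (cached memory, current segment) pairs in processing order."""
    memory: List[int] = []
    pairs = []
    for segment in split_into_segments(tokens, seg_len):
        # The segment can attend to itself plus whatever is cached from earlier segments.
        pairs.append((list(memory), segment))
        memory = update_memory(memory, segment, mem_len)
    return pairs


if __name__ == "__main__":
    for memory, segment in visible_context(list(range(10)), seg_len=4, mem_len=4):
        print(f"segment={segment} attends to cached memory={memory}")
```

With `mem_len=0` every segment is processed in isolation, which corresponds to the fixed-length context of the vanilla model. A positive `mem_len` lets information flow across segment boundaries.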
## Environment Setup
### Docker (Method 1)
The Docker image is hosted on [SourceFind (光源)](https://sourcefind.cn/#/main-page). Pull it and start a container as follows:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10 
docker run -it --network=host --name=transformer-XL --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 -v /opt/hyhal:/opt/hyhal:ro -v /usr/local/hyhal:/usr/local/hyhal:ro  image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10 /bin/bash
pip install -r requirements.txt
```
### Dockerfile (Method 2)

Build the image from the provided Dockerfile and start a container:

```
# Docker image names must be lowercase, so the image is tagged transformer-xl.
docker build --no-cache -t transformer-xl:latest .
docker run -dit --network=host --name=transformer-XL --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 -v /opt/hyhal:/opt/hyhal:ro -v /usr/local/hyhal:/usr/local/hyhal:ro transformer-xl:latest
docker exec -it transformer-XL /bin/bash
pip install -r requirements.txt
```

### Anaconda (Method 3)

To configure and build the environment locally, follow the steps below.

The DCU-specific deep learning libraries required by this project can be downloaded from the [光合](https://developer.sourcefind.cn/tool/) developer community.
```
DTK driver: dtk24.04.1
python: python3.10
apex==1.1.0+das1.1.gitf477a3a.abi1.dtk2404.torch2.1.0
torch==2.1.0+das1.1.git3ac1bdd.abi1.dtk2404
```
`Tip: the versions of the DTK driver, Python, and the other DCU-related tools listed above must match exactly.`
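
One way to catch a version mismatch early is to compare the installed package versions against the pinned ones. The sketch below is only an illustration: the helper names `installed_versions` and `find_mismatches` are hypothetical, and it assumes the version strings recorded in each package's metadata are spelled exactly as in the list above, which may not hold if the packaging tool normalizes them.

```python
from importlib import metadata
from typing import Dict, Iterable, List, Optional

# Pinned versions copied from the list above.
EXPECTED = {
    "torch": "2.1.0+das1.1.git3ac1bdd.abi1.dtk2404",
    "apex": "1.1.0+das1.1.gitf477a3a.abi1.dtk2404.torch2.1.0",
}


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up each package's installed version; None means it is not installed."""
    result: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


def find_mismatches(
    expected: Dict[str, str], installed: Dict[str, Optional[str]]
) -> List[str]:
    """Describe every package whose installed version differs from the pinned one."""
    problems = []
    for name, want in expected.items():
        got = installed.get(name)
        if got != want:
            problems.append(f"{name}: expected {want}, found {got or 'not installed'}")
    return problems


if __name__ == "__main__":
    issues = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    print("\n".join(issues) if issues else "All pinned versions match.")
```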

Install the remaining, non-deep-learning libraries from requirements.txt:
```
pip install -r requirements.txt
```
## Dataset
`enwik8`
- http://mattmahoney.net/dc/enwik8.zip

Download and run the data preprocessing script as follows:
```
wget https://raw.githubusercontent.com/salesforce/awd-lstm-lm/master/data/enwik8/prep_enwik8.py

python3 prep_enwik8.py
```
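
For orientation, the conventional enwik8 split keeps the first 90M bytes for training and holds out the last 10M bytes, 5M for validation and 5M for test. The sketch below shows that slicing only. It is not a copy of `prep_enwik8.py`, and the function name `split_enwik8` and the constant `HELD_OUT_BYTES` are hypothetical.

```python
from typing import Dict

# Assumed hold-out size per split, following the conventional 90M/5M/5M division.
HELD_OUT_BYTES = 5_000_000


def split_enwik8(data: bytes, held_out: int = HELD_OUT_BYTES) -> Dict[str, bytes]:
    """Slice raw bytes into train/valid/test, taking the held-out parts from the end."""
    if held_out < 0 or len(data) < 2 * held_out:
        raise ValueError("data is too short for the requested hold-out size")
    valid_start = len(data) - 2 * held_out
    test_start = len(data) - held_out
    return {
        "train": data[:valid_start],
        "valid": data[valid_start:test_start],
        "test": data[test_start:],
    }


if __name__ == "__main__":
    # Tiny stand-in for the real 100M-byte file.
    parts = split_enwik8(bytes(range(20)), held_out=5)
    for name, chunk in parts.items():
        print(name, len(chunk))
```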
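The project ships a mini dataset for trial training runs. The training data directory is laid out as shown below. Prepare the full dataset for regular training with the same layout: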
```
data
├── train.txt
├── valid.txt
└── test.txt
```
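
Before starting a long run, it can help to confirm that the data directory is complete. The following is a minimal sketch, assuming the three split files are named `train.txt`, `valid.txt`, and `test.txt` and live directly under `data`; the helper `missing_files` is hypothetical and not part of the project.

```python
from pathlib import Path
from typing import List, Sequence

# Assumed file names for the three splits.
EXPECTED_FILES = ("train.txt", "valid.txt", "test.txt")


def missing_files(data_dir: str, expected: Sequence[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("data")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```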
## Training
### Single Node, Single Card
```
cd pytorch
sh run_enwik8_base.sh train
```
### Single Node, Multiple Cards
```
sh run_enwik8_base_dp.sh train
```

## Inference
```
sh run_enwik8_base.sh eval --work_dir /path/to/model
```
## Results
![result](https://developer.sourcefind.cn/codes/modelzoo/transformer-XL-pytorch/-/raw/main/doc/result.png)

### Accuracy
Test data: [test data](http://mattmahoney.net/dc/enwik8.zip). Accelerator used: Z100L.

The table below records the measured results:

| transformer-XL | loss | bpc |
| :------: | :------: | :------: |
| enwik8 | 0.9 | 1.292 |
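
The two columns are related if the loss is the average cross-entropy in nats per character, which is an assumption here: dividing by ln 2 converts it to bits per character (bpc). The sketch below shows the conversion; the function names are hypothetical. Under that assumption, 1.292 bpc corresponds to a loss of roughly 0.896, which suggests the 0.9 in the table is a rounded value.

```python
import math


def nats_to_bpc(loss_nats: float) -> float:
    """Convert an average cross-entropy in nats per character to bits per character."""
    return loss_nats / math.log(2)


def bpc_to_nats(bpc: float) -> float:
    """Convert bits per character back to nats per character."""
    return bpc * math.log(2)


if __name__ == "__main__":
    print(f"loss 0.9 nats  -> {nats_to_bpc(0.9):.3f} bpc")
    print(f"1.292 bpc      -> {bpc_to_nats(1.292):.3f} nats")
```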
## Application Scenarios
### Algorithm Category

`Language translation`

### Key Application Industries
`Scientific research, design, finance`

## Source Repository and Issue Feedback
- https://developer.sourcefind.cn/codes/modelzoo/transformer-XL-pytorch
## References
- https://github.com/kimiyoung/transformer-xl