Commit 20f80172 authored by wanglch
Update README.md

parent 3f0a9b8d
# UMT5
**Note: pre-training is required before running downstream tasks; see train_model.py for the training code.**
<div align="center">
<img align="center" src=docs/T5_task.png>
</div>
## Paper
......@@ -13,7 +13,7 @@ umT5: a multilingual version of T5 that retains most of the T5 model's versatility, in multi…
### MT5 Model Structure
<div align="center">
<img align="center" src=docs/T5_structure.png>
</div>
......@@ -23,19 +23,19 @@ umT5: a multilingual version of T5 that retains most of the T5 model's versatility, in multi…
Its main change comes from the paper [GLU Variants Improve Transformer](https://arxiv.org/abs/2002.05202), which borrows the **GLU** (Gated Linear Unit) from [Language Modeling with Gated Convolutional Networks](https://arxiv.org/abs/1612.08083) to strengthen the FFN block. Specifically, the original T5 FFN (T5 has no bias):
<div align="center">
<img align="center" src=docs/equation1.png>
</div>
is changed to:
<div align="center">
<img align="center" src=docs/euqation2.png>
</div>
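
The two equation images are not reproduced in text here, so the following is only a minimal sketch of the change described above. It assumes the bias-free forms from the cited GLU paper: `relu(x W1) W2` for the original FFN and `(gelu(x W_gate) * (x W_up)) W_out` for the gated variant. The function names, the weight names, and the choice of GELU as the gate activation are illustrative assumptions, not taken from this repository.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(v):
    return max(v, 0.0)


def gelu(v):
    # Exact GELU via the Gaussian error function.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def ffn_relu(x, w1, w2):
    """Bias-free FFN (assumed original form): relu(x W1) W2."""
    hidden = [[relu(v) for v in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def ffn_geglu(x, w_gate, w_up, w_out):
    """Bias-free gated FFN (assumed GLU form): (gelu(x W_gate) * (x W_up)) W_out."""
    gate = matmul(x, w_gate)
    up = matmul(x, w_up)
    # Element-wise product of the activated gate and the linear projection.
    hidden = [
        [gelu(g) * u for g, u in zip(g_row, u_row)]
        for g_row, u_row in zip(gate, up)
    ]
    return matmul(hidden, w_out)


# Toy shapes: one token, d_model = 2, d_ff = 3 (hypothetical values).
x = [[1.0, -2.0]]
w1 = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]   # d_model x d_ff
w2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # d_ff x d_model

print(ffn_relu(x, w1, w2))
print(ffn_geglu(x, w1, w1, w2))
```

Note that the gated variant uses three weight matrices instead of two, so the hidden width is usually reduced to keep the parameter count comparable.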
### T5 Transformer
<div align="center">
<img align="center" src=docs/t5transformer.png>
</div>
......@@ -174,7 +174,7 @@ python umt5_summary.py
## Result
### Chinese Text Summarization Task
<div align="center">
<img align="center" src=docs/result.png>
</div>
### Accuracy
......