update

Commit 7dc34919 authored Sep 07, 2023 by yuguo

Showing with 37 additions and 11 deletions

README.md README.md +35 -11

gpt2模型结构.png gpt2模型结构.png +0 -0

gpt2算法原理.png gpt2算法原理.png +0 -0

model.properties model.properties +2 -0

--- a/README.md
+++ b/README.md
-# Generative Pre-Training2(GPT2)
+# GPT2
-## Model Introduction
+## Paper
-The GPT2 model: the second-generation generative pre-training model (Generative Pre-Training2).
+`Language Models are Unsupervised Multitask Learners`
+- [https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf)
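 ## Model Structure
-GPT2 uses the Transformer Decoder structure with some modifications: the original Decoder contains two Multi-Head Attention blocks, while GPT2 keeps only the Masked Multi-Head Attention.
+GPT2 is the second-generation generative pre-training model (Generative Pre-Training2). It uses the Transformer Decoder structure with some modifications: the original Decoder contains two Multi-Head Attention blocks, while GPT2 keeps only the Masked Multi-Head Attention.
 To let users quickly verify GPT2 pre-training with OneFlow-Libai, measure performance, or validate accuracy, we provide a GPT2 network example. The main network parameters are: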
@@ -26,24 +32,32 @@ model.cfg.hidden_layers = 64
 model.cfg.max_seq_length = 1024
 ```
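+## Algorithm Principle
+GPT-2 uses masked self-attention. An ordinary self-attention module lets a position attend to the tokens on its right; masked self-attention prevents this, so each position sees only itself and the tokens to its left.
 ## Dataset
 Several small datasets are bundled under the libai directory for quick verification: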
    ./nlp_data
-## GPT2 Pre-training
+## Environment Setup
-### Environment Setup
+### Docker
-Running with docker is recommended. A docker image can be pulled from [光源](https://www.sourcefind.cn/#/service-details): image.sourcefind.cn:5000/dcu/admin/base/oneflow:0.9.1-centos7.6-dtk-22.10.1-py39-latest
-Enter the docker container:
+A docker image for both training and inference can be pulled from [光源](https://www.sourcefind.cn/#/service-details): image.sourcefind.cn:5000/dcu/admin/base/oneflow:0.9.1-centos7.6-dtk-22.10.1-py39-latest
-    cd libai
+    docker pull image.sourcefind.cn:5000/dcu/admin/base/oneflow:0.9.1-centos7.6-dtk-22.10.1-py39-latest
+    # Replace <Your Image ID> with the ID of the docker image pulled above
+    docker run --shm-size 16g --network=host --name=gpt_oneflow --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/GPT:/home/GPT -it <Your Image ID> bash
+    cd /home/GPT
    pip3 install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple
    pip3 install pybind11 -i https://mirrors.aliyun.com/pypi/simple
    pip3 install -e . -i https://mirrors.aliyun.com/pypi/simple
-### Training
+## GPT2 Pre-training
 This pre-training script runs on 1 node with 4 DCU-Z100-16G cards.
@@ -140,6 +154,16 @@ train.dist.pipeline_parallel_size = 4
 | Number of cards | Distributed tool | Convergence |
 | :------: | :------: |:------: |
 | 96 | Libai-main | total_loss: 5.56/1299 iters |
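+## Application Scenarios
+### Algorithm Category
+`Natural language processing`
+### Popular Application Industries
+`NLP, intelligent chat assistants, scientific research`
 ## Source Repository and Issue Feedback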
 - https://developer.hpccube.com/codes/modelzoo/GPT
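
Note on the masked self-attention described in the README changes above: the idea can be illustrated with a small, dependency-free sketch. This is not code from the GPT repository or from OneFlow-Libai; the function name `causal_attention_weights` and the example scores are hypothetical, and a real model would apply the mask to batched tensors inside each attention head. The sketch only shows how masking the positions to the right changes the softmax weights.

```python
import math


def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with a causal mask.

    scores[i][j] is the raw attention score of query position i for key
    position j. Entries with j > i (tokens to the right) are masked out.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # Keep only positions 0..i; positions to the right get weight 0.
        visible = scores[i][: i + 1]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        weights.append([e / total for e in exps] + [0.0] * (n - i - 1))
    return weights


# Hypothetical raw scores for a 3-token sequence.
scores = [
    [0.0, 5.0, 5.0],
    [1.0, 1.0, 9.0],
    [2.0, 2.0, 2.0],
]
weights = causal_attention_weights(scores)
```

Even though the first two rows have large scores for later tokens, those entries receive zero weight, so the first position attends only to itself and the second only to positions 0 and 1.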

--- a/gpt2模型结构.png
+++ b/gpt2模型结构.png
--- a/gpt2算法原理.png
+++ b/gpt2算法原理.png
--- a/model.properties
+++ b/model.properties
+# Unique model identifier
+modelCode=62
 # Model name
 modelName=GPT2_OneFlow
 # Model description