# LLama_FT
## Model Introduction
LLaMA is a collection of foundation language models ranging from 7B to 65B parameters. Trained on trillions of tokens, these models demonstrate that it is possible to train state-of-the-art models using publicly available datasets exclusively, without relying on proprietary and inaccessible datasets.
## Paper
- [https://arxiv.org/pdf/2302.13971.pdf](https://arxiv.org/pdf/2302.13971.pdf)
## Model Architecture
The LLaMA network is based on the Transformer architecture and incorporates various improvements that were subsequently proposed and used in different models such as PaLM. The main differences from the original architecture are:
- Pre-normalization: to improve training stability, the input of each Transformer sub-layer is normalized instead of the output, using the RMSNorm normalization function.
- SwiGLU activation function [PaLM]: the ReLU non-linearity is replaced with the SwiGLU activation function to improve performance, using a hidden dimension of 2/3 · 4d instead of the 4d used in PaLM.
- Rotary embeddings: absolute positional embeddings are removed and rotary positional embeddings (RoPE) are added at every layer of the network.
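For illustration, below is a minimal PyTorch sketch of two of these building blocks, RMSNorm pre-normalization and a SwiGLU MLP. The class names, the example dimension of 4096, and the exact rounding of the hidden size are illustrative assumptions; this is not the FasterTransformer kernel implementation.
```python
# Minimal sketch of LLaMA's RMSNorm and SwiGLU blocks (illustrative only).
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        # Scale by the inverse root mean square of the activations
        # (no mean centering, unlike LayerNorm).
        inv_rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return self.weight * (x * inv_rms)

class SwiGLU(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # LLaMA uses a hidden size of 2/3 * 4d instead of PaLM's 4d.
        hidden_dim = int(2 * 4 * dim / 3)
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x):
        # SwiGLU: silu(x @ W_gate) elementwise-multiplied with x @ W_up.
        return self.w_down(nn.functional.silu(self.w_gate(x)) * self.w_up(x))

x = torch.randn(1, 7, 4096)          # (batch, seq_len, dim)
y = SwiGLU(4096)(RMSNorm(4096)(x))   # pre-norm, then the MLP
print(y.shape)                       # torch.Size([1, 7, 4096])
```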
## Environment Setup
A docker image for inference is available from [SourceFind (光源)](https://www.sourcefind.cn/#/service-details):
* Inference image: docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:fastertransformer-dtk23.04-latest
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:fastertransformer-dtk23.04-latest
docker run -it --name llama --shm-size=32G --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video image.sourcefind.cn:5000/dcu/admin/base/custom:fastertransformer-dtk23.04-latest /bin/bash
```
Image dependencies:
* DTK driver: dtk23.04
* PyTorch: 1.10
* Python: 3.8

Activate the environment inside the image:
`source /opt/dtk-23.04/env.sh`
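A quick way to confirm the container environment is usable is the snippet below (a hedged sketch; on DCU the ROCm-based PyTorch build still reports devices through the `torch.cuda` API):
```python
# Sanity check inside the container: PyTorch version and visible devices.
import torch

print(torch.__version__)            # expect 1.10.x per the image dependencies
print(torch.cuda.is_available())    # True if the DCU is visible to PyTorch
print(torch.cuda.device_count())    # number of accelerator cards
```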
## Dataset
Training data: CCNet [67%], C4 [15%], GitHub [4.5%], Wikipedia [4.5%], Books [4.5%], ArXiv [2.5%], Stack Exchange [2%]. The Wikipedia and Books subsets cover the following languages: bg, ca, cs, da, de, en, es, fr, hr, hu, it, nl, pl, pt, ro, ru, sl, sr, sv, uk. Evaluation datasets: BoolQ, PIQA, SIQA, HellaSwag, WinoGrande, ARC, OpenBookQA, NaturalQuestions, TriviaQA, RACE, MMLU, BIG-bench hard, GSM8k, RealToxicityPrompts, WinoGender, CrowS-Pairs.
## Inference
### Compilation
```
make -j12
```
### Model Download
- [llama 7B](https://huggingface.co/decapoda-research/llama-7b-hf)
- [llama 13B](https://huggingface.co/decapoda-research/llama-13b-hf)
- [llama 30B](https://huggingface.co/decapoda-research/llama-30b-hf)
- [llama 65B](https://huggingface.co/decapoda-research/llama-65b-hf)
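If you prefer to script the download instead of following the links above, a minimal sketch with `huggingface_hub` is shown below; the `local_dir` target path is an assumption.
```python
# Download one of the checkpoints listed above via huggingface_hub.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="decapoda-research/llama-7b-hf",  # swap for 13b/30b/65b as needed
    local_dir="./llama-7b-hf",                # hypothetical target directory
)
```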
### Model Conversion
```
layernorm_eps=rms_norm_eps
vocab_size=vocab_size
```
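To illustrate the parameter mapping above (`layernorm_eps` taken from `rms_norm_eps`, `vocab_size` from `vocab_size`), here is a hedged Python sketch that reads the Hugging Face `config.json` and writes the corresponding FasterTransformer-style entries; the file paths and output format are assumptions, not the repository's actual conversion script.
```python
# Illustrative mapping from the Hugging Face config.json to FT-style config
# keys. Only the two mappings shown above come from this README; the paths
# and output format here are hypothetical.
import json

with open("./llama-7b-hf/config.json") as f:   # hypothetical checkpoint path
    hf = json.load(f)

ft_config = {
    "layernorm_eps": hf["rms_norm_eps"],  # RMSNorm epsilon
    "vocab_size": hf["vocab_size"],       # tokenizer vocabulary size
}

with open("config.ini", "w") as f:             # hypothetical output file
    for key, value in ft_config.items():
        f.write(f"{key}={value}\n")
```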
## Accuracy
Test prompt: "I believe the meaning of life is" (token ids: 306, 4658, 278, 6593, 310, 2834, 338). Accelerator used: 1× DCU-Z100L-32G.
Test configuration:

| Data type | Batch size | Temperature | Input length | Output length |
| :------: | :------: | :------: | :------: | :------: |
| fp16 | 1 | 0 | 7 | 256 |

Output:
```
I believe the meaning of life is to live it to the fullest. I believe that we are all here for a reason and that we are all here to help each other. I believe that we are all here to learn and grow and that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here to help each other learn and grow. I believe that we are all here
```
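For cross-checking, the sketch below shows how the test setup above maps onto a Hugging Face generation call (greedy decoding, since temperature 0 corresponds to `do_sample=False`). This is a hedged reference check with `transformers`, not the FasterTransformer runtime itself; the decapoda-research checkpoints may need tokenizer-config fixes depending on your `transformers` version.
```python
# Reproduce the test prompt and greedy 256-token generation with transformers.
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")

inputs = tokenizer("I believe the meaning of life is", return_tensors="pt")
print(inputs.input_ids)  # expect ... 306, 4658, 278, 6593, 310, 2834, 338

outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```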
## Application Scenarios
### Algorithm Category
NLP
### Key Application Industries
Finance, scientific research, education
## Source Repository and Issue Feedback
* https://developer.hpccube.com/codes/modelzoo/llama_ft
Model metadata (from the repository's ModelZoo configuration file):
# Unique model identifier
modelCode=358
# Model name
modelName=llama_fastertransformer
# Model description
modelDescription=LLaMA is a collection of foundation language models ranging from 7B to 65B parameters, trained on trillions of tokens, demonstrating that state-of-the-art models can be trained exclusively on publicly available datasets.
# Application scenarios