README.md 4.24 KB
Newer Older
dcuai's avatar
dcuai committed
1
# Gemma
chenzk's avatar
v1.0  
chenzk committed
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
谷歌发布的号称“全球性能最强大、轻量级”的新一代开源2B小模型Gemma,打响小模型战争。
## 论文
`未发表论文`

## 模型结构
[`Gemma`](./gemma/model.py)基于原始transformer decoder结构,2B模型使用了multi-query attention (with 𝑛𝑢𝑚_𝑘𝑣_ℎ𝑒𝑎𝑑𝑠 = 1),改进细节请见代码:

1、RoPE Embeddings: 不使用绝对位置编码,在每一层前加下RoPE Embedding,同时共享输入与输出层的embedding权重;

2、GeGLU Activations: ReLU的激活替换为GeGLU的激活;

3、Normalizer Location: 在transformer的每一层layer的前后都进行规一化,使用RMSNorm作为规一化层;
<div align=center>
    <img src="./doc/gemma.png"/>
</div>

## 算法原理
[`Gemma`](./gemma/model.py)算法主要将转换成向量的分词用qkv自相关和全连接层提取特征,然后利用全连接层输出监督训练结果,具体算法原理可参考下图原始transformer模型结构右侧decoder部分进行初步理解;Gemma在2T和6T个token的文本上进行预训练,数据主要来自英文网页、数学和代码,开发者使用Gemini的SentencePiece分词器的子集,词汇量为256k,高质量大数据产生巨大的小模型效果提升。
<div align=center>
    <img src="./doc/transformer.png"/>
</div>

## 环境配置

### Docker(方法一)
```
dcuai's avatar
dcuai committed
28
29
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
# <your IMAGE ID>为以上拉取的docker的镜像ID替换
chenzk's avatar
v1.0  
chenzk committed
30
31
32
33
34
35
docker run -it --shm-size=16G -v $PWD/gemma_pytorch:/home/gemma_pytorch -v /opt/hyhal:/opt/hyhal --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name gemma <your IMAGE ID> bash
cd home/gemma_pytorch
pip install -r requirements.txt # requirements.txt
```
### Dockerfile(方法二)
```
chenzk's avatar
v1.0.1  
chenzk committed
36
cd gemma_pytorch/docker
chenzk's avatar
v1.0  
chenzk committed
37
38
39
40
41
42
docker build --no-cache -t gemma:latest .
docker run --shm-size=16G --name gemma -v /opt/hyhal:/opt/hyhal --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video -v $PWD/../../gemma_pytorch:/home/gemma_pytorch -it gemma bash
# 若遇到Dockerfile启动的方式安装环境需要长时间等待,可注释掉里面的pip安装,启动容器后再安装python库:pip install -r requirements.txt。
```
### Anaconda(方法三)
1、关于本项目DCU显卡所需的特殊深度学习库可从光合开发者社区下载安装:
chenzk's avatar
chenzk committed
43
- https://developer.sourcefind.cn/tool/
chenzk's avatar
v1.0  
chenzk committed
44
```
dcuai's avatar
dcuai committed
45
46
DTK驱动:dtk24.04.1
python:python3.10
chenzk's avatar
v1.0  
chenzk committed
47
48
49
50
51
52
53
54
55
torch:2.1.0
torchvision:0.16.0
triton:2.1.0
```

`Tips:以上dtk驱动、python、torch等DCU相关工具版本需要严格一一对应。`

2、其它非特殊库参照requirements.txt安装
```
chenzk's avatar
v1.0.2  
chenzk committed
56
pip install -r requirements.txt # requirements.txt
chenzk's avatar
v1.0  
chenzk committed
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
```

## 数据集


## 训练
官方github未开源微调代码,如有需求请进入以下网站申请账户获取:
 * [Gemma on Google AI](https://ai.google.dev/gemma)
 * [Gemma on Kaggle](https://www.kaggle.com/models/google/gemma)
 * [Gemma on Vertex AI Model Garden](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335)

微调所需的特殊深度学习库可从光合开发者社区下载安装。

更多资料可参考源项目的[`README_origin`](./README_origin.md)


## 推理
### 单机单卡
mashun1's avatar
update  
mashun1 committed
75
76
推理权重采用`gemma-2b-it-pytorch`

chenzk's avatar
v1.0  
chenzk committed
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96

```
sh infer.sh # 采用官方默认权重推理,去除--device=cuda则为CPU推理
```

## result
```
#PROMPT: 
The meaning of life is
#RESULT: 
a question that has been pondered by philosophers, theologians, and laypeople alike for centuries. There is no single, universally accepted answer, but there are many different perspectives and beliefs that attempt to provide meaning to life.
```
### 精度
DCU Z100L精度与英伟达v100一致。

## 应用场景
### 算法类别
`对话问答`
### 热点应用行业
`制造,广媒,金融,能源,医疗,家居,教育`
mashun1's avatar
update  
mashun1 committed
97
98
99
100
101

## 预训练权重

请下载后放入目录gemma-2b-pytorch下面。

chenzk's avatar
chenzk committed
102
[huggingface](https://huggingface.co/google/gemma-2b-it-pytorch)
mashun1's avatar
update  
mashun1 committed
103
104


chenzk's avatar
v1.0  
chenzk committed
105
## 源码仓库及问题反馈
chenzk's avatar
chenzk committed
106
- http://developer.sourcefind.cn/codes/modelzoo/gemma_pytorch.git
chenzk's avatar
v1.0  
chenzk committed
107
108
109
110
## 参考资料
- https://github.com/google/gemma_pytorch.git
- https://hf-mirror.com/ #Huggingface镜像官网下载教程
- https://hf-mirror.com/datasets #Huggingface镜像数据地址