# BLOOM

## Paper

`BLOOM: A 176B-Parameter Open-Access Multilingual Language Model`

- [https://arxiv.org/abs/2211.05100](https://arxiv.org/abs/2211.05100)

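## Model Architecture

BLOOM is an open-access large language model with 176B parameters that supports 59 languages: 46 natural languages and 13 programming languages. It is a modified version of Megatron-LM GPT2 and uses a decoder-only architecture, layer normalization applied to the word-embedding layer, and ALiBi (Attention with Linear Biases) positional encodings together with GeLU activation functions. Its training corpus consists of 1.5TB of preprocessed text, converted into 350B unique tokens. BigScience has released BLOOM on Hugging Face in several parameter sizes and versions.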

<img src="http://developer.sourcefind.cn/codes/modelzoo/bloom_oneflow/-/raw/main/bloom%E6%A8%A1%E5%9E%8B%E7%BB%93%E6%9E%84.png" alt="bloom模型结构.png" style="zoom:50%;" />

## Algorithm Principles

<img src="http://developer.sourcefind.cn/codes/modelzoo/bloom_oneflow/-/raw/main/bloom%E7%AE%97%E6%B3%95%E5%8E%9F%E7%90%86.png" alt="bloom算法原理.png" style="zoom:50%;" />

When a model grows so large that its parameters no longer fit on a single GPU, the need for convenient distributed training and inference follows, and the industry has released tools to meet it.

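LiBai, a model library built on OneFlow, lowers the barrier to distributed execution as far as possible: users do not need to decide how the model is spread across devices, and can switch between distributed strategies by changing a few configuration values. It is also built for strong acceleration performance.

BLOOM built with LiBai can run model parallel + pipeline parallel inference with little effort, which solves the problem of a large model not fitting on a single card.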

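### Natural advantages for distributed inference

A model's parameters are simply a collection of tensors, that is, matrices, and a large model's parameters are large matrices. A parallel strategy splits each large matrix into several smaller ones and assigns them to different cards or devices. The basic LinearLayer is implemented in LiBai as follows: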

```python
class Linear1D(nn.Module):
    def __init__(self, in_features, out_features, parallel="data", layer_idx=0, ...):
        super().__init__()

        if parallel == "col":
            weight_sbp = dist.get_nd_sbp([flow.sbp.broadcast, flow.sbp.split(0)])
        elif parallel == "row":
            weight_sbp = dist.get_nd_sbp([flow.sbp.broadcast, flow.sbp.split(1)])
        elif parallel == "data":
            weight_sbp = dist.get_nd_sbp([flow.sbp.broadcast, flow.sbp.broadcast])
        else:
            raise KeyError(f"{parallel} is not supported! Only support ('data', 'row' and 'col')")

        self.weight = flow.nn.Parameter(
            flow.empty(
                (out_features, in_features),
                dtype=flow.float32,
                placement=dist.get_layer_placement(layer_idx),  # for pipeline parallelism placement
                sbp=weight_sbp,
            )
        )
        init_method(self.weight)
        ...
    
    def forward(self, x):
        ...
```

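Here the user chooses how to split the Linear layer's weight matrix and how to split the data matrix. SBP in OneFlow controls whether a matrix is split along one dimension, along the other, or by some other scheme (model parallelism, data parallelism), while Placement controls which card the LinearLayer is placed on (pipeline parallelism).

Because of how the layers in LiBai are designed, and because every OneFlow tensor carries SBP and Placement attributes, a model built by the user can use data parallelism, model parallelism, and pipeline parallelism with very little effort.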
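
To make the two weight splits concrete, the following is a minimal pure-Python sketch that does not use OneFlow. It splits a weight of shape `(out_features, in_features)` along dimension 0 and along dimension 1, mirroring `flow.sbp.split(0)` and `flow.sbp.split(1)` in the `Linear1D` excerpt above, and checks that both reproduce the unsplit result. The helper names `matvec`, `split_dim0`, and `split_dim1` are hypothetical and exist only for this illustration; the final addition stands in for the reduction a real framework would perform across devices.

```python
# Pure-Python illustration of the two weight-splitting schemes; OneFlow is not required.
# The weight has shape (out_features, in_features), as in the Linear1D excerpt.

def matvec(weight, x):
    """Compute y = W @ x for a list-of-rows matrix and a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def split_dim0(weight, parts):
    """split(0): each device holds a contiguous block of output rows."""
    step = len(weight) // parts
    return [weight[i * step:(i + 1) * step] for i in range(parts)]

def split_dim1(weight, parts):
    """split(1): each device holds a contiguous block of input columns."""
    step = len(weight[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in weight] for i in range(parts)]

weight = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
x = [1, 0, 2, 1]
reference = matvec(weight, x)

# parallel="col" (split(0)): every device sees the full input and produces a
# slice of the output; the slices are concatenated.
col_out = []
for shard in split_dim0(weight, 2):
    col_out.extend(matvec(shard, x))

# parallel="row" (split(1)): every device sees a slice of the input and produces
# a partial sum of the full output; the partial results are added together.
step = len(x) // 2
row_out = [0] * len(weight)
for i, shard in enumerate(split_dim1(weight, 2)):
    partial = matvec(shard, x[i * step:(i + 1) * step])
    row_out = [a + b for a, b in zip(row_out, partial)]

print(reference)  # [11, 27, 43, 59]
print(col_out == reference, row_out == reference)
```

Splitting along dimension 0 needs only a concatenation afterwards, while splitting along dimension 1 needs a sum across devices, which is why the two are usually paired in consecutive layers.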

## Environment Setup

### Docker

A Docker image for training and inference can be pulled from [光源](https://www.sourcefind.cn/#/service-details): `image.sourcefind.cn:5000/dcu/admin/base/oneflow:0.9.1-centos7.6-dtk-22.10.1-py39-latest`. The libraries this project needs on DCU cards, such as torch, can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community.

    docker pull image.sourcefind.cn:5000/dcu/admin/base/oneflow:0.9.1-centos7.6-dtk-22.10.1-py39-latest
    # Replace <Your Image ID> with the ID of the Docker image pulled above
    docker run --shm-size 16g --network=host --name=bloom_oneflow --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/bloom_oneflow:/home/bloom_oneflow -it <Your Image ID> bash
    cd /home/bloom_oneflow
    pip3 install transformers==4.28.1
    pip3 install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple
    pip3 install pybind11 -i https://mirrors.aliyun.com/pypi/simple
    pip3 install -e . -i https://mirrors.aliyun.com/pypi/simple
    pip3 install torch-1.10.0a0+git2040069.dtk2210-cp39-cp39-manylinux2014_x86_64.whl

## Dataset

The input data is generated in the script below.

## Weights

Prepare the model weights first: https://huggingface.co/bigscience/bloomz-7b1/tree/main

Fast download link for the model weights on SCNet: https://www.scnet.cn/ui/aihub/models/yiziqinx/Bloomz-7B

### File layout of bloomz-7b1

```
$ tree data
path/to/bloomz-7b1
├── tokenizer_config.json
├── tokenizer.json
├── special_tokens_map.json
├── config.json
└── pytorch_model.bin
```
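
Before launching inference it can help to confirm that the weight directory is complete. The following is a small sketch using only the Python standard library; the file names come from the listing above, while the function name `missing_weight_files` and the directory name passed to it are assumptions made for illustration and are not part of LiBai.

```python
from pathlib import Path

# File names taken from the directory listing above.
REQUIRED_FILES = (
    "tokenizer_config.json",
    "tokenizer.json",
    "special_tokens_map.json",
    "config.json",
    "pytorch_model.bin",
)

def missing_weight_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]

if __name__ == "__main__":
    # "bloomz-7b1" is an assumed directory name; adjust it to your own path.
    missing = missing_weight_files("bloomz-7b1")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```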

## Inference

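Inference uses 1 node with 4 DCU-Z100-16G cards and a parallel configuration of tp=4, pp=1 (tensor parallel size 4, pipeline parallel size 1).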
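
The arithmetic behind such a configuration can be sketched as follows. The sketch assumes, as is usual for this style of combined parallelism, that the data, tensor, and pipeline parallel sizes must multiply to the number of launched processes, and it assumes an even split of layers across pipeline stages. Both helper functions are hypothetical illustrations rather than LiBai APIs, and the way LiBai itself assigns layers to stages may differ. The values 4 processes and 30 layers are the ones used by the launch command and `demo.py` in this README.

```python
def check_parallel_config(world_size, data_parallel_size,
                          tensor_parallel_size, pipeline_parallel_size):
    """Raise ValueError unless the three sizes multiply to the process count."""
    product = data_parallel_size * tensor_parallel_size * pipeline_parallel_size
    if product != world_size:
        raise ValueError(
            f"dp*tp*pp = {product}, but {world_size} processes were launched"
        )
    return product

def layers_per_stage(num_layers, pipeline_parallel_size):
    """Assumed even split of layers across pipeline stages (illustrative only)."""
    base, extra = divmod(num_layers, pipeline_parallel_size)
    return [base + (1 if stage < extra else 0)
            for stage in range(pipeline_parallel_size)]

# Values from this README: 4 processes, dp=1, tp=4, pp=1, 30 layers.
check_parallel_config(world_size=4, data_parallel_size=1,
                      tensor_parallel_size=4, pipeline_parallel_size=1)
print(layers_per_stage(30, 1))  # [30]: with pp=1 every layer is on one stage
print(layers_per_stage(30, 4))  # [8, 8, 7, 7]: a hypothetical pp=4 layout
```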

Place the model weights in the same directory as demo.py and run the following:

    cd projects/BLOOM
    # Before running, set `min_length=64` in configs/bloom_inference.py
    python3 -m oneflow.distributed.launch --nproc_per_node 4 demo.py

demo.py is as follows:

    # model parallel + pipeline parallel demo

    import time

    import oneflow as flow
    from omegaconf import DictConfig
    from transformers import BloomTokenizerFast

    from libai.utils import distributed as dist
    from projects.BLOOM.configs.bloom_inference import cfg
    from projects.BLOOM.modeling.bloom_model import BloomForCausalLM
    from projects.BLOOM.utils.model_loader import BlooMLoaderHuggerFace

    # 1 node with 4 cards: tensor parallel across all 4, no pipeline parallelism
    parallel_config = DictConfig(
        dict(
            data_parallel_size=1,
            tensor_parallel_size=4,
            pipeline_parallel_size=1,
            pipeline_num_layers=30,
        )
    )
    dist.setup_dist_util(parallel_config)

    # Tokenize the prompt with the tokenizer stored in the bloomz-7b1 directory
    tokenizer = BloomTokenizerFast.from_pretrained("bloomz-7b1")
    res = tokenizer("How to improve sleep quality?")
    inputs = {
        "input_ids": flow.tensor([res.input_ids]),
        "attention_mask": flow.tensor([res.attention_mask]),
    }

    # The input is broadcast to every device and placed with layer 0
    sbp = dist.get_nd_sbp([flow.sbp.broadcast, flow.sbp.broadcast])
    placement = dist.get_layer_placement(0)

    # Load the Hugging Face weights into the LiBai BLOOM model
    loader = BlooMLoaderHuggerFace(BloomForCausalLM, cfg, "bloomz-7b1")
    model = loader.load()

    # Time the generation call only
    start_t = time.time()
    outputs = model.generate(
        inputs=inputs["input_ids"].to_global(sbp=sbp, placement=placement), max_length=128
    )
    end_t = time.time()
    if dist.is_main_process():
        print('model.generate: %s seconds' % (end_t - start_t))

    res = tokenizer.decode(outputs[0])
    if dist.is_main_process():
        print(res)

## Result

```
>>>
How to improve sleep quality? keep your bedroom dark and quiet. Avoid electronics and bright lights. Keep your bedroom cool. Use a white noise machine. Use a humidifier. Use a diffuser. Use essential oils. Use a sleep aid. Try acupuncture. Try hypnotherapy. Try acupressure.</s>
```

## Application Scenarios

### Algorithm Category

`Natural language processing`

### Key Application Industries

`Healthcare, education, scientific research, finance`

## Source Repository and Issue Feedback

- https://developer.sourcefind.cn/codes/modelzoo/bloom_oneflow

## References
* https://github.com/Oneflow-Inc/libai
* https://huggingface.co/bigscience/bloomz