# bert_large_squad_onnx
## Paper
https://arxiv.org/pdf/1810.04805.pdf
## Model structure
The core of bert_large_squad is the transformer, whose structure is shown below:

![image](https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx/-/raw/main/resources/transformer.png)
## Algorithm
The main parameters of the bert_large_squad model are 24 transformer layers, a hidden size of 1024, and 16 self-attention heads. The figure below summarizes how it works:

![image](https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx/-/raw/main/resources/squad.png)
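
To make the parameters above concrete, the following sketch derives the width of each attention head and the parameter count of one encoder layer. The feed-forward width of 4096 (4 × hidden size) comes from the BERT paper, not from this README, so treat it as an assumption.

```python
# Figures stated above: 24 layers, hidden size 1024, 16 attention heads.
NUM_LAYERS = 24
HIDDEN = 1024
NUM_HEADS = 16
# Assumption (from the BERT paper, not this README): FFN width is 4 * hidden.
FFN = 4 * HIDDEN

head_dim = HIDDEN // NUM_HEADS


def linear_params(n_in, n_out):
    """Weights plus bias of one dense layer."""
    return n_in * n_out + n_out


def layer_norm_params(width):
    """Scale and shift vectors of one LayerNorm."""
    return 2 * width


# Q, K, V and output projections.
attention = 4 * linear_params(HIDDEN, HIDDEN)
# Two dense layers: hidden -> FFN -> hidden.
feed_forward = linear_params(HIDDEN, FFN) + linear_params(FFN, HIDDEN)
# Each encoder layer has two LayerNorms.
per_layer = attention + feed_forward + 2 * layer_norm_params(HIDDEN)
encoder_total = NUM_LAYERS * per_layer

print(f"head_dim={head_dim}, per_layer={per_layer:,}, encoder={encoder_total:,}")
```

Under these assumptions the encoder stack alone holds roughly 302 million parameters; the embeddings and the SQuAD output layer come on top of that.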
## Dataset
No suitable Chinese dataset is available yet.
## Environment setup
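The Docker image for inference can be pulled from [光源 (SourceFind)](https://www.sourcefind.cn/#/service-details), and the onnxruntime installation package can be downloaded from the [光合开发者社区 (Guanghe developer community)](https://cancon.hpccube.com:65024/4/main/). The recommended image for bert_large_squad_onnx is: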
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:ort1.14.0_migraphx3.0.0-dtk22.10.1
```
Download the model file model.onnx from [huggingface](https://huggingface.co/ctuning/mlperf-inference-bert-onnx-fp32-squad-v1.1) into the current directory.

Run the fp16 conversion:
```
python3 fp16-convert.py
```
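
The contents of fp16-convert.py are not shown in this README, and the sketch below does not reproduce it. It only illustrates, using the standard library, what casting an fp32 value to fp16 does numerically: small rounding errors, and overflow above the fp16 range. The helper names `to_fp16` and `fits_fp16` are hypothetical.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def fits_fp16(value):
    """Return True if the value can be packed as fp16 without overflowing."""
    try:
        struct.pack("<e", value)
        return True
    except OverflowError:
        return False


# Sample values chosen for illustration; they are not real model weights.
for w in (1.0, 0.1, 1e-3, 65504.0, 70000.0):
    if fits_fp16(w):
        print(f"{w!r:>10} -> {to_fp16(w)!r}")
    else:
        print(f"{w!r:>10} -> overflows fp16")
```

Values such as 0.1 lose a few digits of precision, which is why it is worth comparing fp16 outputs against the fp32 model after conversion.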
## Inference
```
python3 main.py
```
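
main.py is not reproduced here. As background, a SQuAD-style BERT model emits one start logit and one end logit per token, and the answer is usually taken as the valid span with the highest combined score. The sketch below shows that post-processing step in plain Python. The function name `best_span`, the `max_answer_len` parameter, and the toy logits are all hypothetical and are not taken from main.py.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end, score) of the highest-scoring valid span.

    A span is valid when start <= end and it covers at most
    max_answer_len tokens. The score is start_logit + end_logit.
    """
    best = None
    for start, s_logit in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = s_logit + end_logits[end]
            if best is None or score > best[2]:
                best = (start, end, score)
    return best


# Toy example: hypothetical tokens and logits, not real model output.
tokens = ["[CLS]", "who", "?", "[SEP]", "bert", "was", "made", "by", "google", "[SEP]"]
start_logits = [0.1, 0.0, 0.0, 0.0, 0.2, 0.1, 0.3, 0.5, 4.0, 0.0]
end_logits = [0.1, 0.0, 0.0, 0.0, 0.1, 0.2, 0.1, 0.3, 3.5, 0.2]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer, score)
```

The exhaustive search is quadratic in sequence length, which is acceptable for a sketch; production code typically restricts the search to the top few start and end candidates.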
## Results
![image](https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx/-/raw/main/resources/bert_result.png)
### Performance data

fp32
| loop | time(ms) |
| :------: | :------: |
| 1 | 0.09298863005824387 | 
| 2 | 0.04267867305316031 | 
| 3 | 0.04294574190862477 | 
| 4 | 0.042622152948752046 | 
| 5 | 0.042897791834548116 | 
| 6 | 0.04309680196456611 | 
| 7 | 0.04240077408030629 | 
| 8 | 0.042515473905950785 | 
| 9 | 0.0424974428024143 | 
| 10 | 0.04259936395101249 | 

fp16
| loop | time(ms) |
| :------: | :------: |
| 1 | 0.059390615904703736 | 
| 2 | 0.04876187210902572 | 
| 3 | 0.04870052193291485 | 
| 4 | 0.04873379203490913 | 
| 5 | 0.04842417314648628 | 
| 6 | 0.04876326210796833 | 
| 7 | 0.04846481396816671 | 
| 8 | 0.04872900294139981 | 
| 9 | 0.048555332934483886 | 
| 10 | 0.048343464033678174 | 
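
The first loop in each table is noticeably slower than the rest, which is consistent with one-time warm-up cost. A minimal sketch for summarizing the tables, with the values copied from above and rounded to six decimals, drops the first loop and averages the remainder. The helper name `steady_state_mean` is hypothetical.

```python
from statistics import mean

# Per-loop times copied from the tables above (rounded to 6 decimals),
# in the unit reported by the tables.
fp32 = [0.092989, 0.042679, 0.042946, 0.042622, 0.042898,
        0.043097, 0.042401, 0.042515, 0.042497, 0.042599]
fp16 = [0.059391, 0.048762, 0.048701, 0.048734, 0.048424,
        0.048763, 0.048465, 0.048729, 0.048555, 0.048343]


def steady_state_mean(times, warmup=1):
    """Average the timings after dropping the first `warmup` loops."""
    return mean(times[warmup:])


for name, times in (("fp32", fp32), ("fp16", fp16)):
    print(f"{name}: first loop {times[0]:.4f}, "
          f"steady-state mean {steady_state_mean(times):.4f}")
```

In these particular measurements the fp16 steady-state time is higher than the fp32 one, so the fp16 model should not be assumed to be faster on this setup without measuring.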

## Application scenarios
### Algorithm category
NLP
### Popular application industries
Question answering systems
## Source repository and issue feedback
https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx
## References
https://github.com/google-research/bert