# bert_large_squad_onnx

## Paper

https://arxiv.org/pdf/1810.04805.pdf

## Model structure

The core of bert_large_squad is the Transformer, whose structure is shown below:

![image](https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx/-/raw/main/resources/transformer.png)

## Algorithm overview

The main parameters of the bert_large_squad model are 24 Transformer layers, a hidden size of 1024, and 16 self-attention heads. The overall flow can be sketched as follows:

![image](https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx/-/raw/main/resources/squad.png)

## Dataset

No suitable Chinese dataset is currently available; the pretrained model linked below is fine-tuned on the English SQuAD v1.1 dataset.

## Environment setup

A Docker image for inference can be pulled from [光源](https://www.sourcefind.cn/#/service-details), and the onnxruntime package can be downloaded from the [光合开发者社区](https://cancon.hpccube.com:65024/4/main/). The recommended image for bert_large_squad_onnx is:

```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:ort1.14.0_migraphx3.0.0-dtk22.10.1
```

Download model.onnx from [huggingface](https://huggingface.co/ctuning/mlperf-inference-bert-onnx-fp32-squad-v1.1) into the current directory, then run the fp16 conversion (an illustrative sketch of such a script is given in the appendix at the end of this README):

```
python3 fp16-convert.py
```

## Inference

Run inference (illustrative sketches of the inference script and timing loop are also given in the appendix):

```
python3 main.py
```

## Result

![image](https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx/-/raw/main/resources/bert_result.png)

### Performance data

fp32

| loop | time(ms) |
| :------: | :------: |
| 1 | 0.09298863005824387 |
| 2 | 0.04267867305316031 |
| 3 | 0.04294574190862477 |
| 4 | 0.042622152948752046 |
| 5 | 0.042897791834548116 |
| 6 | 0.04309680196456611 |
| 7 | 0.04240077408030629 |
| 8 | 0.042515473905950785 |
| 9 | 0.0424974428024143 |
| 10 | 0.04259936395101249 |

fp16

| loop | time(ms) |
| :------: | :------: |
| 1 | 0.059390615904703736 |
| 2 | 0.04876187210902572 |
| 3 | 0.04870052193291485 |
| 4 | 0.04873379203490913 |
| 5 | 0.04842417314648628 |
| 6 | 0.04876326210796833 |
| 7 | 0.04846481396816671 |
| 8 | 0.04872900294139981 |
| 9 | 0.048555332934483886 |
| 10 | 0.048343464033678174 |

## Application scenarios

### Algorithm category

NLP

### Key application industries

Question answering systems

## Source repository and issue reporting

https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx

## References

https://github.com/google-research/bert
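
## Appendix: illustrative sketches

### Parameter count for the configuration above

As a sanity check on the numbers in the algorithm overview (24 layers, hidden size 1024, 16 heads), the standard BERT formulas give roughly 335M parameters, matching the commonly quoted "~340M" for BERT-large. The vocabulary size (30522) and maximum position (512) below are the usual BERT defaults, assumed here rather than read from this repo's model.onnx:

```python
# Back-of-the-envelope parameter count for BERT-large
# (24 layers, hidden size 1024). Vocab size 30522 and max position 512
# are standard BERT values, assumed rather than taken from model.onnx.
L, H, V, P = 24, 1024, 30522, 512

embeddings = V * H + P * H + 2 * H + 2 * H  # token/position/segment + LayerNorm
attention  = 4 * (H * H + H)                # Q, K, V, output projections + biases
ffn        = H * 4 * H + 4 * H + 4 * H * H + H  # H -> 4H -> H with biases
layernorms = 2 * 2 * H                      # two LayerNorms per layer
per_layer  = attention + ffn + layernorms
pooler     = H * H + H

total = embeddings + L * per_layer + pooler
print(f"{total / 1e6:.1f}M parameters")     # prints ~335.1M
```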
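
### What fp16-convert.py might do

The conversion script itself is not reproduced in this README. A minimal sketch using onnxconverter-common's `convert_float_to_float16` follows; the method is an assumption, and the repo's actual fp16-convert.py may differ:

```python
# fp16-convert.py -- minimal sketch of an fp32 -> fp16 ONNX conversion.
# Assumption: uses the onnxconverter-common package; the script shipped
# with the repo may implement the conversion differently.
import onnx
from onnxconverter_common import float16

model = onnx.load("model.onnx")  # fp32 model downloaded from huggingface

# keep_io_types=True leaves graph inputs/outputs at their original types,
# so the float outputs (the SQuAD logits) stay fp32 for downstream code.
model_fp16 = float16.convert_float_to_float16(model, keep_io_types=True)

onnx.save(model_fp16, "model_fp16.onnx")
```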
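
### What main.py might do

Likewise, a minimal sketch of running the model with ONNX Runtime. Input names, shapes, and dtypes are read from the model rather than hard-coded, since they depend on how model.onnx was exported; the 384 sequence length is an assumption taken from the usual MLPerf BERT SQuAD setup:

```python
# main.py -- minimal sketch of ONNX Runtime inference on the SQuAD model.
import numpy as np
import onnxruntime as ort

SEQ_LEN = 384  # assumption: the standard MLPerf BERT SQuAD sequence length

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Build dummy feeds for every graph input. BERT-style models take rank-2
# int64 tensors (token ids, attention mask, segment ids) of [batch, seq];
# any dynamic dimension is resolved to batch=1, seq=SEQ_LEN.
feeds = {}
for inp in sess.get_inputs():
    shape = [d if isinstance(d, int) else default
             for d, default in zip(inp.shape, (1, SEQ_LEN))]
    dtype = np.int64 if "int64" in inp.type else np.float32
    feeds[inp.name] = np.zeros(shape, dtype=dtype)

outputs = sess.run(None, feeds)  # SQuAD start/end logits
for out, meta in zip(outputs, sess.get_outputs()):
    print(meta.name, out.shape)
```

On the recommended DCU image, a MIGraphX or ROCm execution provider may be available in place of `CPUExecutionProvider`, which is used here only as the portable fallback.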
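
### Reproducing the per-loop timings

The performance tables record one time per loop over 10 loops, with the first loop noticeably slower (warm-up). Continuing from the `sess` and `feeds` of the previous sketch, a simple timing loop in that shape could be:

```python
import time

# 10 timed runs, as in the performance tables above; the first iteration
# is typically slower due to lazy initialisation in the execution provider.
for loop in range(1, 11):
    start = time.perf_counter()
    sess.run(None, feeds)
    print(f"| {loop} | {time.perf_counter() - start} |")
```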