Commit 1d300d4e authored by chenxj

update README

parent 127d4fe2
-# bert_large_squad_onnx
+# bert_large_squad_onnxruntime
## Paper
-https://arxiv.org/pdf/1810.04805.pdf
+[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/pdf/1810.04805.pdf)
## Model architecture
The core of bert_large_squad is the transformer. The transformer structure is as follows:
@@ -15,11 +15,13 @@ The main parameters of the bert_large_squad model are: 24 transformer layers, 1024 hidden s
Inference Docker images can be pulled from [光源](https://www.sourcefind.cn/#/service-details), and the onnxruntime installation package can be downloaded from the [光合开发者社区](https://cancon.hpccube.com:65024/4/main/) developer community. The recommended image for bert_large_squad_onnx is:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:ort1.14.0_migraphx3.0.0-dtk22.10.1
+docker run -d -t --privileged --device=/dev/kfd --device=/dev/dri/ --network=host --group-add video --name bert-test image.sourcefind.cn:5000/dcu/admin/base/custom:ort1.14.0_migraphx3.0.0-dtk22.10.1
```
Download the model file model.onnx from [huggingface](https://huggingface.co/ctuning/mlperf-inference-bert-onnx-fp32-squad-v1.1) into the current directory.
Then run the fp16 conversion:
```
+pip3 install onnxmltools
python3 fp16-convert.py
```
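The contents of fp16-convert.py are not shown in this diff. As a rough sketch of what an fp16 conversion with onnxmltools can look like, the snippet below loads model.onnx, converts its float32 tensors to float16, and saves the result. The function names and the `_fp16` output file name are hypothetical and not taken from the repository. Only `load_model`, `save_model`, and `convert_float_to_float16` come from onnxmltools.

```python
from pathlib import Path


def fp16_output_path(src: str) -> str:
    """Build a hypothetical output name, e.g. model.onnx -> model_fp16.onnx."""
    p = Path(src)
    return str(p.with_name(p.stem + "_fp16" + p.suffix))


def convert_to_fp16(src: str = "model.onnx", dst: str = "") -> str:
    """Convert an ONNX model's float32 tensors to float16 and save it."""
    # Imported lazily so the helper above works without onnxmltools installed.
    from onnxmltools.utils import load_model, save_model
    from onnxmltools.utils.float16_converter import convert_float_to_float16

    model = load_model(src)
    model_fp16 = convert_float_to_float16(model)
    dst = dst or fp16_output_path(src)
    save_model(model_fp16, dst)
    return dst
```

Calling `convert_to_fp16()` in the directory that holds model.onnx would write model_fp16.onnx next to it. The actual script may use a different output name or extra conversion options.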
## Inference
@@ -28,36 +30,8 @@ python3 main.py
```
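How main.py measures each loop is not visible in this diff. One possible way to collect per-loop timings is sketched below, assuming the inference call is wrapped in a zero-argument callable. The names `time_loops` and `run_once` are hypothetical.

```python
import time
from typing import Callable, List


def time_loops(run_once: Callable[[], object], loops: int = 10) -> List[float]:
    """Call run_once `loops` times and return each call's duration in milliseconds."""
    durations_ms = []
    for _ in range(loops):
        start = time.perf_counter()
        run_once()
        # perf_counter returns seconds; convert the difference to milliseconds.
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms
```

With onnxruntime, `run_once` could be something like `lambda: session.run(None, feeds)`, where `session` and `feeds` are an already prepared `InferenceSession` and input dictionary (both assumed here).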
## result
![image](https://developer.hpccube.com/codes/modelzoo/bert_large_squad_onnx/-/raw/main/resources/bert_result.png)
-### Performance data
+### Performance and accuracy data
+Not yet available.
-fp32
-| loop | time(ms) |
-| :------: | :------: |
-| 1 | 0.09298863005824387 |
-| 2 | 0.04267867305316031 |
-| 3 | 0.04294574190862477 |
-| 4 | 0.042622152948752046 |
-| 5 | 0.042897791834548116 |
-| 6 | 0.04309680196456611 |
-| 7 | 0.04240077408030629 |
-| 8 | 0.042515473905950785 |
-| 9 | 0.0424974428024143 |
-| 10 | 0.04259936395101249 |
-fp16
-| loop | time(ms) |
-| :------: | :------: |
-| 1 | 0.059390615904703736 |
-| 2 | 0.04876187210902572 |
-| 3 | 0.04870052193291485 |
-| 4 | 0.04873379203490913 |
-| 5 | 0.04842417314648628 |
-| 6 | 0.04876326210796833 |
-| 7 | 0.04846481396816671 |
-| 8 | 0.04872900294139981 |
-| 9 | 0.048555332934483886 |
-| 10 | 0.048343464033678174 |
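The fp32 and fp16 timings listed above can be summarized with a short script. The sketch below copies the values from the two tables and averages loops 2 to 10, treating the first loop as warm-up. That treatment is an assumption, not something the README states. Values are in whatever unit the tables report.

```python
from statistics import mean

# Per-loop times copied from the fp32 and fp16 tables.
fp32 = [
    0.09298863005824387, 0.04267867305316031, 0.04294574190862477,
    0.042622152948752046, 0.042897791834548116, 0.04309680196456611,
    0.04240077408030629, 0.042515473905950785, 0.0424974428024143,
    0.04259936395101249,
]
fp16 = [
    0.059390615904703736, 0.04876187210902572, 0.04870052193291485,
    0.04873379203490913, 0.04842417314648628, 0.04876326210796833,
    0.04846481396816671, 0.04872900294139981, 0.048555332934483886,
    0.048343464033678174,
]


def steady_state_mean(times, warmup=1):
    """Average the timings after dropping the first `warmup` loops."""
    return mean(times[warmup:])


fp32_mean = steady_state_mean(fp32)
fp16_mean = steady_state_mean(fp16)
ratio = fp16_mean / fp32_mean

print(f"fp32 steady-state mean: {fp32_mean:.5f}")
print(f"fp16 steady-state mean: {fp16_mean:.5f}")
print(f"fp16 / fp32 ratio:      {ratio:.3f}")
```

On these numbers the fp16 run is somewhat slower per loop than the fp32 run once the first loop is excluded.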
## Application scenarios
### Algorithm category
nlp
......