---
hide:
- toc
---
# Text Summarization using LLAMA3_1-405b
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
{{ mlperf_inference_implementation_readme (4, "llama3_1-405b-99", "reference", devices=["CPU","CUDA"]) }}
{{ mlperf_inference_implementation_readme (4, "llama3_1-405b-99.9", "reference", devices=["CPU","CUDA"]) }}
---
hide:
- toc
---
# Question Answering, Math, and Code Generation using Mixtral-8x7B
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
{{ mlperf_inference_implementation_readme (4, "mixtral-8x7b", "reference") }}
---
hide:
- toc
---
# Question Answering using Bert Large for IndySCC 2024
## Introduction
This guide is designed for the [IndySCC 2024](https://sc24.supercomputing.org/students/indyscc/) to walk participants through running and optimizing the [MLPerf Inference Benchmark](https://arxiv.org/abs/1911.02549) using [Bert Large](https://github.com/mlcommons/inference/tree/master/language/bert#supported-models) across various software and hardware configurations. The goal is to maximize system throughput (measured in samples per second) without compromising accuracy.
For a valid MLPerf inference submission, two types of runs are required: a performance run and an accuracy run. In this competition, we focus on the `Offline` scenario, where throughput is the key metric (higher is better). The official MLPerf inference benchmark for Bert Large requires processing a minimum of 10,833 samples in both performance and accuracy modes using the Squad v1.1 dataset.
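For orientation, a representative CM invocation for such a run is sketched below. Treat the tags and flags (framework, device, execution mode) as illustrative assumptions; the generated commands in the implementation tab further down are authoritative.
```bash
# Illustrative sketch only -- use the generated commands below for exact flags.
# A reference Bert Large run on CPU with onnxruntime in the Offline scenario;
# execution_mode=valid performs both the performance and the accuracy run.
cm run script --tags=run-mlperf,inference,_r4.1 \
   --model=bert-99 \
   --implementation=reference \
   --framework=onnxruntime \
   --category=edge \
   --scenario=Offline \
   --execution_mode=valid \
   --device=cpu \
   --quiet
```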
## Scoring
In the IndySCC 2024, your objective will be to run the reference (unoptimized) Python implementation of the MLPerf inference benchmark and complete a successful submission that passes the submission checker. Only one of the available frameworks needs to be submitted.
!!! info
Both MLPerf and CM automation are evolving projects.
If you encounter issues or have questions, please submit them [here](https://github.com/mlcommons/cm4mlops/issues).
## Artifacts to submit to the SCC committee
All the required files are automatically pushed to the GitHub repository once you complete the given commands. No additional files need to be submitted.
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
{{ mlperf_inference_implementation_readme (4, "bert-99", "reference", extra_variation_tags="", fixed_scenarios=["Offline"],categories=["Edge"], setup_tips=False) }}
## Submission Commands
### Generate actual submission tree
```bash
cm run script --tags=generate,inference,submission \
--clean \
--run-checker \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--run_style=test \
--quiet \
--submitter=<Team Name>
```
* Use `--hw_name="My system name"` to give a meaningful system name.
### Push Results to GitHub
Fork the repository at [https://github.com/mlcommons/cm4mlperf-inference](https://github.com/mlcommons/cm4mlperf-inference) and use its `mlperf-inference-results-scc24` branch.
Run the following command after **replacing `--repo_url` with your GitHub fork URL**.
```bash
cm run script --tags=push,github,mlperf,inference,submission \
--repo_url=https://github.com/<myfork>/cm4mlperf-inference \
--repo_branch=mlperf-inference-results-scc24 \
--commit_message="Results on system <HW Name>" \
--quiet
```
Once uploaded, open a pull request to the origin repository. A GitHub Action will run there, and once
it finishes you can see your submitted results at [https://docs.mlcommons.org/cm4mlperf-inference](https://docs.mlcommons.org/cm4mlperf-inference).
---
hide:
- toc
---
# Medical Imaging using 3d-unet (KiTS 2019 kidney tumor segmentation task)
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
{{ mlperf_inference_implementation_readme (4, "3d-unet-99", "reference") }}
{{ mlperf_inference_implementation_readme (4, "3d-unet-99.9", "reference") }}
=== "Nvidia"
## Nvidia MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "3d-unet-99", "nvidia") }}
{{ mlperf_inference_implementation_readme (4, "3d-unet-99.9", "nvidia") }}
=== "Intel"
## Intel MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "3d-unet-99", "intel") }}
{{ mlperf_inference_implementation_readme (4, "3d-unet-99.9", "intel") }}
---
hide:
- toc
---
# Medical Imaging using 3d-unet (KiTS 2019 kidney tumor segmentation task)
## Dataset
The benchmark implementation run command will automatically download the validation and calibration datasets and do the necessary preprocessing. If you want to download only the datasets, you can use the commands below.
=== "Validation"
The 3d-unet validation run uses the KiTS19 dataset from the [KiTS 2019](https://kits19.grand-challenge.org/) kidney tumor segmentation task.
### Get Validation Dataset (Original)
```
cm run script --tags=get,dataset,kits19,_validation -j
```
### Get Validation Dataset (Preprocessed)
```
cm run script --tags=get,dataset,kits19,preprocessed -j
```
## Model
The benchmark implementation run command will automatically download the required model and do the necessary conversions. If you want to download only the official model, you can use the commands below.
Get the Official MLPerf 3d-unet Model
=== "Pytorch"
### Pytorch
```
cm run script --tags=get,ml-model,3d-unet,_pytorch -j
```
=== "Onnx"
### Onnx
```
cm run script --tags=get,ml-model,3d-unet,_onnx -j
```
=== "Tensorflow"
### Tensorflow
```
cm run script --tags=get,ml-model,3d-unet,_tensorflow -j
```
---
hide:
- toc
---
# Object Detection using Retinanet
## Dataset
The benchmark implementation run command will automatically download the validation and calibration datasets and do the necessary preprocessing. If you want to download only the datasets, you can use the commands below.
=== "Validation"
The Retinanet validation run uses the OpenImages v6 MLPerf validation dataset resized to 800x800 and consisting of 24,576 images.
### Get Validation Dataset
```
cm run script --tags=get,dataset,openimages,_validation -j
```
=== "Calibration"
The Retinanet calibration dataset consists of 500 images selected from the OpenImages v6 dataset.
### Get Calibration Dataset
```
cm run script --tags=get,dataset,openimages,_calibration -j
```
## Model
The benchmark implementation run command will automatically download the required model and do the necessary conversions. If you want to download only the official model, you can use the commands below.
Get the Official MLPerf Retinanet Model
=== "Pytorch"
### Pytorch
```
cm run script --tags=get,ml-model,retinanet,_pytorch -j
```
=== "Onnx"
### Onnx
```
cm run script --tags=get,ml-model,retinanet,_onnx -j
```
---
hide:
- toc
---
# Object Detection using Retinanet
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
{{ mlperf_inference_implementation_readme (4, "retinanet", "reference") }}
=== "Nvidia"
## Nvidia MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "retinanet", "nvidia") }}
=== "Intel"
## Intel MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "retinanet", "intel") }}
=== "Qualcomm"
## Qualcomm AI100 MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "retinanet", "qualcomm") }}
=== "MLCommons-C++"
## MLPerf Modular Implementation in C++
{{ mlperf_inference_implementation_readme (4, "retinanet", "cpp") }}
---
hide:
- toc
---
# Recommendation using DLRM v2
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99", "reference") }}
{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99.9", "reference") }}
=== "Nvidia"
## Nvidia MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99", "nvidia") }}
{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99.9", "nvidia") }}
=== "Intel"
## Intel MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99", "intel") }}
{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99.9", "intel") }}
---
hide:
- toc
---
# Recommendation using DLRM v2
## Dataset
The benchmark implementation run command will automatically download the validation and calibration datasets and do the necessary preprocessing. If you want to download only the datasets, you can use the commands below.
=== "Validation"
The DLRM validation run uses the Criteo dataset (Day 23).
### Get Validation Dataset
```
cm run script --tags=get,dataset,criteo,_validation -j
```
## Model
The benchmark implementation run command will automatically download the required model and do the necessary conversions. If you want to download only the official model, you can use the commands below.
Get the Official MLPerf DLRM v2 Model
=== "Pytorch"
### Pytorch
```
cm run script --tags=get,ml-model,dlrm,_pytorch -j
```
---
hide:
- toc
---
# Text to Image using Stable Diffusion
## Dataset
The benchmark implementation run command will automatically download the validation and calibration datasets and do the necessary preprocessing. If you want to download only the datasets, you can use the commands below.
=== "Validation"
The Stable Diffusion validation run uses the COCO 2014 dataset.
### Get Validation Dataset
```
cm run script --tags=get,dataset,coco2014,_validation -j
```
## Model
The benchmark implementation run command will automatically download the required model and do the necessary conversions. If you want to download only the official model, you can use the commands below.
Get the Official MLPerf Stable Diffusion Model
=== "Pytorch"
### Pytorch
```
cm run script --tags=get,ml-model,sdxl,_pytorch -j
```
---
hide:
- toc
---
# Text-to-Image with Stable Diffusion for Student Cluster Competition 2024
## Introduction
This guide is designed for the [Student Cluster Competition 2024](https://sc24.supercomputing.org/students/student-cluster-competition/) to walk participants through running and optimizing the [MLPerf Inference Benchmark](https://arxiv.org/abs/1911.02549) using [Stable Diffusion XL 1.0](https://github.com/mlcommons/inference/tree/master/text_to_image#supported-models) across various software and hardware configurations. The goal is to maximize system throughput (measured in samples per second) without compromising accuracy. Since the model performs poorly on CPUs, it is essential to run it on GPUs.
For a valid MLPerf inference submission, two types of runs are required: a performance run and an accuracy run. In this competition, we focus on the `Offline` scenario, where throughput is the key metric (higher is better). The official MLPerf inference benchmark for Stable Diffusion XL requires processing a minimum of 5,000 samples in both performance and accuracy modes using the COCO 2014 dataset. However, for SCC we have reduced this requirement and offer two variants: the `scc-base` variant reduces the dataset to 50 samples, making it possible to complete both performance and accuracy runs in approximately 5-10 minutes, while the `scc-main` variant uses 500 samples and earns extra points compared to running only the base variant. Setting up for Nvidia GPUs may take 2-3 hours but can be done offline. Your final output will be a tarball (`mlperf_submission.tar.gz`) containing MLPerf-compatible results, which you will submit to the SCC organizers for scoring.
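As a preview, a baseline `scc-base` run with the reference implementation typically looks like the sketch below. The tags and flags here are illustrative assumptions; the generated commands under "Run Commands" further down are authoritative.
```bash
# Illustrative baseline run of the scc-base variant on a CUDA GPU
# (see the generated commands under "Run Commands" for the exact invocation):
cm run script --tags=run-mlperf,inference,_find-performance,_r4.1-dev,_short,_scc24-base \
   --model=sdxl \
   --implementation=reference \
   --framework=pytorch \
   --category=datacenter \
   --scenario=Offline \
   --execution_mode=test \
   --device=cuda \
   --precision=float16 \
   --quiet
```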
## Scoring
In the SCC, your first objective will be to run the `scc-base` variant of either the reference (unoptimized) Python implementation or a vendor-provided version (such as Nvidia's) of the MLPerf inference benchmark to secure a baseline score.
Once the initial run is successful, you'll have the opportunity to optimize the benchmark further by maximizing system utilization, applying quantization techniques, adjusting ML frameworks, experimenting with batch sizes, and more, all of which can earn you additional points.
Since vendor implementations of the MLPerf inference benchmark vary and are often limited to single-node benchmarking, teams will compete within their respective hardware categories (e.g., Nvidia GPUs, AMD GPUs). Points will be awarded based on the throughput achieved on your system.
Additionally, significant bonus points will be awarded if your team enhances an existing implementation, adds support for new hardware (such as an unsupported GPU), enables multi-node execution, or adds or extends scripts in the [cm4mlops repository](https://github.com/mlcommons/cm4mlops/tree/main/script) to support new devices, frameworks, implementations, etc. All improvements must be made publicly available under the Apache 2.0 license and submitted alongside your results to the SCC committee to earn these bonus points, contributing to the MLPerf community.
!!! info
Both MLPerf and CM automation are evolving projects.
If you encounter issues or have questions, please submit them [here](https://github.com/mlcommons/cm4mlops/issues).
## Artifacts to submit to the SCC committee
You will need to submit the following files:
* `mlperf_submission.run` - the CM commands used to run the MLPerf inference benchmark, saved to this file.
* `mlperf_submission.md` - a description of your platform and some highlights of the MLPerf benchmark execution.
* `<Team Name>` - the team name under which results are pushed to the GitHub repository.
## SCC interview
You are encouraged to highlight and explain the obtained MLPerf inference throughput on your system
and describe any improvements and extensions to this benchmark (such as adding a new hardware backend
or supporting multi-node execution) that are useful for the community and [MLCommons](https://mlcommons.org).
## Run Commands
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
{{ mlperf_inference_implementation_readme (4, "sdxl", "reference", extra_variation_tags=",_short,_scc24-base", devices=["ROCm", "CUDA"],fixed_scenarios=["Offline"],categories=["Datacenter"], setup_tips=False, skip_test_query_count=True, extra_input_string="--precision=float16") }}
=== "Nvidia"
## Nvidia MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "sdxl", "nvidia", extra_variation_tags=",_short,_scc24-base", fixed_scenarios=["Offline"],categories=["Datacenter"], setup_tips=False, implementation_tips=False, skip_test_query_count=True) }}
!!! info
Once the above run is successful, you can change `_scc24-base` to `_scc24-main` to run the main variant.
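For instance, with the reference command shown earlier, only the variation tag in `--tags` changes between the two runs (illustrative; all other flags stay the same):
```bash
# scc24-base run:
#   --tags=run-mlperf,inference,_find-performance,_r4.1-dev,_short,_scc24-base
# scc24-main run:
#   --tags=run-mlperf,inference,_find-performance,_r4.1-dev,_short,_scc24-main
```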
## Submission Commands
### Generate actual submission tree
```bash
cm run script --tags=generate,inference,submission \
--clean \
--run-checker \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--run_style=test \
--adr.submission-checker.tags=_short-run \
--quiet \
--submitter=<Team Name>
```
* Use `--hw_name="My system name"` to give a meaningful system name.
### Push Results to GitHub
Fork the repository at [https://github.com/mlcommons/cm4mlperf-inference](https://github.com/mlcommons/cm4mlperf-inference) and use its `mlperf-inference-results-scc24` branch.
Run the following command after **replacing `--repo_url` with your GitHub fork URL**.
```bash
cm run script --tags=push,github,mlperf,inference,submission \
--repo_url=https://github.com/<myfork>/cm4mlperf-inference \
--repo_branch=mlperf-inference-results-scc24 \
--commit_message="Results on system <HW Name>" \
--quiet
```
Once uploaded, open a pull request to the origin repository. A GitHub Action will run there, and once
it finishes you can see your submitted results at [https://docs.mlcommons.org/cm4mlperf-inference](https://docs.mlcommons.org/cm4mlperf-inference).
---
hide:
- toc
---
# Text to Image using Stable Diffusion
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
{{ mlperf_inference_implementation_readme (4, "sdxl", "reference") }}
=== "Nvidia"
## Nvidia MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "sdxl", "nvidia") }}
=== "Intel"
## Intel MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "sdxl", "intel") }}
---
hide:
- toc
---
# What's New, What's Coming
---
hide:
- toc
---
# Demos
<!-- MLCommons logo (SVG) -->
# MLPerf Inference Benchmarks
## Overview
The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of the MLPerf inference v5.0 round are listed below, categorized by task. Under each model you can find details such as the dataset used, reference accuracy, and server latency constraints.
---
## Image Classification
### [ResNet50-v1.5](benchmarks/image_classification/resnet50.md)
- **Dataset**: Imagenet-2012 (224x224) Validation
- **Dataset Size**: 50,000
- **QSL Size**: 1,024
- **Number of Parameters**: 25.6 million
- **FLOPs**: 3.8 billion
- **Reference Model Accuracy**: 76.46% ACC
- **Server Scenario Latency Constraint**: 15ms
- **Equal Issue mode**: False
- **High accuracy variant**: No
- **Submission Category**: Datacenter, Edge
---
## Text to Image
### [Stable Diffusion](benchmarks/text_to_image/sdxl.md)
- **Dataset**: Subset of Coco2014
- **Dataset Size**: 5,000
- **QSL Size**: 5,000
- **Number of Parameters**: 3.5 billion <!-- taken from https://stability.ai/news/stable-diffusion-sdxl-1-announcement -->
- **FLOPs**: 1.28 - 2.4 trillion
- **Reference Model Accuracy (fp32)**: CLIP: 31.74981837, FID: 23.48046692
- **Required Accuracy (Closed Division)**:
- CLIP: 31.68631873 ≤ CLIP ≤ 31.81331801 (within 0.2% of the reference model CLIP score)
- FID: 23.01085758 ≤ FID ≤ 23.95007626 (within 2% of the reference model FID score)
- **Equal Issue mode**: False
- **High accuracy variant**: No
- **Submission Category**: Datacenter, Edge
---
## Object Detection
### [Retinanet](benchmarks/object_detection/retinanet.md)
- **Dataset**: OpenImages
- **Dataset Size**: 24,781
- **QSL Size**: 64
- **Number of Parameters**: TBD
- **Reference Model Accuracy (fp32)**: 0.3755 mAP
- **Server Scenario Latency Constraint**: 100ms
- **Equal Issue mode**: False
- **High accuracy variant**: Yes
- **Submission Category**: Datacenter, Edge
---
## Medical Image Segmentation
### [3d-unet](benchmarks/medical_imaging/3d-unet.md) <!-- https://ar5iv.labs.arxiv.org/html/1809.10483v2 -->
- **Dataset**: KiTS2019
- **Dataset Size**: 42
- **QSL Size**: 42
- **Number of Parameters**: 32.5 million
- **FLOPs**: 100-300 billion
- **Reference Model Accuracy (fp32)**: 0.86330 Mean DICE Score
- **Server Scenario**: Not Applicable
- **Equal Issue mode**: True
- **High accuracy variant**: Yes
- **Submission Category**: Datacenter, Edge
---
## Language Tasks
### Question Answering
#### [Bert-Large](benchmarks/language/bert.md)
- **Dataset**: Squad v1.1 (384 Sequence Length)
- **Dataset Size**: 10,833
- **QSL Size**: 10,833
- **Number of Parameters**: 340 million <!-- taken from https://huggingface.co/transformers/v2.9.1/pretrained_models.html -->
- **FLOPs**: ~128 billion
- **Reference Model Accuracy (fp32)**: F1 Score = 90.874%
- **Server Scenario Latency Constraint**: 130ms
- **Equal Issue mode**: False
- **High accuracy variant**: Yes
- **Submission Category**: Edge
#### [LLAMA2-70B](benchmarks/language/llama2-70b.md)
- **Dataset**: OpenORCA (GPT-4 split, max_seq_len=1024)
- **Dataset Size**: 24,576
- **QSL Size**: 24,576
- **Number of Parameters**: 70 billion
- **FLOPs**: ~500 trillion
- **Reference Model Accuracy (fp32)**:
- Rouge1: 44.4312
- Rouge2: 22.0352
- RougeL: 28.6162
- Tokens_per_sample: 294.45
- **Server Scenario Latency Constraint**:
- TTFT: 2000ms
- TPOT: 200ms
- **Equal Issue mode**: True
- **High accuracy variant**: Yes
- **Submission Category**: Datacenter
### Text Summarization
#### [GPT-J](benchmarks/language/gpt-j.md)
- **Dataset**: CNN Daily Mail v3.0.0
- **Dataset Size**: 13,368
- **QSL Size**: 13,368
- **Number of Parameters**: 6 billion
- **FLOPs**: ~148 billion
- **Reference Model Accuracy (fp32)**:
- Rouge1: 42.9865
- Rouge2: 20.1235
- RougeL: 29.9881
- Gen_len: 4,016,878
- **Server Scenario Latency Constraint**: 20s
- **Equal Issue mode**: True
- **High accuracy variant**: Yes
- **Submission Category**: Datacenter, Edge
### Mixed Tasks (Question Answering, Math, and Code Generation)
#### [Mixtral-8x7B](benchmarks/language/mixtral-8x7b.md)
- **Datasets**:
- OpenORCA (5k samples of GPT-4 split, max_seq_len=2048)
- GSM8K (5k samples of the validation split, max_seq_len=2048)
- MBXP (5k samples of the validation split, max_seq_len=2048)
- **Dataset Size**: 15,000
- **QSL Size**: 15,000
- **Number of Parameters**: 47 billion <!-- https://huggingface.co/blog/moe -->
- **Reference Model Accuracy (fp16)**:
- OpenORCA
- Rouge1: 45.4911
- Rouge2: 23.2829
- RougeL: 30.3615
- GSM8K Accuracy: 73.78%
- MBXP Accuracy: 60.12%
- Tokens_per_sample: 294.45
- **Server Scenario Latency Constraint**:
- TTFT: 2000ms
- TPOT: 200ms
- **Equal Issue mode**: True
- **High accuracy variant**: Yes
- **Submission Category**: Datacenter
---
## Recommendation
### [DLRM_v2](benchmarks/recommendation/dlrm-v2.md)
- **Dataset**: Synthetic Multihot Criteo
- **Dataset Size**: 204,800
- **QSL Size**: 204,800
- **Number of Parameters**: ~23 billion
- **Reference Model Accuracy**: AUC = 80.31%
- **Server Scenario Latency Constraint**: 60ms
- **Equal Issue mode**: False
- **High accuracy variant**: Yes
- **Submission Category**: Datacenter
---
## Graph Neural Networks
### [R-GAT](benchmarks/graph/rgat.md)
- **Dataset**: Illinois Graph Benchmark Heterogeneous validation dataset
- **Dataset Size**: 788,379
- **QSL Size**: 788,379
- **Number of Parameters**:
- **Reference Model Accuracy**: ACC = 72.86%
- **Server Scenario Latency Constraint**: N/A
- **Equal Issue mode**: True
- **High accuracy variant**: No
- **Submission Category**: Datacenter
---
## Submission Categories
- **Datacenter Category**: All benchmarks except bert are applicable to the datacenter category for inference v5.0.
- **Edge Category**: All benchmarks except DLRMv2, LLAMA2-70B, Mixtral-8x7B and R-GAT are applicable to the edge category for v5.0.
## High Accuracy Variants
- **Benchmarks**: `bert`, `llama2-70b`, `gpt-j`, `dlrm_v2`, and `3d-unet` have a normal accuracy variant as well as a high accuracy variant.
- **Requirement**: Must achieve at least 99.9% of the reference model accuracy, compared to the default 99% accuracy requirement (see the worked example below).
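As a worked example, using the Bert-Large reference F1 score listed above (90.874%), the two variants translate to the following accuracy floors (rounded; illustrative arithmetic only):
```
bert-99   : 0.990 x 90.874% F1  ->  at least 89.965% F1
bert-99.9 : 0.999 x 90.874% F1  ->  at least 90.783% F1
```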
# MLPerf™ Inference Benchmark Suite
MLPerf Inference is a benchmark suite for measuring how fast systems can run models in a variety of deployment scenarios.
Please see the [MLPerf Inference benchmark paper](https://arxiv.org/abs/1911.02549) for a detailed description of the benchmarks along with the motivation and guiding principles behind the benchmark suite. If you use any part of this benchmark (e.g., reference implementations, submissions, etc.), please cite the following:
```
@misc{reddi2019mlperf,
title={MLPerf Inference Benchmark},
author={Vijay Janapa Reddi and Christine Cheng and David Kanter and Peter Mattson and Guenther Schmuelling and Carole-Jean Wu and Brian Anderson and Maximilien Breughe and Mark Charlebois and William Chou and Ramesh Chukka and Cody Coleman and Sam Davis and Pan Deng and Greg Diamos and Jared Duke and Dave Fick and J. Scott Gardner and Itay Hubara and Sachin Idgunji and Thomas B. Jablin and Jeff Jiao and Tom St. John and Pankaj Kanwar and David Lee and Jeffery Liao and Anton Lokhmotov and Francisco Massa and Peng Meng and Paulius Micikevicius and Colin Osborne and Gennady Pekhimenko and Arun Tejusve Raghunath Rajan and Dilip Sequeira and Ashish Sirasao and Fei Sun and Hanlin Tang and Michael Thomson and Frank Wei and Ephrem Wu and Lingjie Xu and Koichi Yamada and Bing Yu and George Yuan and Aaron Zhong and Peizhao Zhang and Yuchen Zhou},
year={2019},
eprint={1911.02549},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
Please see [here](https://docs.mlcommons.org/inference/benchmarks/) for the MLPerf inference documentation website which includes automated commands to run MLPerf inference benchmarks using different implementations.
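As a minimal sketch, the automation used by that site can be bootstrapped as follows (assuming Python 3 with pip is available; see the documentation website for the authoritative setup steps):
```bash
# Install the CM (Collective Mind) automation framework
python3 -m pip install cmind
# Pull the MLCommons automation scripts that back the documented commands
cm pull repo mlcommons@cm4mlops
```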
## MLPerf Inference v4.1 (submission deadline July 26, 2024)
For submissions, please use the master branch and any commit since the [4.1 seed release](https://github.com/mlcommons/inference/pull/1736/files), although it is best to use the latest commit. The v4.1 tag will be created from the master branch after the results publication.
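A hypothetical checkout for a v4.1 submission might look like this (any commit at or after the seed release works; the latest master is recommended):
```bash
git clone --recurse-submodules https://github.com/mlcommons/inference.git
cd inference            # master already contains the 4.1 seed release
git log --oneline -1    # note the commit used for your submission
```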
For power submissions, please use [SPEC PTD 1.10](https://github.com/mlcommons/power/tree/main/inference_v1.0) (needs special access) and any commit of the power-dev repository after the [code freeze](https://github.com/mlcommons/power-dev/pull/325).
| model | reference app | framework | dataset | category |
| ---- | ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | tensorflow, onnx, tvm, ncnn | imagenet2012 | edge,datacenter |
| retinanet 800x800 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | pytorch, onnx | openimages resized to 800x800| edge,datacenter |
| bert | [language/bert](https://github.com/mlcommons/inference/tree/master/language/bert) | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm-v2 | [recommendation/dlrm_v2](https://github.com/mlcommons/inference/tree/master/recommendation/dlrm_v2/pytorch) | pytorch | Multihot Criteo Terabyte | datacenter |
| 3d-unet | [vision/medical_imaging/3d-unet-kits19](https://github.com/mlcommons/inference/tree/master/vision/medical_imaging/3d-unet-kits19) | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| gpt-j | [language/gpt-j](https://github.com/mlcommons/inference/tree/master/language/gpt-j)| pytorch | CNN-Daily Mail | edge,datacenter |
| stable-diffusion-xl | [text_to_image](https://github.com/mlcommons/inference/tree/master/text_to_image) | pytorch | COCO 2014| edge,datacenter |
| llama2-70b | [language/llama2-70b](https://github.com/mlcommons/inference/tree/master/language/llama2-70b) | pytorch | OpenOrca | datacenter |
| mixtral-8x7b | [language/mixtral-8x7b](https://github.com/mlcommons/inference/tree/master/language/mixtral-8x7b) | pytorch | OpenOrca, MBXP, GSM8K | datacenter |
* Framework here is given for the reference implementation. Submitters are free to use their own frameworks to run the benchmark.
## MLPerf Inference v4.0 (submission February 23, 2024)
There is an extra one-week extension allowed only for the llama2-70b submissions. For submissions, please use the master branch and any commit since the [4.0 seed release](https://github.com/mlcommons/inference/commit/8e36925bd36a503e39fcbbc488e9e46126f079ed), although it is best to use the latest commit. The v4.0 tag will be created from the master branch after the results publication.
For power submissions, please use [SPEC PTD 1.10](https://github.com/mlcommons/power/tree/main/inference_v1.0) (needs special access) and any commit of the power-dev repository after the [code freeze](https://github.com/mlcommons/power-dev/commit/4e026f43481f46ad57d2464d28924018444b0428).
| model | reference app | framework | dataset | category |
| ---- | ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | tensorflow, onnx, tvm, ncnn | imagenet2012 | edge,datacenter |
| retinanet 800x800 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | pytorch, onnx | openimages resized to 800x800| edge,datacenter |
| bert | [language/bert](https://github.com/mlcommons/inference/tree/master/language/bert) | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm-v2 | [recommendation/dlrm_v2](https://github.com/mlcommons/inference/tree/master/recommendation/dlrm_v2/pytorch) | pytorch | Multihot Criteo Terabyte | datacenter |
| 3d-unet | [vision/medical_imaging/3d-unet-kits19](https://github.com/mlcommons/inference/tree/master/vision/medical_imaging/3d-unet-kits19) | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | [speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt) | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
| gpt-j | [language/gpt-j](https://github.com/mlcommons/inference/tree/master/language/gpt-j)| pytorch | CNN-Daily Mail | edge,datacenter |
| stable-diffusion-xl | [text_to_image](https://github.com/mlcommons/inference/tree/master/text_to_image) | pytorch | COCO 2014| edge,datacenter |
| llama2-70b | [language/llama2-70b](https://github.com/mlcommons/inference/tree/master/language/llama2-70b) | pytorch | OpenOrca | datacenter |
* Framework here is given for the reference implementation. Submitters are free to use their own frameworks to run the benchmark.
## MLPerf Inference v3.1 (submission August 18, 2023)
Please use the [v3.1 tag](https://github.com/mlcommons/inference/releases/tag/v3.1) (```git checkout v3.1```) if you would like to reproduce the v3.1 results.
For reproducing power submissions, please use the `master` branch of the [MLCommons power-dev](https://github.com/mlcommons/power-dev) repository and check out commit [e9e16b1299ef61a2a5d8b9abf5d759309293c440](https://github.com/mlcommons/power-dev/tree/e9e16b1299ef61a2a5d8b9abf5d759309293c440).
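A minimal sketch of that checkout (the commit hash is the one pinned above):
```bash
git clone https://github.com/mlcommons/power-dev.git
cd power-dev
git checkout e9e16b1299ef61a2a5d8b9abf5d759309293c440
```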
You can see the individual README files in the benchmark task folders for more details regarding the benchmarks. For reproducing the submitted results please see the README files under the respective submitter folders in the [inference v3.1 results repository](https://github.com/mlcommons/inference_results_v3.1).
| model | reference app | framework | dataset | category |
| ---- | ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | tensorflow, onnx, tvm, ncnn | imagenet2012 | edge,datacenter |
| retinanet 800x800 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | pytorch, onnx | openimages resized to 800x800| edge,datacenter |
| bert | [language/bert](https://github.com/mlcommons/inference/tree/master/language/bert) | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm-v2 | [recommendation/dlrm_v2](https://github.com/mlcommons/inference/tree/master/recommendation/dlrm_v2/pytorch) | pytorch | Multihot Criteo Terabyte | datacenter |
| 3d-unet | [vision/medical_imaging/3d-unet-kits19](https://github.com/mlcommons/inference/tree/master/vision/medical_imaging/3d-unet-kits19) | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | [speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt) | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
| gpt-j | [language/gpt-j](https://github.com/mlcommons/inference/tree/master/language/gpt-j)| pytorch | CNN-Daily Mail | edge,datacenter |
## MLPerf Inference v3.0 (submission 03/03/2023)
Please use the v3.0 tag (```git checkout v3.0```) if you would like to reproduce v3.0 results.
You can see the individual Readme files in the reference app for more details.
| model | reference app | framework | dataset | category |
| ---- | ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | tensorflow, onnx, tvm | imagenet2012 | edge,datacenter |
| retinanet 800x800 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | pytorch, onnx | openimages resized to 800x800| edge,datacenter |
| bert | [language/bert](https://github.com/mlcommons/inference/tree/master/language/bert) | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | [recommendation/dlrm](https://github.com/mlcommons/inference/tree/master/recommendation/dlrm/pytorch) | pytorch, tensorflow | Criteo Terabyte | datacenter |
| 3d-unet | [vision/medical_imaging/3d-unet-kits19](https://github.com/mlcommons/inference/tree/master/vision/medical_imaging/3d-unet-kits19) | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | [speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt) | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
## MLPerf Inference v2.1 (submission 08/05/2022)
Use the r2.1 branch (```git checkout r2.1```) if you want to submit or reproduce v2.1 results.
See the individual Readme files in the reference app for details.
| model | reference app | framework | dataset | category |
| ---- | ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | tensorflow, onnx | imagenet2012 | edge,datacenter |
| retinanet 800x800 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | pytorch, onnx | openimages resized to 800x800| edge,datacenter |
| bert | [language/bert](https://github.com/mlcommons/inference/tree/master/language/bert) | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | [recommendation/dlrm](https://github.com/mlcommons/inference/tree/master/recommendation/dlrm/pytorch) | pytorch, tensorflow | Criteo Terabyte | datacenter |
| 3d-unet | [vision/medical_imaging/3d-unet-kits19](https://github.com/mlcommons/inference/tree/master/vision/medical_imaging/3d-unet-kits19) | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | [speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt) | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
## MLPerf Inference v2.0 (submission 02/25/2022)
Use the r2.0 branch (```git checkout r2.0```) if you want to submit or reproduce v2.0 results.
See the individual Readme files in the reference app for details.
| model | reference app | framework | dataset | category |
| ---- | ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | tensorflow, onnx | imagenet2012 | edge,datacenter |
| ssd-mobilenet 300x300 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | tensorflow, pytorch, onnx| coco resized to 300x300 | edge |
| ssd-resnet34 1200x1200 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/master/vision/classification_and_detection) | tensorflow, pytorch, onnx | coco resized to 1200x1200| edge,datacenter |
| bert | [language/bert](https://github.com/mlcommons/inference/tree/master/language/bert) | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | [recommendation/dlrm](https://github.com/mlcommons/inference/tree/master/recommendation/dlrm/pytorch) | pytorch, tensorflow | Criteo Terabyte | datacenter |
| 3d-unet | [vision/medical_imaging/3d-unet-kits19](https://github.com/mlcommons/inference/tree/master/vision/medical_imaging/3d-unet-kits19) | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | [speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt) | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
## MLPerf Inference v1.1 (submission 08/13/2021)
Use the r1.1 branch (```git checkout r1.1```) if you want to submit or reproduce v1.1 results.
See the individual Readme files in the reference app for details.
| model | reference app | framework | dataset | category |
| ---- | ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r1.1/vision/classification_and_detection) | tensorflow, onnx | imagenet2012 | edge,datacenter |
| ssd-mobilenet 300x300 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r1.1/vision/classification_and_detection) | tensorflow, pytorch, onnx| coco resized to 300x300 | edge |
| ssd-resnet34 1200x1200 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r1.1/vision/classification_and_detection) | tensorflow, pytorch, onnx | coco resized to 1200x1200| edge,datacenter |
| bert | [language/bert](https://github.com/mlcommons/inference/tree/r1.1/language/bert) | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | [recommendation/dlrm](https://github.com/mlcommons/inference/tree/r1.1/recommendation/dlrm/pytorch) | pytorch, tensorflow | Criteo Terabyte | datacenter |
| 3d-unet | [vision/medical_imaging/3d-unet](https://github.com/mlcommons/inference/tree/r1.1/vision/medical_imaging/3d-unet) | pytorch, tensorflow(?), onnx(?) | BraTS 2019 | edge,datacenter |
| rnnt | [speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/r1.1/speech_recognition/rnnt) | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
## MLPerf Inference v1.0 (submission 03/19/2021)
Use the r1.0 branch (```git checkout r1.0```) if you want to submit or reproduce v1.0 results.
See the individual Readme files in the reference app for details.
| model | reference app | framework | dataset | category |
| ---- | ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r1.0/vision/classification_and_detection) | tensorflow, onnx | imagenet2012 | edge,datacenter |
| ssd-mobilenet 300x300 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r1.0/vision/classification_and_detection) | tensorflow, pytorch, onnx| coco resized to 300x300 | edge |
| ssd-resnet34 1200x1200 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r1.0/vision/classification_and_detection) | tensorflow, pytorch, onnx | coco resized to 1200x1200| edge,datacenter |
| bert | [language/bert](https://github.com/mlcommons/inference/tree/r1.0/language/bert) | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | [recommendation/dlrm](https://github.com/mlcommons/inference/tree/r1.0/recommendation/dlrm/pytorch) | pytorch, tensorflow(?) | Criteo Terabyte | datacenter |
| 3d-unet | [vision/medical_imaging/3d-unet](https://github.com/mlcommons/inference/tree/r1.0/vision/medical_imaging/3d-unet) | pytorch, tensorflow(?), onnx(?) | BraTS 2019 | edge,datacenter |
| rnnt | [speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/r1.0/speech_recognition/rnnt) | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
## MLPerf Inference v0.7 (submission 9/18/2020)
Use the r0.7 branch (```git checkout r0.7```) if you want to submit or reproduce v0.7 results.
See the individual Readme files in the reference app for details.
| model | reference app | framework | dataset |
| ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r0.7/vision/classification_and_detection) | tensorflow, pytorch, onnx | imagenet2012 |
| ssd-mobilenet 300x300 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r0.7/vision/classification_and_detection) | tensorflow, pytorch, onnx| coco resized to 300x300 |
| ssd-resnet34 1200x1200 | [vision/classification_and_detection](https://github.com/mlcommons/inference/tree/r0.7/vision/classification_and_detection) | tensorflow, pytorch, onnx | coco resized to 1200x1200|
| bert | [language/bert](https://github.com/mlcommons/inference/tree/r0.7/language/bert) | tensorflow, pytorch, onnx | squad-1.1 |
| dlrm | [recommendation/dlrm](https://github.com/mlcommons/inference/tree/r0.7/recommendation/dlrm/pytorch) | pytorch, tensorflow(?), onnx(?) | Criteo Terabyte |
| 3d-unet | [vision/medical_imaging/3d-unet](https://github.com/mlcommons/inference/tree/r0.7/vision/medical_imaging/3d-unet) | pytorch, tensorflow(?), onnx(?) | BraTS 2019 |
| rnnt | [speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/r0.7/speech_recognition/rnnt) | pytorch | OpenSLR LibriSpeech Corpus |
## MLPerf Inference v0.5
Use the r0.5 branch (```git checkout r0.5```) if you want to reproduce v0.5 results.
See the individual Readme files in the reference app for details.
| model | reference app | framework | dataset |
| ---- | ---- | ---- | ---- |
| resnet50-v1.5 | [v0.5/classification_and_detection](https://github.com/mlcommons/inference/tree/r0.5/v0.5/classification_and_detection) | tensorflow, pytorch, onnx | imagenet2012 |
| mobilenet-v1 | [v0.5/classification_and_detection](https://github.com/mlcommons/inference/tree/r0.5/v0.5/classification_and_detection) |tensorflow, pytorch, onnx | imagenet2012 |
| ssd-mobilenet 300x300 | [v0.5/classification_and_detection](https://github.com/mlcommons/inference/tree/r0.5/v0.5/classification_and_detection) |tensorflow, pytorch, onnx | coco resized to 300x300 |
| ssd-resnet34 1200x1200 | [v0.5/classification_and_detection](https://github.com/mlcommons/inference/tree/r0.5/v0.5/classification_and_detection) | tensorflow, pytorch, onnx | coco resized to 1200x1200 |
| gnmt | [v0.5/translation/gnmt/](https://github.com/mlcommons/inference/tree/r0.5/v0.5/translation/gnmt/tensorflow) | tensorflow, pytorch | See Readme |