Commit 15f22e2c authored by Yifan Xiong, committed by GitHub

Docs - Upgrade version and release note (#209)

__Description__

Upgrade version and release note. Closes #95 and #170.

__Major Revisions__

* Upgrade package versions
* Add release note for v0.3.0
parent 0df916ed
@@ -15,7 +15,7 @@
 __SuperBench__ is a validation and profiling tool for AI infrastructure.
-📢 [v0.2.1](https://github.com/microsoft/superbenchmark/releases/tag/v0.2.1) has been released!
+📢 [v0.3.0](https://github.com/microsoft/superbenchmark/releases/tag/v0.3.0) has been released!
 ## _Check [aka.ms/superbench](https://aka.ms/superbench) for more details._
......
@@ -36,7 +36,10 @@ docker buildx build \
 <TabItem value='rocm'>
 ```bash
-# coming soon
+export DOCKER_BUILDKIT=1
+docker buildx build \
+  --platform linux/amd64 --cache-to type=inline,mode=max \
+  --tag superbench-dev --file dockerfile/rocm4.2-pytorch1.7.0.dockerfile .
 ```
 </TabItem>
......
@@ -57,7 +57,7 @@ You can clone the source from GitHub and build it.
 :::note Note
 You should checkout corresponding tag to use release version, for example,
-`git clone -b v0.2.1 https://github.com/microsoft/superbenchmark`
+`git clone -b v0.3.0 https://github.com/microsoft/superbenchmark`
 :::
 ```bash
......
@@ -27,7 +27,7 @@ sb deploy -f remote.ini --host-password [password]
 :::note Note
 You should deploy corresponding Docker image to use release version, for example,
-`sb deploy -f local.ini -i superbench/superbench:v0.2.1-cuda11.1.1`
+`sb deploy -f local.ini -i superbench/superbench:v0.3.0-cuda11.1.1`
 :::
 ## Run
......
@@ -66,7 +66,7 @@ superbench:
 <TabItem value='example'>
 ```yaml
-version: v0.2
+version: v0.3
 superbench:
   enable: benchmark_1
   var:
......
@@ -29,13 +29,17 @@ available tags are listed below for all stable versions.
 | Tag | Description |
 | ----------------- | ---------------------------------- |
+| v0.3.0-cuda11.1.1 | SuperBench v0.3.0 with CUDA 11.1.1 |
 | v0.2.1-cuda11.1.1 | SuperBench v0.2.1 with CUDA 11.1.1 |
 | v0.2.0-cuda11.1.1 | SuperBench v0.2.0 with CUDA 11.1.1 |
 </TabItem>
 <TabItem value='rocm'>
-Coming soon.
+| Tag | Description |
+| --------------------------- | ---------------------------------------------- |
+| v0.3.0-rocm4.2-pytorch1.7.0 | SuperBench v0.3.0 with ROCm 4.2, PyTorch 1.7.0 |
+| v0.3.0-rocm4.0-pytorch1.7.0 | SuperBench v0.3.0 with ROCm 4.0, PyTorch 1.7.0 |
 </TabItem>
 </Tabs>
@@ -6,5 +6,5 @@
 Provide hardware and software benchmarks for AI systems.
 """
-__version__ = '0.2.1'
+__version__ = '0.3.0'
 __author__ = 'Microsoft'
@@ -3,7 +3,7 @@
 # Server:
 # - Product: HPE Apollo 6500
-version: v0.2
+version: v0.3
 superbench:
   enable: null
   var:
...@@ -52,8 +52,8 @@ superbench: ...@@ -52,8 +52,8 @@ superbench:
gemm-flops: gemm-flops:
<<: *default_local_mode <<: *default_local_mode
parameters: parameters:
m: 7680 m: 7680
n: 8192 n: 8192
k: 8192 k: 8192
ib-loopback: ib-loopback:
enable: true enable: true
......
@@ -4,7 +4,7 @@
 # - Product: G482-Z53
 # - Link: https://www.gigabyte.cn/FileUpload/Global/MicroSite/553/G482-Z53.html
-version: v0.2
+version: v0.3
 superbench:
   enable: null
   var:
......
 # SuperBench Config
-version: v0.2
+version: v0.3
 superbench:
   enable: null
   var:
......
 # SuperBench Config
-version: v0.2
+version: v0.3
 superbench:
   enable: null
   var:
......
---
slug: release-sb-v0.3
title: Releasing SuperBench v0.3
author: Peng Cheng
author_title: SuperBench Team
author_url: https://github.com/cp5555
author_image_url: https://github.com/cp5555.png
tags: [superbench, announcement, release]
---
We are very happy to announce that **SuperBench 0.3.0** is officially released today!
You can install and try SuperBench by following the [Getting Started Tutorial](https://microsoft.github.io/superbenchmark/docs/getting-started/installation).
## SuperBench 0.3.0 Release Notes
### SuperBench Framework
#### Runner
- Implement MPI mode.
#### Benchmarks
- Support Docker benchmark.
### Single-node Validation
#### Micro Benchmarks
1. Memory (Tool: NVIDIA/AMD Bandwidth Test Tool)
| Metrics | Unit | Description |
|----------------|------|-------------------------------------|
| H2D_Mem_BW_GPU | GB/s | host-to-GPU bandwidth for each GPU |
| D2H_Mem_BW_GPU | GB/s | GPU-to-host bandwidth for each GPU |
2. IBLoopback (Tool: PerfTest – Standard RDMA Test Tool)
| Metrics | Unit | Description |
|----------|------|---------------------------------------------------------------|
| IB_Write | MB/s | The IB write loopback throughput with different message sizes |
| IB_Read | MB/s | The IB read loopback throughput with different message sizes |
| IB_Send | MB/s | The IB send loopback throughput with different message sizes |
3. NCCL/RCCL (Tool: NCCL/RCCL Tests)
| Metrics | Unit | Description |
|---------------------|------|-----------------------------------------------------------------|
| NCCL_AllReduce | GB/s | The NCCL AllReduce performance with different message sizes |
| NCCL_AllGather | GB/s | The NCCL AllGather performance with different message sizes |
| NCCL_broadcast | GB/s | The NCCL Broadcast performance with different message sizes |
| NCCL_reduce | GB/s | The NCCL Reduce performance with different message sizes |
| NCCL_reduce_scatter | GB/s | The NCCL ReduceScatter performance with different message sizes |
4. Disk (Tool: FIO – Standard Disk Performance Tool)
| Metrics | Unit | Description |
|----------------|------|---------------------------------------------------------------------------------|
| Seq_Read | MB/s | Sequential read performance |
| Seq_Write | MB/s | Sequential write performance |
| Rand_Read | MB/s | Random read performance |
| Rand_Write | MB/s | Random write performance |
| Seq_R/W_Read | MB/s | Read performance in sequential read/write (read:write = 4:1) |
| Seq_R/W_Write | MB/s | Write performance in sequential read/write (read:write = 4:1) |
| Rand_R/W_Read | MB/s | Read performance in random read/write (read:write = 4:1) |
| Rand_R/W_Write | MB/s | Write performance in random read/write (read:write = 4:1) |
5. H2D/D2H SM Transmission Bandwidth (Tool: MSR-A build)
| Metrics | Unit | Description |
|---------------|------|-----------------------------------------------------|
| H2D_SM_BW_GPU | GB/s | host-to-GPU bandwidth using GPU kernel for each GPU |
| D2H_SM_BW_GPU | GB/s | GPU-to-host bandwidth using GPU kernel for each GPU |
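The mixed read/write Disk metrics above use a read:write ratio of 4:1. FIO expresses such a mix as a read percentage through its `rwmixread` option, so 4:1 corresponds to 80% reads. The sketch below shows that conversion. The helper names are hypothetical, and the release note does not state the exact FIO command line SuperBench uses, so treat the generated arguments as an illustration only.

```python
from typing import List


def rwmix_read_percent(read_parts: int, write_parts: int) -> int:
    """Convert a read:write ratio into the read percentage of a mixed workload."""
    total = read_parts + write_parts
    if read_parts < 0 or write_parts < 0 or total == 0:
        raise ValueError("ratio parts must be non-negative and not both zero")
    return round(100 * read_parts / total)


def fio_mixed_args(sequential: bool, read_parts: int = 4, write_parts: int = 1) -> List[str]:
    """Build illustrative FIO arguments for a mixed workload (assumed, not SuperBench's own)."""
    # FIO uses --rw=rw for sequential mixed I/O and --rw=randrw for random mixed I/O.
    mode = "rw" if sequential else "randrw"
    percent = rwmix_read_percent(read_parts, write_parts)
    return [f"--rw={mode}", f"--rwmixread={percent}"]


if __name__ == "__main__":
    print(fio_mixed_args(sequential=True))
    print(fio_mixed_args(sequential=False))
```

With the default 4:1 ratio, both the sequential and the random variant request 80% reads and 20% writes.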
### AMD GPU Support
#### Docker Image Support
- ROCm 4.2 PyTorch 1.7.0
- ROCm 4.0 PyTorch 1.7.0
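The image tags documented in this commit follow the pattern `v<version>-<platform>`, for example `v0.3.0-cuda11.1.1` and `v0.3.0-rocm4.2-pytorch1.7.0`. A minimal sketch of composing such image references is shown here. The helper names are hypothetical, and it is an assumption that the ROCm images are published under the same `superbench/superbench` repository as the CUDA image used in the `sb deploy` example.

```python
# Platform suffixes taken from the tag tables updated in this commit.
RELEASE_PLATFORMS = [
    "cuda11.1.1",
    "rocm4.2-pytorch1.7.0",
    "rocm4.0-pytorch1.7.0",
]


def image_tag(version: str, platform: str) -> str:
    """Return a tag such as 'v0.3.0-cuda11.1.1'."""
    return f"v{version}-{platform}"


def image_ref(version: str, platform: str, repository: str = "superbench/superbench") -> str:
    """Return a full image reference; the default repository is an assumption for ROCm."""
    return f"{repository}:{image_tag(version, platform)}"


if __name__ == "__main__":
    for platform in RELEASE_PLATFORMS:
        print(image_ref("0.3.0", platform))
```

The resulting string can be passed to `sb deploy -i`, as in the deployment note updated by this commit.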
#### Micro Benchmarks
1. Kernel Launch (Tool: MSR-A build)
| Metrics | Unit | Description |
|--------------------------|-----------|--------------------------------------------------------------|
| Kernel_Launch_Event_Time | Time (ms) | Dispatch latency measured in GPU time using hipEventRecord() |
| Kernel_Launch_Wall_Time | Time (ms) | Dispatch latency measured in CPU time |
2. GEMM FLOPS (Tool: AMD rocblas-bench Tool)
| Metrics | Unit | Description |
|----------|--------|-------------------------------|
| FP64 | GFLOPS | FP64 FLOPS without MatrixCore |
| FP32(MC) | GFLOPS | FP32 FLOPS with MatrixCore |
| FP16(MC) | GFLOPS | FP16 FLOPS with MatrixCore |
| BF16(MC) | GFLOPS | BF16 FLOPS with MatrixCore |
| INT8(MC) | GOPS | INT8 OPS with MatrixCore |
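For orientation, a GEMM throughput figure in GFLOPS can be derived from the matrix shape and a measured run time using the common convention of 2·m·n·k floating-point operations per multiplication. The sketch below applies it to the `m: 7680`, `n: 8192`, `k: 8192` shape from the `gemm-flops` config touched in this commit. The function names and the 0.1 s timing are hypothetical, and rocblas-bench reports its own figures, so this is not how SuperBench itself computes the metric.

```python
def gemm_flop_count(m: int, n: int, k: int) -> int:
    """Operations for C = A @ B with A (m x k) and B (k x n): one multiply and one add per term."""
    return 2 * m * n * k


def gemm_gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in GFLOPS for one GEMM that took `seconds` to run."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return gemm_flop_count(m, n, k) / seconds / 1e9


if __name__ == "__main__":
    m, n, k = 7680, 8192, 8192
    print(gemm_flop_count(m, n, k))  # 1030792151040 operations
    # Hypothetical timing, used only to show the unit conversion.
    print(gemm_gflops(m, n, k, seconds=0.1))
```

The same arithmetic applies to the INT8 row, where the result is expressed in GOPS instead of GFLOPS.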
#### E2E Benchmarks
1. CNN models -- Use PyTorch torchvision models
- ResNet: ResNet-50, ResNet-101, ResNet-152
- DenseNet: DenseNet-169, DenseNet-201
- VGG: VGG-11, VGG-13, VGG-16, VGG-19
2. BERT -- Use Hugging Face Transformers
- BERT
- BERT Large
3. LSTM -- Use PyTorch
4. GPT-2 -- Use Hugging Face Transformers
### Bug Fix
- Fix VGG models failing on A100 GPU with batch_size=128
### Other Improvements
1. Contribution related
- Contribution rules
- System information collection
2. Document
- Add release process doc
- Add design documents
- Add developer guide doc for coding style
- Add contribution rules
- Add docker image list
- Add initial validation results
@@ -101,7 +101,7 @@ module.exports = {
     announcementBar: {
       id: 'supportus',
       content:
-        '📢 <a href="https://microsoft.github.io/superbenchmark/blog/release-sb-v0.2">v0.2.1</a> has been released! ' +
+        '📢 <a href="https://microsoft.github.io/superbenchmark/blog/release-sb-v0.3">v0.3.0</a> has been released! ' +
         '⭐️ If you like SuperBench, give it a star on <a target="_blank" rel="noopener noreferrer" href="https://github.com/microsoft/superbenchmark">GitHub</a>! ⭐️',
     },
     algolia: {
......
 {
   "name": "superbench-website",
-  "version": "0.2.1",
+  "version": "0.3.0",
   "lockfileVersion": 1,
   "requires": true,
   "dependencies": {
......
 {
   "name": "superbench-website",
-  "version": "0.2.1",
+  "version": "0.3.0",
   "private": true,
   "scripts": {
     "docusaurus": "docusaurus",
......