---
slug: release-sb-v0.8
title: Releasing SuperBench v0.8
author: Peng Cheng
author_title: SuperBench Team
author_url: https://github.com/cp5555
author_image_url: https://github.com/cp5555.png
tags: [superbench, announcement, release]
---

We are very happy to announce that **SuperBench 0.8.0** is officially released today!

You can install and try SuperBench by following the [Getting Started Tutorial](https://microsoft.github.io/superbenchmark/docs/getting-started/installation).

## SuperBench 0.8.0 Release Notes

### SuperBench Improvements

- Support running the SuperBench Executor on Windows.
- Remove the fixed rccl version in the rocm5.1.x Docker file.
- Upgrade the networkx version to fix an installation compatibility issue.
- Pin the setuptools version to v65.7.0.
- Limit the ansible_runner version for Python 3.6.
- Support cgroup V2 when reading system metrics in the monitor.
- Fix an analyzer bug in Python 3.8 caused by a pandas API change.
- Collect real-time GPU power in the monitor.
- Remove an unreachable condition when writing the host list in MPI mode.
- Upgrade the Docker image with cuda12.1, nccl 2.17.1-1, hpcx v2.14, and mlc 3.10.
- Fix the wrong unit of cpu-memory-bw-latency in the documentation.

### Micro-benchmark Improvements

- Add the STREAM benchmark for sustainable memory bandwidth and the corresponding computation rate.
- Add the HPL benchmark for the HPC Linpack benchmark.
- Support flexible warmup and non-random data initialization in cublas-benchmark.
- Support error tolerance in the micro-benchmark for cuDNN functions.
- Add a distributed inference benchmark.
- Support tensor core precisions (e.g., FP8) and batch/shape range in cublaslt gemm.

### Model Benchmark Improvements

- Fix the torch.dist init issue with multiple models.
- Support TE FP8 in the BERT/GPT2 models.
- Add num_workers configurations in the model benchmark.