Release SuperBench v0.5.0

SuperBench v0.5.0 Release Notes
===============================

Micro-benchmark Improvements
----------------------------

- Support NIC only NCCL bandwidth benchmark on single node in NCCL/RCCL bandwidth test.
- Support bi-directional bandwidth benchmark in GPU copy bandwidth test.
- Support data checking in GPU copy bandwidth test.
- Update rccl-tests submodule to fix divide by zero error.
- Add GPU-Burn micro-benchmark.

Model-benchmark Improvements
----------------------------

- Sync results on root rank for e2e model benchmarks in distributed mode.
- Support customized `env` in local and torch.distributed mode.
- Add support for pytorch>=1.9.0.
- Keep BatchNorm as fp32 for pytorch cnn models cast to fp16.
- Remove FP16 samples type converting time.
- Support FAMBench.

Inference Benchmark Improvements
--------------------------------

- Revise the default setting for inference benchmark.
- Add percentile metrics for inference benchmarks.
- Support T4 and A10 in GEMM benchmark.
- Add configuration with inference benchmark.

Other Improvements
------------------

- Add command to support listing all optional parameters for benchmarks.
- Unify benchmark naming convention and support multiple tests with same benchmark and different parameters/options in one configuration file.
- Support timeout to detect the benchmark failure and stop the process automatically.
- Add rocm5.0 dockerfile.
- Improve output interface.

Data Diagnosis and Analysis
---------------------------

- Support multi-benchmark check.
- Support result summary in md, html and excel formats.
- Support data diagnosis in md and html formats.
- Support result output for all nodes in data diagnosis.