Release SuperBench v0.7.0

SuperBench v0.7.0 Release Notes
===============================

SuperBench Improvements
-----------------------

- Support non-zero return code when "sb deploy" or "sb run" fails in Ansible.
- Support log flushing to the result file during runtime.
- Update version to include revision hash and date.
- Support "pattern" in mpi mode to run tasks in parallel.
- Support topo-aware, all-pair, and K-batch patterns in mpi mode.
- Pin the Transformers version to avoid TensorRT failure.
- Add CUDA 11.8 Docker image for NVIDIA arch90 GPUs.
- Support "sb deploy" without pulling the image.

Micro-benchmark Improvements
----------------------------

- Support a list of custom config strings in cudnn-functions and cublas-functions.
- Support correctness check in cublas-functions.
- Support GEMM-FLOPS for NVIDIA arch90 GPUs.
- Support cuBLASLt FP16 and FP8 GEMM.
- Add a wait time option to resolve the mem-bw instability issue.
- Fix incorrect data type detection in cublas-function source code.

Model Benchmark Improvements
----------------------------

- Support FP8 in BERT model training.

Distributed Benchmark Improvements
----------------------------------

- Support pair-wise pattern in IB validation benchmark.
- Support topo-aware, pair-wise, and K-batch patterns in nccl-bw benchmark.