- 30 Apr, 2025 1 commit
-
-
Hongtao Zhang authored
- Upgrade the OS of the GitHub runner used by lint to the latest version.
- Add a symbolic link for clang-format to version 14.
- Update the importlib_metadata version, since the one inside nvcr.io/nvidia/pytorch:20.12-py3 is too old and failed the 11.1 build.
---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>
-
- 21 Nov, 2024 1 commit
-
-
Hongtao Zhang authored
**Description** Add nvbandwidth build to repo.
---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
-
- 10 Oct, 2024 1 commit
-
-
Yuting Jiang authored
**Description** Cherry-pick bug fixes from v0.11.0 to main.
**Major Revision**
* #645
* #648
* #646
* #647
* #651
* #652
* #650
---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>
-
- 18 Apr, 2024 1 commit
-
-
Yuting Jiang authored
**Description** Upgrade mlc to v3.11.
-
- 07 Dec, 2023 1 commit
-
-
Yuting Jiang authored
**Description** Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark
-
- 22 Nov, 2023 1 commit
-
-
guoshzhao authored
**Description** Generate baseline given results from multiple nodes.
**Major Revision**
- Add sub command `sb result generate-baseline`
- Add UT and docs
---------
Co-authored-by: 454314380 <454314380@qq.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
-
- 23 Oct, 2023 1 commit
-
-
Yuting Jiang authored
**Description** Update the mlc version to 3.10 in the cuda and rocm dockerfiles to be consistent with the cuda12 dockerfile.
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
-
- 22 Aug, 2023 1 commit
-
-
Yuting Jiang authored
**Description** Add source code for evaluating NVDEC decoding performance.
---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
-
- 21 Mar, 2023 1 commit
-
-
rafsalas19 authored
**Description**
- Adding HPL benchmark
---------
Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net>
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
-
- 13 Feb, 2023 1 commit
-
-
rafsalas19 authored
**Description**
- Added stream benchmark
- Added stream unit test
- Added stream example
- Modified docker files to build stream
---------
Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net>
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
Co-authored-by: Yifan Xiong <xiongyf@yandex.com>
-
- 29 Dec, 2022 1 commit
-
-
Yifan Xiong authored
Add Docker image for arch90 NVIDIA GPUs:
* add CUDA11.8 Dockerfile
* update archs in Makefile and benchmarks accordingly
* update image build pipeline
-
- 31 Oct, 2022 1 commit
-
-
Yifan Xiong authored
Update version to include revision hash and date in "{last tag}+g{git hash}.d{date}" format. Examples:
* exact tag: 0.6.0
* commit after tag: 0.6.0+gcbb1b34
* commit after tag with local changes: 0.6.0+gcbb1b34.d20221028
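The format above can be illustrated with a minimal sketch. The function name `format_version` and its parameters are hypothetical and chosen only for this illustration; they are not taken from the repository's actual setup code, which may derive these values differently.

```python
from typing import Optional


def format_version(last_tag: str, git_hash: Optional[str] = None, date: Optional[str] = None) -> str:
    """Build a "{last tag}+g{git hash}.d{date}" style version string.

    Hypothetical helper for illustration only:
    - no hash: the commit is exactly on the tag, so only the tag is returned
    - hash only: a commit after the tag
    - hash and date: a commit after the tag with local changes
    """
    version = last_tag
    if git_hash:
        version += f"+g{git_hash}"
        if date:
            version += f".d{date}"
    return version


print(format_version("0.6.0"))
print(format_version("0.6.0", "cbb1b34"))
print(format_version("0.6.0", "cbb1b34", "20221028"))
```

The three calls reproduce the three example cases listed in the commit message.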
-
- 06 Sep, 2022 1 commit
-
-
Yifan Xiong authored
**Description** Cherry-pick bug fixes from v0.6.0 to main.
**Major Revisions**
* Enable latency test in ib traffic validation distributed benchmark (#396)
* Enhance parameter parsing to allow spaces in value (#397)
* Update apt packages in dockerfile (#398)
* Upgrade colorlog for NO_COLOR support (#404)
* Analyzer - Update error handling to support exit code of sb result diagnosis (#403)
* Analyzer - Make baseline file optional in data diagnosis and fix bugs (#399)
* Enhance timeout cleanup to avoid possible hanging (#405)
* Auto generate ibstat file by pssh (#402)
* Analyzer - Format int type and unify empty value to N/A in diagnosis output file (#406)
* Docs - Upgrade version and release note (#407)
* Docs - Fix issues in document (#408)
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
-
- 17 Aug, 2022 1 commit
-
-
Yifan Xiong authored
__Description__ Update Python setup for required packages.
__Major Revisions__
* downgrade requests version to be compatible with python 3.6, add corresponding pipeline for 3.6
* add extra entry in extras_require for nested packages
* update `pip install` contents accordingly
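As a rough sketch of what an extra `extras_require` entry can look like, the helper below adds one combined entry built from the other entries. The function name `with_combined_extras`, the key `all`, and the package names are assumptions made for this illustration; the commit does not state which entry was added or how.

```python
def with_combined_extras(extras, combined_key="all"):
    """Return a copy of an extras_require mapping with one extra combined entry.

    Hypothetical helper: the combined entry lists every package named by the
    other entries, de-duplicated and sorted for a stable result.
    """
    combined = dict(extras)
    combined[combined_key] = sorted({pkg for pkgs in extras.values() for pkg in pkgs})
    return combined


# Illustrative package names only.
extras_require = with_combined_extras({
    "dev": ["flake8"],
    "test": ["pytest", "flake8"],
})
print(extras_require["all"])
```

A mapping built this way can be passed as the `extras_require` argument of setuptools' `setup()`, so that a single extra installs everything the individual extras would.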
-
- 13 Aug, 2022 1 commit
-
-
Yang Wang authored
An enhancement for topo-aware IB performance validation #373. This PR auto-generates the required ibstat file `ib_traffic_topo_aware_ibstat.txt`, which is used as input to build a graph.
-
- 13 Jul, 2022 1 commit
-
-
Yifan Xiong authored
Add dependencies:
* include ndv4-topo.xml in cuda docker images
* require requests version to avoid RequestsDependencyWarning
-
- 06 Jul, 2022 1 commit
-
-
Yifan Xiong authored
Update dependencies and Dockerfile:
* upgrade nccl-tests and rccl-tests to the current latest version to match NCCL/RCCL versions
* unify image tag names on DockerHub
* remove verbose output in Dockerfile and fix some flags
-
- 24 Jun, 2022 2 commits
-
-
Yifan Xiong authored
Fix incorrect ulimit nofile config in Dockerfile. sh is used by default instead of bash, and its `echo` does not accept options, so a literal `-e` is written into /etc/security/limits.conf.
-
Yifan Xiong authored
**Description** Support running multiple IB/GPU devices simultaneously in ib validation benchmark.
**Major Revisions**
- Revise ib_validation_performance.cc so that multiple processes per node can be used to launch multiple perftest commands simultaneously. For each node pair in the config, the given number of processes per node will run in parallel.
- Revise ib_validation_performance.py to correct file paths and adjust parameters to specify different NICs/GPUs/NUMA nodes.
- Fix env issues in Dockerfile for end-to-end test.
- Update ib-traffic configuration examples in config files.
- Update unit tests and docs accordingly.
Closes #326.
-
- 15 Jun, 2022 1 commit
-
-
Yifan Xiong authored
**Description** Fix cmake and build issues.
**Major Revision**
* Remove unnecessary boost build
* Remove user-agent for mlc
* Remove -j for third party to build each project in sequence
* Fix ansible collections installation path
-
- 31 May, 2022 1 commit
-
-
user4543 authored
**Description** Add support to run sb command inside docker image - install missing dependency.
-
- 08 Feb, 2022 1 commit
-
-
Ziyue Yang authored
This commit adds GDR-only nccl-tests for Nvidia machines. Also bump NCCL to v2.10.3-1 to achieve peak performance in this test.
-
- 13 Dec, 2021 1 commit
-
-
Hossein Pourreza authored
**Description** Add mlc memory bandwidth and latency micro benchmark to Superbench.
**Major Revision**
- Add mlc benchmark with test and example files
-
- 10 Dec, 2021 1 commit
-
-
guoshzhao authored
**Description** Add ONNXRuntime inference benchmark based on ORT python API.
**Major Revision**
- Add `ORTInferenceBenchmark` class to export pytorch model to onnx model and do inference
- Add tests and example for `ort-inference` benchmark
- Update the introduction docs.
-
- 30 Oct, 2021 1 commit
-
-
Ziyue Yang authored
**Description** This commit does the following:
1) Adds CPU-initiated copy benchmark;
2) Adds dtod benchmark;
3) Supports scanning NUMA nodes and GPUs inside the benchmark program;
4) Changes the name of gpu-sm-copy to gpu-copy.
-
- 02 Sep, 2021 1 commit
-
-
Yifan Xiong authored
__Description__ Resolve "too many open files" issue when running NCCL/RCCL on multiple nodes using Docker images by increasing the nofile number in limits.conf.
-
- 01 Sep, 2021 2 commits
- 29 Jul, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Support rocm in third_party/makefile.
**Major Revision**
- Split rocm and cuda target in makefile
- Add target in dockerfile
-
- 16 Jul, 2021 2 commits
-
-
Yuting Jiang authored
Add perftest as a submodule and add build logic
-
Yuting Jiang authored
Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic.
-
- 16 Jun, 2021 1 commit
-
-
Yifan Xiong authored
Update packages and add build cache for CUDA 11.1.1 Dockerfile:
* Remove duplicate cmake and ompi, which are already in base image
* Add hpcx and sharp lib
* Add cache for gitmodules build
* Sort apt-get packages
-
- 01 Jun, 2021 2 commits
- 18 May, 2021 1 commit
-
-
guoshzhao authored
* call build script in Makefile.
* add cppbuild command for testing and docker env.
-
- 17 May, 2021 1 commit
-
-
Yifan Xiong authored
* add GitHub Action to build and push image
* update Dockerfile to copy from context
-
- 14 Apr, 2021 1 commit
-
-
Yifan Xiong authored
* Rename dev branch to main and set it as default.
-
- 13 Apr, 2021 1 commit
-
-
Yifan Xiong authored
* fix missing package in dockerfile
* update benchmark list and parameters
* catch runtime errors
* refine logging info
-
- 12 Apr, 2021 1 commit
-
-
Yifan Xiong authored
* add cuda11.1.1 dockerfile
-