- 22 Mar, 2022 1 commit
-
-
user4543 authored
**Description** Remove fp16 sample type-conversion time for CNN training and LSTM inference.
-
- 17 Mar, 2022 1 commit
-
-
user4543 authored
**Description** Remove fp16 sample type-conversion time for CNN and LSTM models.
-
- 06 Mar, 2022 1 commit
-
-
Jeff Daily authored
**Description** The BatchNorm operator is not numerically stable in fp16. The PyTorch documentation recommends keeping the BN op in fp32 for fp16 AMP models. Refer to https://pytorch.org/docs/stable/amp.html#ops-that-can-autocast-to-float32. Preserving BN in fp32 for superbench more accurately reflects real workloads.
-
- 10 Feb, 2022 1 commit
-
-
user4543 authored
**Description** Support init_process_group for pytorch>=1.9.0. **Major Revision** - Use PrefixStore(TCPStore) to call init_process_group manually for each model run.
-
- 28 Jan, 2022 1 commit
-
-
guoshzhao authored
**Major Revision** - Sync (allreduce with max) the E2E training results among all workers. - Avoid using ':0' in the metric name when only one rank has output.
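The two revisions in this commit can be illustrated with a small pure-Python simulation. This is only a sketch and not the superbench implementation: `allreduce_max`, `metric_names`, and the metric name `step_time` are hypothetical names chosen for illustration, and the real change presumably relies on a distributed allreduce rather than a local `max`.

```python
def allreduce_max(per_rank_values):
    """Simulate an allreduce with MAX: every rank receives the global maximum."""
    global_max = max(per_rank_values)
    return [global_max for _ in per_rank_values]


def metric_names(base_name, ranks_with_output):
    """Return one metric name per reporting rank.

    The ':<rank>' suffix is dropped when only a single rank reports.
    """
    if len(ranks_with_output) == 1:
        return [base_name]
    return ['{}:{}'.format(base_name, rank) for rank in ranks_with_output]


# Hypothetical per-rank step times in milliseconds.
step_times_ms = [101.2, 98.7, 104.5, 99.9]
synced = allreduce_max(step_times_ms)

single = metric_names('step_time', [0])
multi = metric_names('step_time', [0, 1])
```

Taking the maximum means every worker reports the slowest rank's result, which is usually the figure that bounds end-to-end training time.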
-
- 19 Jan, 2022 1 commit
-
-
guoshzhao authored
**Description** Add 50th, 90th, 95th, 99th, 99.9th latency metrics for ORT and pytorch inference benchmarks.
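As a rough illustration of the latency metrics named in this commit, the sketch below computes the same five percentiles from a list of per-step latencies. The commit does not state which percentile definition is used; the nearest-rank method here is an assumption, and `latency_percentiles` is a hypothetical helper name.

```python
import math


def latency_percentiles(latencies_ms, percentiles=(50, 90, 95, 99, 99.9)):
    """Compute percentiles with the nearest-rank method (an assumed definition)."""
    if not latencies_ms:
        raise ValueError('need at least one latency sample')
    ordered = sorted(latencies_ms)
    count = len(ordered)
    result = {}
    for p in percentiles:
        # Nearest rank: the smallest 1-based rank covering p percent of the samples.
        rank = max(1, math.ceil(p * count / 100))
        result[p] = ordered[rank - 1]
    return result


# Synthetic latencies of 1.0 ms through 1000.0 ms, for illustration only.
samples = [float(i) for i in range(1, 1001)]
stats = latency_percentiles(samples)
```

Tail percentiles such as the 99.9th need a large sample count to be meaningful; with fewer than 1000 samples the nearest-rank result is simply the maximum.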
-
- 13 Dec, 2021 1 commit
-
-
Yifan Xiong authored
Add transformers for TensorRT inference.
-
- 09 Dec, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Unify metric names of benchmarks.
-
- 28 Sep, 2021 1 commit
-
-
guoshzhao authored
**Description** Fix typo when setting the `force_fp32` option.
-
- 27 Sep, 2021 1 commit
-
-
guoshzhao authored
**Description** Add option `force_fp32` to use fp32 instead of tf32; it only takes effect on Ampere or newer GPUs.
-
- 26 Sep, 2021 1 commit
-
-
Yifan Xiong authored
**Description** Cherry-pick bug fixes from v0.3.0 to main.

**Major Revisions**
* Docs - Upgrade version and release note (#209)
* Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
* Benchmarks: Update - Update benchmarks in configuration file (#208)
* CI/CD - Update GitHub Action VM (#211)
* Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
* CI/CD - Fix bug in build image for push event (#205)
* Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
* Tool: Fix bug - Fix function naming issue in system info (#200)
* CI/CD - Push images in GitHub Action (#202)
* Bug - Fix torch.distributed command for single node (#201)
* CLI - Integrate system info for node (#199)
* Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
* CI/CD - Add ROCm image build in GitHub Actions (#194)
* Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
* Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
* Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
* Bug - Revise 'docker run' in sb deploy (#195)
* Bug - Fix Bug : fix bug of error param operations to operation in rccl-bw of hpe config (#190)

Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com>
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
-
- 06 Aug, 2021 2 commits
- 29 Jul, 2021 1 commit
-
-
Yifan Xiong authored
__Description__ Cherry-pick bug fixes from v0.2.1 to main. __Major Revisions__ * Fix VGG models failing on A100 GPU with batch_size=128. * Fix Ansible connection issue when running on localhost. * Update version in packages and docs.
-
- 28 Jun, 2021 2 commits
- 21 Jun, 2021 1 commit
-
-
guoshzhao authored
Benchmarks: Add Feature - Add DistributedImpl and DistributedBackend arguments for micro benchmark. (#100)
-
- 16 Jun, 2021 1 commit
-
-
Yifan Xiong authored
Fix bugs and refine log in single GPU benchmarks: * Fix none framework issue * Fix empty parameter bug * Remove missed mobilenet_v3 models * Change benchmark registration log to debug level * Add pid in logging * Add missing benchmarks in default config * Fix deprecated logging warn
-
- 07 Jun, 2021 1 commit
-
-
guoshzhao authored
* Clean up the cache.
-
- 04 Jun, 2021 1 commit
-
-
guoshzhao authored
* fix return code reset issue
-
- 19 May, 2021 1 commit
-
-
Yuting Jiang authored
-
- 26 Apr, 2021 2 commits
- 20 Apr, 2021 2 commits
- 16 Apr, 2021 2 commits
- 12 Apr, 2021 1 commit
-
-
guoshzhao authored
Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
- 08 Apr, 2021 1 commit
-
-
guoshzhao authored
* revise result process interface * add more comments Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
- 26 Mar, 2021 1 commit
-
-
guoshzhao authored
Benchmarks: Add Benchmark - Add Pytorch BERT benchmarks, including bert-base and bert-large. (#20) * add pytorch bert benchmarks. * revise code * fix issue * revise code. Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
- 22 Mar, 2021 2 commits
-
-
guoshzhao authored
* move benchmarks registration from registry.py to __init__.py * revise __init__. Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
guoshzhao authored
Benchmarks: Add Feature - Add benchmark finish check according to num_warmup/num_steps and duration in ModelBenchmark class. (#25) * add is_finished function * reuse current time. Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
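A finish check of the kind this commit describes might look like the sketch below. It is not the `ModelBenchmark` code: the standalone `is_finished` signature and the rule that either limit ends the run are assumptions made for illustration. Passing `curr_time` in from the caller mirrors the "reuse current time" note, so the clock is read once per step.

```python
import time


def is_finished(curr_step, curr_time, start_time, num_warmup, num_steps, duration=0):
    """Return True once the step budget or the time budget is used up.

    Assumed semantics: a value of 0 disables the corresponding limit.
    """
    total_steps = num_warmup + num_steps
    if total_steps > 0 and curr_step >= total_steps:
        return True
    if duration > 0 and curr_time - start_time >= duration:
        return True
    return False


# Usage: run until 2 warmup steps plus 8 measured steps have completed.
start = time.time()
step = 0
while not is_finished(step, time.time(), start, num_warmup=2, num_steps=8):
    step += 1  # one benchmark step would run here
```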
-
- 18 Mar, 2021 1 commit
-
-
guoshzhao authored
* add sample_count argument. * handle more conditions. * address comments. Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
- 15 Mar, 2021 1 commit
-
-
guoshzhao authored
Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
- 11 Mar, 2021 1 commit
-
-
guoshzhao authored
* add random dataset. * install pytorch-cpu for test docker. * fix typo * add more test cases. * address comments. Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
- 09 Mar, 2021 2 commits
-
-
guoshzhao authored
* add flag to disable GPU. * fix spelling * fix unittest. * address comments. Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
guoshzhao authored
Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
- 08 Mar, 2021 2 commits
-
-
guoshzhao authored
* add pytorch base class * address comments Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
guoshzhao authored
* add optimizer definition and function to create torch optimizer. * move optimizer enum into model_base module. Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-
- 04 Mar, 2021 1 commit
-
-
guoshzhao authored
Co-authored-by:Guoshuai Zhao <guzhao@microsoft.com>
-