- 25 Apr, 2022 1 commit
-
-
user4543 authored
**Description** Fix a bug in the duration feature for model benchmarks in distributed mode.
**Major Revision**
- Add all_reduce to sync the result of is_finished (the function that decides whether the model benchmark should stop) in each step. This avoids ranks disagreeing on when the duration ends (some ranks may enter one more step and never finish).
- Add torch.cuda.synchronize() before and after step time measurement in train_step() for all model benchmarks. Some operations in train_step() may be asynchronous, resulting in incorrect step time records (for example, LSTM).
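The rank disagreement described in this commit can be illustrated with a small pure-Python simulation. This is a sketch only, not the code from the commit: the function names `run_without_sync` and `run_with_sync` are hypothetical, per-rank step times are made-up integers in milliseconds, and `max()` over the per-rank flags stands in for an all-reduce with a MAX operation (whether the actual fix reduces with MAX, MIN, or SUM is an assumption).

```python
from typing import List


def run_without_sync(step_ms: List[int], duration_ms: int) -> List[int]:
    """Each simulated rank checks only its own clock, so step counts can differ."""
    steps = []
    for per_step in step_ms:
        clock, count = 0, 0
        while clock < duration_ms:  # local is_finished() check only
            clock += per_step
            count += 1
        steps.append(count)
    return steps


def run_with_sync(step_ms: List[int], duration_ms: int) -> List[int]:
    """All ranks agree on a single stop flag before every step."""
    clocks = [0] * len(step_ms)
    steps = [0] * len(step_ms)
    while True:
        # Each rank computes its local flag (1 = duration reached).
        local_flags = [int(clock >= duration_ms) for clock in clocks]
        # Stand-in for an all-reduce with MAX: if any rank is done, all stop.
        global_flag = max(local_flags)
        if global_flag:
            break
        for rank, per_step in enumerate(step_ms):
            clocks[rank] += per_step
            steps[rank] += 1
    return steps


if __name__ == "__main__":
    step_ms = [100, 110, 125]  # hypothetical per-rank step times
    print(run_without_sync(step_ms, 1000))  # ranks run different step counts
    print(run_with_sync(step_ms, 1000))     # every rank runs the same count
```

In a real distributed job, a rank that runs an extra step blocks in that step's collective operations while waiting for peers that have already exited, which is why agreeing on the stop flag matters.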
-
- 21 Apr, 2022 1 commit
-
-
user4543 authored
**Description** Fix bugs in syncing results on the root rank for e2e model benchmarks.
Bugs:
- results were not replaced with the synced results (grammar)
- synced results were applied only to the root rank, not to all ranks
- results were output on local_rank 0 rather than the global root rank
-
- 09 Dec, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Unify metric names of benchmarks.
-
- 27 Sep, 2021 1 commit
-
-
guoshzhao authored
**Description** Add option `force_fp32` to use fp32 instead of tf32, only takes effect on Ampere or newer GPUs.
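As a rough sketch of what such an option might do in PyTorch: TF32 use on Ampere or newer GPUs is controlled by the `torch.backends.cuda.matmul.allow_tf32` and `torch.backends.cudnn.allow_tf32` flags. The helper name `apply_force_fp32` and its return convention are hypothetical, and this is not necessarily how the commit implements the option.

```python
def apply_force_fp32(force_fp32: bool) -> bool:
    """Disable TF32 math when force_fp32 is set.

    Returns True if the PyTorch flags were changed, False otherwise.
    Hypothetical helper for illustration only.
    """
    if not force_fp32:
        return False
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to configure.
        return False
    # On pre-Ampere GPUs these flags have no effect, since TF32 is unavailable.
    torch.backends.cuda.matmul.allow_tf32 = False  # cuBLAS matmuls
    torch.backends.cudnn.allow_tf32 = False        # cuDNN convolutions
    return True
```

Calling `apply_force_fp32(True)` before building the model would make matmuls and convolutions run in full fp32 precision.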
-
- 28 Jun, 2021 1 commit
-
-
guoshzhao authored
* replace torch.optim.AdamW with transformers.AdamW.
-
- 07 Jun, 2021 1 commit
-
-
guoshzhao authored
* Clean up the cache.
-
- 12 Apr, 2021 2 commits
-
-
guoshzhao authored
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>
-
Yifan Xiong authored
* skip unnecessary tests according to env var
* remove useless tests
-
- 08 Apr, 2021 1 commit
-
-
guoshzhao authored
Benchmarks: Code Revision - Revise BenchmarkRegistry interfaces for integration with executor. (#33)
* revise BenchmarkRegistry interfaces.
* address comments
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
-
- 22 Mar, 2021 1 commit
-
-
guoshzhao authored
* move benchmarks registration from registry.py to __init__.py
* revise __init__.
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
-
- 17 Mar, 2021 1 commit
-
-
guoshzhao authored
* add pytorch base tests.
* add more test cases.
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
-