1. 28 Apr, 2022 1 commit
  2. 25 Apr, 2022 1 commit
      Bug - Fix bug of duration feature for model benchmarks in distributed mode. (#347) · b5b1c3da
      user4543 authored
      **Description**
      Fix bug of duration feature for model benchmarks in distributed mode.
      
      **Major Revision**
      - Add `all_reduce` to sync the result of `is_finished()` (the function that decides whether the model benchmark should stop) in each step
        - This avoids inconsistency between ranks when determining the end of the duration (otherwise some rank may enter one more step and never finish).
      - Add `torch.cuda.synchronize()` before and after the step-time measurement in `train_step()` for all model benchmarks
        - Some operations in `train_step()` may be asynchronous (for example, LSTM), resulting in incorrect step-time records.
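The synced-stop logic above can be sketched as follows. This is a plain-Python simulation of the allreduce-MAX collective (the real fix uses `torch.distributed.all_reduce` across GPUs); `all_reduce_max`, `run_benchmark`, and `local_finished_at` are hypothetical names for illustration only.

```python
def all_reduce_max(values):
    """Simulated allreduce-MAX: every rank receives the global maximum."""
    m = max(values)
    return [m] * len(values)

def run_benchmark(local_finished_at, num_steps):
    """Each rank stops as soon as the *synced* is_finished flag is set.

    local_finished_at[i] is the step at which rank i would locally
    decide it is finished (e.g. its duration elapsed).
    """
    stopped_at = [None] * len(local_finished_at)
    for step in range(num_steps):
        # Local decision of is_finished() on each rank.
        local_flags = [1 if step >= t else 0 for t in local_finished_at]
        # Sync the decision so all ranks agree (allreduce MAX):
        # if any rank is finished, every rank stops in the same step.
        synced = all_reduce_max(local_flags)
        for rank, flag in enumerate(synced):
            if flag and stopped_at[rank] is None:
                stopped_at[rank] = step
        if all(s is not None for s in stopped_at):
            break
    return stopped_at

# Rank 0 would finish at step 3, rank 1 at step 5; with the sync,
# both ranks stop together at step 3 instead of diverging.
print(run_benchmark([3, 5], 10))  # → [3, 3]
```

Without the `all_reduce`, rank 1 would keep issuing collective operations for steps 4 and 5 while rank 0 had already stopped, and the job would hang.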
  3. 21 Apr, 2022 1 commit
  4. 19 Apr, 2022 1 commit
  5. 18 Apr, 2022 1 commit
  6. 11 Apr, 2022 2 commits
  7. 10 Apr, 2022 1 commit
  8. 08 Apr, 2022 1 commit
  9. 01 Apr, 2022 1 commit
  10. 24 Mar, 2022 1 commit
  11. 22 Mar, 2022 1 commit
  12. 21 Mar, 2022 1 commit
  13. 17 Mar, 2022 1 commit
  14. 16 Mar, 2022 1 commit
      Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324) · ff51a3ce
      rafsalas19 authored
      **Description**
      Add GPU-Burn to SuperBench.
      - Added GPU-Burn as a third-party submodule
      - Modified the Makefile to build the gpu-burn binary
      - Added/modified microbenchmarks to add the gpu-burn Python scripts
      - Modified the default and azure_ndv4 configs to add gpu-burn
  15. 15 Mar, 2022 2 commits
  16. 09 Mar, 2022 1 commit
  17. 07 Mar, 2022 1 commit
  18. 06 Mar, 2022 1 commit
  19. 24 Feb, 2022 1 commit
  20. 22 Feb, 2022 1 commit
  21. 20 Feb, 2022 2 commits
  22. 15 Feb, 2022 1 commit
  23. 10 Feb, 2022 1 commit
  24. 09 Feb, 2022 1 commit
  25. 08 Feb, 2022 2 commits
  26. 07 Feb, 2022 1 commit
      Benchmarks: Revise Code - Reduce result variance in gpu_copy benchmark (#298) · 85389055
      Ziyue Yang authored
      **Description**
      This commit does the following to reduce result variance in the gpu_copy benchmark:
      1) Add a warmup phase to avoid timing instability caused by first-time CUDA kernel launch overhead;
      2) Use CUDA events for timing instead of CPU timestamps;
      3) Make data checking an option, since it is not desirable in performance tests;
      4) Enlarge the message size in the performance benchmark.
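The warmup-then-measure structure from points 1 and 2 can be sketched as below. The real gpu_copy benchmark records CUDA events on the device (`cudaEventRecord` / `cudaEventElapsedTime`); `time.perf_counter()` stands in here so the sketch runs anywhere, and `benchmark`/`op` are hypothetical names, not the actual SuperBench API.

```python
import time

def benchmark(op, warmup=20, iters=100):
    """Variance-reduction timing pattern, sketched in plain Python."""
    # Warmup phase: absorb one-time CUDA kernel launch overhead
    # before any measurement starts.
    for _ in range(warmup):
        op()
    start = time.perf_counter()  # the real code records a start CUDA event here
    for _ in range(iters):
        op()
    end = time.perf_counter()    # ...and an end event, then queries elapsed time
    return (end - start) / iters # mean time per iteration
```

Averaging over many iterations after the warmup, and timing on the device rather than the host, are what cut the run-to-run variance.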
  27. 30 Jan, 2022 1 commit
  28. 29 Jan, 2022 3 commits
  29. 28 Jan, 2022 2 commits
      Benchmarks: Add Feature - Sync the E2E training results among all workers for each step. (#287) · d03d110f
      guoshzhao authored
      **Description**
      Sync the E2E training results among all workers for each step.
      
      **Major Revision**
      - Sync (via an allreduce max) the E2E training results among all workers.
      - Avoid the `:0` suffix in the metric name when only one rank produces output.
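The metric-naming rule in the second bullet can be sketched as a small helper. `metric_name` and its parameters are hypothetical names for illustration; the actual SuperBench code paths differ.

```python
def metric_name(base, rank, ranks_with_output):
    """Append ':<rank>' only when more than one rank produced output."""
    if len(ranks_with_output) <= 1:
        return base
    return f'{base}:{rank}'

# Single rank with output: no suffix, so the metric stays clean.
print(metric_name('fp32_train_step_time', 0, [0]))     # fp32_train_step_time
# Multiple ranks with output: keep the rank suffix to disambiguate.
print(metric_name('fp32_train_step_time', 1, [0, 1]))  # fp32_train_step_time:1
```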
      Benchmarks: Add Feature - Add timeout feature for each benchmark. (#288) · d877ca23
      guoshzhao authored
      **Description**
      Add timeout feature for each benchmark.
      
      **Major Revision**
      - Add a `timeout` config for each benchmark. In the current config files, only kernel-launch sets a timeout as an example; other benchmarks can be configured in the future.
      - Set the timeout config for `ansible_runner.run()`. The runner will get return code 254:
         [ansible.py:80][WARNING] Run failed, return code 254.
      - Use the `timeout` command to terminate the client process.
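The client-side termination in the last bullet relies on the coreutils `timeout` command, sketched below with `sleep 10` standing in for the benchmark process. Note that `timeout` itself exits with status 124 when it kills the command; the 254 shown above is the ansible runner's own failure code.

```shell
# Terminate the child process once the limit expires;
# `sleep 10` is a stand-in for the client benchmark command.
timeout 2s sleep 10
echo "exit code: $?"   # 124: the command was killed when the limit expired
```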
  30. 27 Jan, 2022 1 commit
  31. 25 Jan, 2022 1 commit
  32. 24 Jan, 2022 2 commits