1. 24 Mar, 2022 1 commit
  2. 22 Mar, 2022 1 commit
  3. 21 Mar, 2022 1 commit
  4. 17 Mar, 2022 1 commit
  5. 16 Mar, 2022 1 commit
    • Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324) · ff51a3ce
      rafsalas19 authored
      **Description**
      Adds GPU-Burn to SuperBench as a microbenchmark:
      - Added GPU-Burn as a third-party submodule
      - Modified the Makefile to build the gpu-burn binary
      - Added and modified microbenchmark Python scripts for gpu-burn
      - Added gpu-burn to the default and azure_ndv4 configs
      ff51a3ce
  6. 15 Mar, 2022 2 commits
  7. 09 Mar, 2022 1 commit
  8. 07 Mar, 2022 2 commits
  9. 06 Mar, 2022 1 commit
  10. 28 Feb, 2022 2 commits
  11. 25 Feb, 2022 1 commit
  12. 24 Feb, 2022 2 commits
  13. 22 Feb, 2022 1 commit
  14. 21 Feb, 2022 1 commit
  15. 20 Feb, 2022 2 commits
  16. 15 Feb, 2022 2 commits
  17. 10 Feb, 2022 1 commit
  18. 09 Feb, 2022 2 commits
  19. 08 Feb, 2022 2 commits
  20. 07 Feb, 2022 1 commit
    • Benchmarks: Revise Code - Reduce result variance in gpu_copy benchmark (#298) · 85389055
      Ziyue Yang authored
      **Description**
      This commit reduces result variance in the gpu_copy benchmark as follows:
      1) Add a warmup phase to avoid timing instability caused by first-time CUDA kernel launch overhead;
      2) Use CUDA events for timing instead of CPU timestamps;
      3) Make data checking optional; it should stay disabled in performance tests;
      4) Enlarge the message size used in the performance benchmark.
      85389055
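      The warmup idea in item 1 can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the benchmark's code: `time_with_warmup` and `copy_buffer` are hypothetical names, the workload is a host-side buffer copy, and `time.perf_counter` stands in for the CUDA events that the commit uses on the device.

      ```python
      import statistics
      import time


      def time_with_warmup(fn, warmup_iters=5, timed_iters=20):
          """Call fn untimed a few times, then return per-call durations in seconds."""
          for _ in range(warmup_iters):
              fn()  # result discarded: absorbs one-time setup cost
          samples = []
          for _ in range(timed_iters):
              start = time.perf_counter()  # host timer, a stand-in for CUDA events
              fn()
              samples.append(time.perf_counter() - start)
          return samples


      def copy_buffer(size=1 << 20):
          """Hypothetical workload: copy a 1 MiB host buffer (not a GPU copy)."""
          src = bytearray(size)
          return bytes(src)


      samples = time_with_warmup(copy_buffer)
      print(f"mean={statistics.mean(samples):.6f}s stdev={statistics.stdev(samples):.6f}s")
      ```

      Only the timed iterations contribute to the reported statistics, so one-time startup cost does not inflate the variance.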
  21. 30 Jan, 2022 1 commit
  22. 29 Jan, 2022 3 commits
  23. 28 Jan, 2022 2 commits
    • Benchmarks: Add Feature - Sync the E2E training results among all workers for each step. (#287) · d03d110f
      guoshzhao authored
      **Major Revision**
      - Sync the E2E training results across all workers with an allreduce max.
      - Avoid the ':0' suffix in the metric name when only one rank produces output.
      d03d110f
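      The two revisions above can be illustrated with a small pure-Python simulation. It is a sketch under assumptions: `allreduce_max` and `metric_name` are hypothetical helpers, the `step_time` metric name is made up, and a real run would use a collective from the distributed backend rather than a list of lists.

      ```python
      def allreduce_max(per_rank_values):
          """Simulate allreduce(MAX): each rank receives the element-wise maximum."""
          reduced = [max(step) for step in zip(*per_rank_values)]
          return [list(reduced) for _ in per_rank_values]


      def metric_name(base, rank, ranks_with_output):
          """Append ':<rank>' only when more than one rank reports output."""
          if ranks_with_output <= 1:
              return base
          return f"{base}:{rank}"


      # Per-step times for two ranks (illustrative numbers only).
      step_times = [
          [0.10, 0.12, 0.11],  # rank 0
          [0.11, 0.10, 0.13],  # rank 1
      ]
      synced = allreduce_max(step_times)
      print(synced[0])                      # same list on every rank
      print(metric_name("step_time", 0, 1))
      print(metric_name("step_time", 1, 2))
      ```

      Taking the maximum means each step is reported at the pace of the slowest worker, which is the time the whole job actually waits for.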
    • Benchmarks: Add Feature - Add timeout feature for each benchmark. (#288) · d877ca23
      guoshzhao authored
      **Description**
      Add timeout feature for each benchmark.
      
      **Major Revision**
      - Add a `timeout` config for each benchmark. The current config files set it only for kernel-launch, as an example; other benchmarks can set it later.
      - Set the timeout config for `ansible_runner.run()`. The runner then gets return code 254:
         [ansible.py:80][WARNING] Run failed, return code 254.
      - Use the `timeout` command to terminate the client process.
      d877ca23
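      Wrapping a client command with the `timeout` utility might look like the sketch below. The helper names and the `./run_benchmark.sh` command are hypothetical, and this is not the code from the commit. Exit code 124 is what GNU coreutils `timeout` returns when the limit is hit; it is separate from the 254 reported by the runner above.

      ```python
      # GNU coreutils `timeout` exits with 124 when the time limit is reached.
      TIMEOUT_EXIT_CODE = 124


      def wrap_with_timeout(command, timeout_seconds=None):
          """Prefix a shell command with `timeout <seconds>` when a limit is configured."""
          if timeout_seconds is None:
              return command  # no limit configured: run the command unchanged
          if timeout_seconds <= 0:
              raise ValueError("timeout_seconds must be positive")
          return f"timeout {int(timeout_seconds)} {command}"


      def timed_out(return_code):
          """Return True if the wrapped command was stopped by `timeout`."""
          return return_code == TIMEOUT_EXIT_CODE


      print(wrap_with_timeout("./run_benchmark.sh", 60))
      print(wrap_with_timeout("./run_benchmark.sh"))
      ```

      Leaving the command untouched when no limit is set keeps benchmarks without a `timeout` entry running exactly as before.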
  24. 27 Jan, 2022 1 commit
  25. 25 Jan, 2022 1 commit
  26. 24 Jan, 2022 2 commits
  27. 23 Jan, 2022 1 commit
  28. 21 Jan, 2022 1 commit