1. 20 Jun, 2025 1 commit
    • Benchmark - Add Grace CPU support for CPU Stream (#719) · 0b8d1fd4
      WenqingLan1 authored
      
      
      **Description**
      Added support for the Grace CPU (neo2) architecture in CPU Stream.
      CPU Stream now supports dual-socket benchmarking.
      
      Example config for this architecture:
      ```yaml
          cpu-stream:numa0:
            timeout: *default_timeout
            modes:
            - name: local
              parallel: no
            parameters:
              cpu_arch: neo2
              numa_mem_nodes: 0
              cores: 0 1 2 3 4 5 6 7 8
          cpu-stream:numa1:
            timeout: *default_timeout
            modes:
            - name: local
              parallel: no
            parameters:
              cpu_arch: neo2
              numa_mem_nodes: 1
              cores: 64 65 66 67 68 69 70 71 72
          cpu-stream:numa-spread:
            timeout: *default_timeout
            modes:
            - name: local
              parallel: no
            parameters:
              cpu_arch: neo2
              numa_mem_nodes: 0 1
              cores: 0 1 2 3 4 5 6 7 8 64 65 66 67 68 69 70 71 72
      ```
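
The three entries above differ only in which NUMA nodes and cores they pin. As a rough sketch (not part of the commit), the `parameters` values could be generated as below. The `FIRST_CORE` map, `CORES_PER_SOCKET`, and the `stream_parameters` helper are hypothetical names; the core offsets 0 and 64 and the count of 9 cores per socket are assumptions copied from the example config, not a statement about Grace topology in general.

```python
# Assumptions taken from the example config: socket 0 starts at core 0,
# socket 1 starts at core 64, and each entry pins 9 cores per socket.
FIRST_CORE = {0: 0, 1: 64}
CORES_PER_SOCKET = 9


def stream_parameters(numa_nodes):
    """Build the `parameters` mapping for one cpu-stream entry.

    numa_nodes is a list of NUMA node ids, e.g. [0] or [0, 1].
    """
    cores = []
    for node in numa_nodes:
        start = FIRST_CORE[node]
        cores.extend(range(start, start + CORES_PER_SOCKET))
    return {
        "cpu_arch": "neo2",
        # Both values are space-separated strings, as in the YAML above.
        "numa_mem_nodes": " ".join(str(n) for n in numa_nodes),
        "cores": " ".join(str(c) for c in cores),
    }


if __name__ == "__main__":
    for name, nodes in (("numa0", [0]), ("numa1", [1]), ("numa-spread", [0, 1])):
        print(f"cpu-stream:{name}", stream_parameters(nodes))
```

Generating the lists this way keeps the single-socket and spread entries consistent if the core count per socket changes.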
      
      ---------
      Co-authored-by: dpower4 <dilipreddi@gmail.com>
  2. 18 Jun, 2025 1 commit
    • Benchmarks - Add GPU Stream Micro Benchmark (#697) · 4eddd50a
      WenqingLan1 authored
      Added the GPU Stream benchmark, which measures GPU memory bandwidth
      and efficiency for the double data type across four memory
      operations: copy, scale, add, and triad.
      - added documentation for `gpu-stream` covering its introduction,
      metrics, and descriptions.
      - added unit tests for `gpu-stream`. Example output is in
      `superbenchmark/tests/data/gpu_stream.log`.
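
For readers unfamiliar with the four operations, here is a minimal CPU-side sketch of their semantics and of the usual STREAM byte accounting. It is an illustration in plain Python, not the CUDA implementation added by this commit; the function names `run_kernel`, `bytes_moved`, and `bandwidth_gbps` are hypothetical, and the array roles follow the conventional STREAM definitions, which are assumed rather than taken from the commit.

```python
BYTES_PER_DOUBLE = 8

# Number of arrays each kernel reads plus writes (conventional STREAM counting).
ARRAYS_TOUCHED = {"copy": 2, "scale": 2, "add": 3, "triad": 3}


def run_kernel(name, a, b, c, scalar):
    """Apply one STREAM kernel in place on equal-length lists of floats."""
    n = len(a)
    if name == "copy":        # c = a
        for i in range(n):
            c[i] = a[i]
    elif name == "scale":     # b = scalar * c
        for i in range(n):
            b[i] = scalar * c[i]
    elif name == "add":       # c = a + b
        for i in range(n):
            c[i] = a[i] + b[i]
    elif name == "triad":     # a = b + scalar * c
        for i in range(n):
            a[i] = b[i] + scalar * c[i]
    else:
        raise ValueError(f"unknown kernel: {name}")


def bytes_moved(name, n):
    """Bytes read and written by one pass of the kernel over n doubles."""
    return ARRAYS_TOUCHED[name] * n * BYTES_PER_DOUBLE


def bandwidth_gbps(name, n, seconds):
    """Achieved bandwidth in GB/s (10^9 bytes per second)."""
    return bytes_moved(name, n) / seconds / 1e9
```

Copy and scale touch two arrays per element while add and triad touch three, which is why their byte counts differ for the same array length.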
  3. 05 Feb, 2025 1 commit
  4. 28 Nov, 2024 1 commit
    • Benchmarks - Add LLaMA-2 Models (#668) · 249e21c1
      pdr authored
      Added the llama benchmark for training and inference, following the
      existing PyTorch model implementations such as gpt2 and lstm.
      
      - added a llama fp8 unit test for better code coverage while reducing
      the memory required
      - updated the transformers requirement to >= 4.28.0 for `LlamaConfig`
      - pinned tokenizers to <= 0.20.3 to avoid the 0.20.4
      [issues](https://github.com/huggingface/tokenizers/issues/1691) with
      py3.8
      - added llama2 to tensorrt
      - llama2 tests were not added to
      test_tensorrt_inference_performance.py because of the large memory
      requirement on the worker GPU; the tests were validated separately on
      gh200
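
The two version bounds above can be checked with a few lines of standard-library Python. This is only a sketch: `parse_version` and `satisfies_pins` are hypothetical helpers, and the parser assumes plain `X.Y.Z` release strings (no pre-release or local suffixes), which a real requirements resolver handles more carefully.

```python
def parse_version(text):
    """Parse a plain 'X.Y.Z' release string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_pins(transformers_version, tokenizers_version):
    """Return True when both bounds named in the commit are met.

    transformers must be >= 4.28.0 and tokenizers must be <= 0.20.3.
    """
    return (
        parse_version(transformers_version) >= (4, 28, 0)
        and parse_version(tokenizers_version) <= (0, 20, 3)
    )
```

For example, `satisfies_pins("4.28.0", "0.20.3")` passes, while any pairing that includes tokenizers 0.20.4 does not.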
      
      ---------
      Co-authored-by: dpatlolla <dpatlolla@microsoft.com>
  5. 22 Nov, 2024 1 commit
  6. 08 Dec, 2023 1 commit
  7. 24 Mar, 2023 1 commit
  8. 21 Mar, 2023 1 commit
  9. 13 Feb, 2023 1 commit
  10. 11 Apr, 2022 1 commit
  11. 16 Mar, 2022 1 commit
    • Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324) · ff51a3ce
      rafsalas19 authored
      **Description**
      Modifications adding GPU-Burn to SuperBench.
      - added third party submodule
      - modified Makefile to make gpu-burn binary
      - added/modified microbenchmarks to add gpu-burn python scripts
      - modified default and azure_ndv4 configs to add gpu-burn
  12. 08 Feb, 2022 1 commit
  13. 21 Jan, 2022 1 commit
  14. 13 Dec, 2021 1 commit
  15. 10 Dec, 2021 1 commit
  16. 25 Nov, 2021 1 commit
  17. 12 Nov, 2021 1 commit
  18. 09 Nov, 2021 1 commit
  19. 30 Oct, 2021 1 commit
  20. 27 Oct, 2021 1 commit
  21. 22 Oct, 2021 1 commit
  22. 12 Oct, 2021 1 commit
  23. 30 Aug, 2021 2 commits
  24. 27 Aug, 2021 1 commit
  25. 30 Jul, 2021 1 commit
  26. 26 Jul, 2021 1 commit
  27. 23 Jul, 2021 2 commits
  28. 13 Jul, 2021 1 commit
  29. 02 Jun, 2021 1 commit
  30. 01 Jun, 2021 1 commit
  31. 31 May, 2021 1 commit
  32. 19 May, 2021 2 commits
  33. 26 Apr, 2021 1 commit
  34. 20 Apr, 2021 2 commits
  35. 16 Apr, 2021 2 commits