"dockerfile/rocm5.0.x.dockerfile" did not exist on "3f135e4669a8a89b5be6335ef89d75ae06d4ab76"
  1. 23 Apr, 2026 1 commit
  2. 02 Apr, 2026 2 commits
  3. 01 Apr, 2026 1 commit
  4. 31 Mar, 2026 1 commit
  5. 20 Mar, 2026 1 commit
  6. 19 Mar, 2026 2 commits
    • Enhance DTK platform support and GPU detection · 1a57f2d6
      one authored
      - Added Platform.DTK in the microbenchmark framework.
      - Introduced new DTK hipblaslt benchmark class and corresponding tests.
      - Updated Dockerfile to include hipblaslt-bench and its permissions.
      - Registered DTK benchmarks in the benchmark registry for various performance tests.
      - Enhanced GPU detection logic to recognize HYGON GPUs.
      
      This update improves the benchmarking capabilities for DTK, ensuring compatibility and performance testing across platforms.
    • Update DTK dockerfile and microbenchmarks · c4f39919
      one authored
      - Update rocm_common.cmake for CMake>=3.24
      - Prevent isolation build
      - Add BabelStream as a submodule
      - Update dockerignore
  7. 17 Mar, 2026 1 commit
  8. 11 Mar, 2026 1 commit
  9. 28 Jan, 2026 1 commit
  10. 06 Nov, 2025 1 commit
  11. 01 Oct, 2025 1 commit
  12. 25 Jun, 2025 1 commit
  13. 30 Apr, 2025 1 commit
  14. 21 Mar, 2025 1 commit
  15. 21 Nov, 2024 1 commit
  16. 06 Nov, 2024 1 commit
    • Dockerfile - Add support for arm64 build (#660) · 47949127
      pdr authored
      Add support for arm64 build:
      
      - Update Dockerfile for arm64 build
      - Extend CPU stream compilation for Neoverse
      - Handle onnxruntime-gpu installation
      - Filter third-party builds based on architecture
      - Disable CUDA decode perf build for non-x86
  17. 10 Oct, 2024 1 commit
  18. 13 Aug, 2024 1 commit
  19. 28 Jul, 2024 1 commit
  20. 22 Apr, 2024 1 commit
  21. 18 Apr, 2024 1 commit
  22. 21 Mar, 2024 1 commit
  23. 08 Jan, 2024 1 commit
    • Release - SuperBench v0.10.0 (#607) · 2c88db90
      Yifan Xiong authored
      **Description**
      
      Cherry-pick bug fixes from v0.10.0 to main.
      
      **Major Revisions**
      
      * Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
      * Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
      * Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
      * Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
      * Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
      * CI/CD - Add ndv5 topo file #597
      * Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
      * Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
      * Dockerfile - Bug fix for rocm docker build and deploy #598
      * Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
      * Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
      * Monitor - U...
  24. 09 Dec, 2023 1 commit
  25. 07 Dec, 2023 2 commits
  26. 22 Nov, 2023 3 commits
  27. 23 Oct, 2023 1 commit
  28. 22 Aug, 2023 1 commit
  29. 06 Jul, 2023 1 commit
  30. 03 Jul, 2023 1 commit
  31. 28 Jun, 2023 1 commit
  32. 14 Apr, 2023 1 commit
    • Release - SuperBench v0.8.0 (#517) · 51761b3a
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.8.0 to main.
      
      **Major Revisions**
      
      * Monitor - Fix the cgroup version checking logic (#502)
      * Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
      * Fix wrong torch usage in communication wrapper for Distributed Inference Benchmark (#505)
      * Analyzer: Fix bug in python3.8 due to pandas api change (#504)
      * Bug - Fix bug to get metric from cmd when error happens (#506)
      * Monitor - Collect realtime GPU power when benchmarking (#507)
      * Add num_workers argument in model benchmark (#511)
      * Remove unreachable condition when write host list (#512)
      * Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
      * Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
      * Docs - Upgrade version and release note (#508)
      Co-authored-by: guoshzhao <guzhao@microsoft.com>
      Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
      Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
  33. 21 Mar, 2023 1 commit
  34. 06 Mar, 2023 1 commit
  35. 13 Feb, 2023 1 commit