  1. 19 Mar, 2026 1 commit
    • Enhance DTK platform support and GPU detection · 1a57f2d6
      one authored
      - Added Platform.DTK in the microbenchmark framework.
      - Introduced new DTK hipblaslt benchmark class and corresponding tests.
      - Updated Dockerfile to include hipblaslt-bench and its permissions.
      - Registered DTK benchmarks in the benchmark registry for various performance tests.
      - Enhanced GPU detection logic to recognize HYGON GPUs.
      
      This update improves the benchmarking capabilities for DTK, ensuring compatibility and performance testing across platforms.
  2. 21 Jun, 2023 1 commit
  3. 04 Jan, 2023 1 commit
    • Benchmarks - Support FP8 in BERT models (#446) · 5197cdf5
      Yifan Xiong authored
      Support FP8 in PyTorch BERT models:
      
      * add fp8 hybrid/e4m3/e5m2 in precision arguments
      * build BERT encoders with `te.TransformerLayer` to replace
      `transformers.BertModel`
      * wrap forward steps with fp8 autocast
  4. 21 Oct, 2021 1 commit
  5. 21 Jun, 2021 1 commit
  6. 29 Mar, 2021 1 commit
  7. 24 Feb, 2021 1 commit
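
Commit 5197cdf5 mentions adding fp8 hybrid/e4m3/e5m2 to the precision arguments. The following is a minimal sketch of how such an argument might be parsed and mapped to an FP8 format name, using only the standard library. The option name `--precision`, the value names (`fp8_hybrid`, `fp8_e4m3`, `fp8_e5m2`), and the helper `fp8_format_for` are assumptions for illustration and are not taken from the commit itself.

```python
import argparse

# Hypothetical precision names. The commit message only says that
# "fp8 hybrid/e4m3/e5m2" were added to the precision arguments.
FP8_FORMATS = {
    "fp8_hybrid": "HYBRID",
    "fp8_e4m3": "E4M3",
    "fp8_e5m2": "E5M2",
}
# Assumed pre-existing, non-FP8 precisions.
OTHER_PRECISIONS = ["float32", "float16"]


def build_parser():
    """Build a parser that accepts one or more precision values."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--precision",
        nargs="+",
        choices=OTHER_PRECISIONS + list(FP8_FORMATS),
        default=["float32"],
    )
    return parser


def fp8_format_for(precision):
    """Return the FP8 format name for a precision, or None if it is not FP8."""
    return FP8_FORMATS.get(precision)


args = build_parser().parse_args(["--precision", "float16", "fp8_hybrid"])
# Pair each requested precision with its FP8 format name (None for non-FP8).
selected = [(p, fp8_format_for(p)) for p in args.precision]
```

In a real benchmark, a non-`None` format name would presumably decide whether the forward step is wrapped in an FP8 autocast context and which recipe format is used. That wiring is left out here because it requires FP8-capable hardware.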